
Mengmeng Wang
computer vision
generative models
video understanding & activity analysis
few-shot action recognition
spatial-temporal modeling
text-to-image generation
language and vision
scene generation
7
presentations
SHORT BIO
I am a Phd student in Zhejiang University, China. My research area includes action recognition, object tracking, object detection, depth estimation, multimodal learning and so on. My works have been published on top computer vision transactions/conferences (TPAMI, TIP, CVPR, ICCV, ECCV, AAAI etc) and top robotic conferences (ICRA, IROS).
Presentations

A Multimodal, Multi-Task Adapting Framework for Video Action Recognition
Mengmeng Wang and 8 other authors

SSMG: Spatial-Semantic Map Guided Diffusion Model for Free-Form Layout-to-Image Generation
Chengyou JIA and 6 other authors

Revisiting the Spatial and Temporal Modeling for Few-shot Action Recognition
Jiazheng Xing and 3 other authors

One-Shot Face Reenactment Using Appearance Adaptive Normalization
Guangming Yao and 7 other authors

HR-Depth : High Resolution Self-Supervised Monocular Depth Estimation
Xiaoyang Lyu and 7 other authors

FCFR-Net: Feature Fusion Based Coarse-to-Fine Residual Learning for Depth Completion
Lina Liu and 6 other authors

Structure-Aware Person Image Generation with Pose Decomposition and Semantic Correlation
Jilin Tang and 5 other authors