Weihong Zhong
Student @ Harbin Institute of Technology
pre-training
emotion
multi-modality
hallucination
alignment
video-language
large vision-language models
multimodal hallucination snowballing
argument structure constructions (ascs)
automatic identification of linguistic structures
dataset construction for model training
audio captioning
5 presentations
SHORT BIO
I’m a second-year Ph.D. student at the Research Center for Social Computing and Information Retrieval, Harbin Institute of Technology (HIT, China). My research focuses on natural language processing and multimodal generation.
Presentations
Investigating and Mitigating the Multimodal Hallucination Snowballing in Large Vision-Language Models
Weihong Zhong and 8 other authors
STOA-VLP: Spatial-Temporal Modeling of Object and Action for Video-Language Pre-training
Weihong Zhong and 6 other authors