AAAI 2026

January 23, 2026

Singapore, Singapore

Would you like to see your presentation here, made available to a global audience of researchers?
Add your own presentation or have us affordably record your next conference.

Recent advancements in multimodal large language models (MLLMs) have shown remarkable progress in video understanding. However, video MLLMs (VideoMLLMs) still suffer from hallucinations, generating nonsensical or irrelevant content. This issue partly stems from over-reliance on pre-trained knowledge, sometimes neglecting the rich visual information present in the video. Additionally, many existing methods rely on uniform frame sampling, which can overlook critical visual cues. To address these challenges, we present EchoBat, a novel approach that leverages audio information as well as video temporal and logical consistency to improve preference data construction and keyframe extraction. Our method integrates Direct Preference Optimization (DPO) to mitigate hallucinations by leveraging high-quality, contextually rich preference feedback. Specifically, we use GPT-4o to generate high-quality video descriptions and integrate visually relevant segments from Whisper-derived transcripts to construct preference responses. Correspondingly, we use the reference model itself to describe the reversed video, and use GPT-4o to flashback the text and fill in the hallucination to produce non-preferred responses. This strategy enhances the model’s ability to better understand visual content and temporal, logical relationships within videos. Furthermore, we propose an echo-layered sampling strategy for keyframe extraction from videos, which can provide more precise visual supervision compared to uniform sampling. Experimental results on the three latest video hallucination benchmarks demonstrate the effectiveness of our approach.

Downloads

Paper

Next from AAAI 2026

ReBoot: Encrypted Training of Deep Neural Networks with CKKS Bootstrapping
poster

ReBoot: Encrypted Training of Deep Neural Networks with CKKS Bootstrapping

AAAI 2026

Luca Colombo and 1 other author

23 January 2026

Stay up to date with the latest Underline news!

Select topic of interest (you can select more than one)

PRESENTATIONS

  • All Presentations
  • For Librarians
  • Resource Center
  • Free Trial
Underline Science, Inc.
1216 Broadway, 2nd Floor, New York, NY 10001, USA

© 2025 Underline - All rights reserved