EMNLP 2025

November 06, 2025

Suzhou, China

Would you like to see your presentation here, made available to a global audience of researchers?
Add your own presentation or have us affordably record your next conference.

Multimodal large language models (MLLMs) have shown impressive capabilities in document understanding, a rapidly growing research area with significant industrial demand. As a multimodal task, document understanding requires models to possess both perceptual and cognitive abilities. However, due to different types of annotation noise in training, current MLLMs often face conflicts between perception and cognition. Taking a document VQA task (cognition) as an example, an MLLM might generate answers that do not match the corresponding visual content identified by its OCR (perception). This conflict suggests that the MLLM might struggle to establish an intrinsic connection between the information it “sees” and what it “understands”. Such conflicts challenge the intuitive notion that cognition is consistent with perception, hindering the performance and explainability of MLLMs. In this paper, we define the conflicts between cognition and perception as Cognition and Perception (C&P) knowledge conflicts, a form of multimodal knowledge conflicts, and systematically assess them with a focus on document understanding. Our analysis reveals that even GPT-4o, a leading MLLM, achieves only 75.26% C&P consistency. To mitigate the C&P knowledge conflicts, we propose a novel method called Multimodal Knowledge Consistency Fine-tuning. Our method reduces C&P knowledge conflicts across all tested MLLMs and enhances their performance in both cognitive and perceptual tasks. All data we construct will be publicly available.

Downloads

SlidesPaperTranscript English (automatic)

Next from EMNLP 2025

MMDocIR: Benchmarking Multimodal Retrieval for Long Documents
poster

MMDocIR: Benchmarking Multimodal Retrieval for Long Documents

EMNLP 2025

+3Kuicai Dong
Yujing Chang and 5 other authors

06 November 2025

Stay up to date with the latest Underline news!

Select topic of interest (you can select more than one)

PRESENTATIONS

  • All Presentations
  • For Librarians
  • Resource Center
  • Free Trial
Underline Science, Inc.
1216 Broadway, 2nd Floor, New York, NY 10001, USA

© 2025 Underline - All rights reserved