AAAI 2026

January 22, 2026

Singapore, Singapore

Attributed Question Answering (AQA) aims to enhance the reliability of AI-generated answers by including references for each statement, helping users validate the provided information. However, existing work on AQA has primarily focused on text-only input and has largely overlooked the role of multimodality. We introduce MAVis, the first benchmark designed to evaluate end-to-end systems on understanding user intent behind visual questions, retrieving evidence from multimodal documents, and generating answers with citations. Our dataset comprises 157K visual QA instances, where each answer is annotated with sentence-level citations referring to multimodal documents. We develop automatic metrics along three dimensions (informativeness, groundedness, and fluency) and demonstrate their strong correlation with human judgments. Our key findings are threefold: (1) Large vision-language models (LVLMs) within multimodal RAG generate more informative and fluent answers than unimodal RAG but exhibit weak groundedness for image documents, a gap amplified in multimodal settings. (2) Given the same multimodal documents, there is a trade-off between informativeness and groundedness across different prompting methods. (3) Our proposed method highlights that mitigating contextual bias in interpreting image documents is a crucial direction for future research.
