EMNLP 2025

November 07, 2025

Suzhou, China


While Vision Language Models (VLMs) are trained to learn conceptual representations (generalized knowledge across many instances), they are typically used to analyze individual instances. When evaluation instances are atypical, this paradigm results in tension between two priors in the model. The first is a pragmatic prior that the textual and visual input are both relevant, arising from VLM finetuning on congruent inputs; the second is a semantic prior that the conceptual representation is generally true for instances of the category. In order to understand how VLMs trade off these priors, we introduce a new evaluation dataset, VISaGE, consisting of both typical and exceptional images. In carefully balanced experiments, we show that VLMs are typically dominated by the semantic prior, which arises from the language modality, when answering queries about instances. In contrast, conceptual understanding degrades when the assumption of congruency underlying the pragmatic prior is violated with incongruent images.
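The balanced design the abstract describes — crossing typical and exceptional images with concept-level and instance-level queries — can be sketched as follows. This is an illustrative sketch only: `query_vlm` is a hypothetical stand-in for any real VLM call, and the bird/penguin example is invented for illustration, not drawn from VISaGE.

```python
from itertools import product

# Illustrative sketch of a balanced VLM evaluation grid: every image
# condition (typical vs. exceptional instance of a category) is paired
# with every query type (concept-level vs. instance-level), so the
# semantic prior (category-level knowledge) and the pragmatic prior
# (assumed image relevance) can be contrasted on matched items.

CATEGORY = "bird"
IMAGES = {
    "typical": "robin.jpg",        # congruent with category-level knowledge
    "exceptional": "penguin.jpg",  # atypical instance: a bird that cannot fly
}
QUERIES = {
    "concept": f"Can a {CATEGORY} fly?",             # about the category in general
    "instance": "Can the animal in the image fly?",  # about this particular instance
}

def query_vlm(image_path, question):
    """Hypothetical stand-in for a real VLM call; here it just records
    the (image, question) pair that would be sent to the model."""
    return {"image": image_path, "question": question}

def build_eval_grid():
    # Cross every image condition with every query type, yielding the
    # four cells of the balanced design.
    return {
        (img_kind, q_kind): query_vlm(IMAGES[img_kind], QUERIES[q_kind])
        for img_kind, q_kind in product(IMAGES, QUERIES)
    }

grid = build_eval_grid()
# Comparing cells isolates each prior: a model dominated by the semantic
# prior would answer the instance query for the exceptional image as if
# it were the concept query; degraded concept answers under incongruent
# images would indicate reliance on the pragmatic prior.
```

The four-cell grid makes the comparisons directly interpretable: any answer difference within a row or column can be attributed to changing exactly one factor (image typicality or query type).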

