EMNLP 2025

November 06, 2025

Suzhou, China


Graphical user interface (GUI) agents powered by multimodal large language models (MLLMs) have demonstrated impressive capabilities in understanding and interacting with operating system environments. However, despite their strong task performance, these models often exhibit hallucinations—systematic errors in action prediction that compromise reliability. In this study, we conduct a comprehensive analysis of the hallucinatory behaviors exhibited by GUI agent models in an icon localization task. We introduce a novel evaluation framework that moves beyond traditional accuracy-based metrics by categorizing model predictions into four distinct types: correct predictions, biased hallucinations, misleading hallucinations, and confusing hallucinations. This fine-grained classification provides deeper insights into model failure modes. Furthermore, we investigate the distribution of output logits corresponding to different response types and reveal key deviations from the behavior observed in traditional classification tasks. To support this analysis, we propose a new metric derived from the structural characteristics of the logits distribution, offering a fresh perspective on model confidence and uncertainty in GUI interaction tasks.
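
As a rough illustration of what reading confidence and uncertainty from an output logits distribution can involve, the sketch below computes three standard summaries: maximum softmax probability, top-2 margin, and Shannon entropy. These are generic textbook quantities chosen for illustration only; they are not the metric proposed in this work, whose definition the abstract does not give. The function names `softmax` and `confidence_summary` are hypothetical.

```python
import math


def softmax(logits):
    """Convert raw logits to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def confidence_summary(logits):
    """Generic confidence summaries of a logits vector.

    Assumption: these are standard stand-ins, NOT the paper's metric.
    Requires at least two logits so that a top-2 margin exists.
    """
    probs = softmax(logits)
    ranked = sorted(probs, reverse=True)
    # Shannon entropy in nats; higher means a flatter, less certain distribution.
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    return {
        "max_prob": ranked[0],            # probability of the top candidate
        "margin": ranked[0] - ranked[1],  # gap between the top two candidates
        "entropy": entropy,
    }


if __name__ == "__main__":
    print(confidence_summary([4.0, 1.0, 0.5, 0.2]))  # peaked distribution
    print(confidence_summary([1.0, 1.0, 1.0, 1.0]))  # flat distribution
```

A peaked logits vector gives a high maximum probability and low entropy, while a flat one gives the reverse. The abstract reports that GUI agent outputs deviate from this classification-style behavior, which is the motivation it gives for a metric based on the structure of the logits distribution.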


