CogSci 2025

July 31, 2025

San Francisco, United States


keywords: quantitative behavior, social cognition, theory of mind, perception, vision

Humans often infer the state of the world by observing how others interact with it: when crossing a street, for instance, we may follow the movement of others without directly seeing the traffic. This ability to extract hidden information from human interactions with the environment is crucial for adaptive behavior. In this study, we explore how people make such inferences in Spot the Ball, a task where participants predict the location of a masked soccer ball in single-frame images. We created a large dataset by scraping YouTube videos, identifying compelling images using CLIP, and masking the soccer ball through inpainting. Our findings show that human participants rely heavily on pose and gaze cues to infer the ball’s location. While providing this information improves GPT-4o’s performance, it remains significantly below human accuracy. These results highlight the significance of intention inference, with potential applications in self-driving cars, assistive AI, and humanoid robotics.


