United States

Humans often infer the state of the world by observing how others interact with it—when crossing a street, for instance, we may follow the movement of others without directly seeing the traffic. This ability to extract hidden information from human interactions with the environment is crucial for adaptive behavior. In this study, we explore how people make such inferences in Spot the Ball, a task where participants predict the location of a masked soccer ball in single-frame images. We created a large dataset by scraping YouTube videos, identifying compelling images using CLIP, and masking the soccer ball through inpainting. Our findings show that human participants rely heavily on pose and gaze cues to infer the ball’s location. While providing this information improves GPT-4o’s performance, it remains significantly below human accuracy. These results highlight the significance of intention inference, with potential applications in self-driving cars, assistive AI, and humanoid robotics.

CogSci 2025

Spot the ball: Inferring Hidden Information from Human Behavioral Cues

quantitative behavior

social cognition

theory of mind

perception

vision

poster

### Welcome to CogSci Conference 2025!

The 47th Annual Meeting of the Cognitive Science Society was a hybrid meeting held in San Francisco. 


#### About

The Cognitive Science Society brings together researchers from around the world who hold a common goal: understanding the nature of the human mind. The mission of the Society is to promote Cognitive Science as a discipline, and to foster scientific interchange among researchers in various areas of study, including Artificial Intelligence, Linguistics, Anthropology, Psychology, Neuroscience, Philosophy, and Education.

The Society is a non-profit professional organization and its activities include sponsoring an annual conference and publishing the journals Cognitive Science and TopiCS.

#### Our History 

* **Society Creation**<br>
The Society was incorporated as a 501(c)(3) non-profit professional organization in Massachusetts in 1979. The organizing committee included Roger Schank, Allan Collins, Donald Norman, and a number of other scholars from psychology, linguistics, computer science, and philosophy. 
<br><br>
* **Conference Creation**<br>
The first conference on cognitive science was held at La Jolla, California in August, 1979, and has occurred annually since then. The proceedings of each conference are published, and those from most years are available through Lawrence Erlbaum Associates, Inc. The annual proceedings of the Cognitive Science Conference represent a major source of information on new work and new ideas in the scientific study of thinking. In 1990, the Society, with help from an anonymous donor, established the David Marr Prize for the best student paper at each annual meeting.
<br><br>
* **Journal Creation**<br>
The journal Cognitive Science began publication in 1976 and is now published by Wiley-Blackwell. The Executive Editor is currently Richard P. Cooper of Birkbeck, University of London, and there are 18 Associate Editors and a 30-member editorial board. It serves as the premier outlet for research reports that intersect two or more disciplines. The Governing Board of the Cognitive Science Society voted in late 2006 to found a new journal, Topics in Cognitive Science (topiCS). The Editor in Chief is Wayne Gray, Cognitive Science Department, Rensselaer Polytechnic Institute. The journal seeks to fill a niche not occupied by Cognitive Science or other cognitive science journals. Membership in the Society includes a subscription to Cognitive Science and topiCS. Copyrights for articles published in both journals are held by the Society.
<br><br>

#### Code of Conduct

By attending the CogSci 2025 Conference, you agree to adhere to the Society’s **[Code of Conduct](https://drive.google.com/file/d/1ChPuihLy6jE_BWqfO7J2KKgX35JW2zsM/view?usp=sharing)**.
<br><br>



The 47th Annual Meeting of the Cognitive Science Society presents the latest research across cognitive science and highlights the theme of Cognition in Context.

Thought experiments have been credited with generating new knowledge in the history of science. Although many parallels have been drawn between the thinking of scientists and children, it is not clear if children can generate new knowledge via thought experiments. We tested if the use of an extreme case thought experiment can help 6- to 9-year-olds to overcome the misconception that heavier rather than larger objects displace more water. A total of 70 children (mean age = 88.94 months) were assigned to either a Control condition or an Extreme Case condition designed to elicit children’s existing understanding of solidity, namely that two material objects cannot occupy the same space at the same time. Children received no feedback in either condition. We found that children in the Extreme Case condition performed better on both the Learning and Far Transfer trials, suggesting that thought experiments can serve as a learning tool in childhood.

Learning from thought experiments in early childhood

Infants learn better following expectancy violations. Yet it is unknown whether this surprise-induced learning operates across development, is all-or-none or graded, and whether surprise directly mediates it. We addressed these questions by showing adults events depicting varying numbers of violations. In Experiments 1 and 2, adults saw events with 0 to 3 physical violations, then heard a novel verb for the presented action. Adults learned better after observing violations; notably, their learning exhibited a Goldilocks pattern—initially increasing with number of observed violations, then declining. Experiment 3 asked whether this learning enhancement was driven by surprise itself, or by the search for explanations for the surprising events. Adults saw events with different numbers of violations, then rated their surprise and generated candidate explanations. Whereas surprise increased monotonically with violations, explanation-generation exhibited a Goldilocks pattern like that in Experiments 1-2. This suggests that surprise-induced learning may reflect the search for explanations.

Goldilocks Pattern of Learning after Observing Unexpected Physical Events

People of all ages explore the world through looking. Recently, Raz, Cao et al. (2025) built an image-computable model (RANCH) that predicts adults’ and infants’ looking behavior to a large stimulus set, including graded responses to changes in pose, animacy, and number. This model succeeded despite having only a perceptual embedding space of stimuli. However, looking may be influenced by non-perceptual considerations. Using the same data, we found that adults’ behaviors challenge a key assumption of a perceptual-only account: since the perceptual distance between two items is symmetrical, behavior guided only by perceptual space should also be symmetrical. Yet, adults did not treat changes in different directions as mere reciprocal transformations. For instance, adults looked longer at magical appearance than at disappearance. We suggest that image-computable models of looking behavior would benefit from representations of objects, in addition to perceptual features of images.

Surprise isn’t symmetrical: Adults’ looking suggests non-perceptual considerations during dishabituation

Recent work has shown that producing memory ratings during study may lead to greater retention than practice testing in some circumstances (Higham, 2023). This may be related to a phenomenon called judgment of learning (JOL) reactivity, in which making immediate JOLs during study can enhance later recall. However, JOLs and testing have not been directly compared in a typical testing effect (TE) paradigm. This study compared passive restudy, study with immediate JOLs, and testing in a TE paradigm. In Experiment 1, we found no clear TE and only tentative JOL reactivity when word pairs were not semantically related. In Experiment 2, the associative strength of the word pairs was increased. A robust TE emerged along with weak JOL reactivity. Importantly, testing significantly outperformed JOL and passive restudy. These findings are among the first to suggest that semantic relatedness is crucial for the TE and clarify how JOLs compare to testing.

Are You Sure About That? The Impact of Semantic Relatedness on Learning Through Testing, JOLs, and Passive Restudy

This study investigates how heritage Spanish-English bilinguals process sentences with canonical and non-canonical word orders, focusing on inanimate (IA) subjects and objects in subject-verb-object (SVO) and object-verb-subject (OVS) structures. By examining whether participants rely on sentential cues or semantic processing, we aim to test predictions from the Competition Model, which emphasizes cue reliability and validity, and the Good-Enough Processing Model, which suggests reliance on heuristics in challenging syntactic contexts.

Using the Tobii Pro Fusion eye tracker, we are collecting eye movement data from 50 bilingual participants (Spanish AoA: 0-3 years; English AoA: 0-8 years) as they read 80 sentences (40 per language), balanced for verb agreement and randomized to control for order effects. Participants will identify the subject after each sentence and complete tasks assessing language dominance (BLP), vocabulary (LexTALE, LexTALE-ESP), and literacy skills in English and Spanish.

Results will advance our understanding of current theories of sentence processing.

Sentence cues or semantics? Using eye tracking to study sentence processing in heritage Spanish-English bilinguals.

With the rise of Large Language Models (LLMs), interest in simulating interaction dynamics has grown, raising questions about their validity as cognitive models of human discourse. While extensive research focuses on their performance in various applications, we aim to quantify LLM conversational processes akin to traditional human studies. By analyzing how convergence entropy evolves across different conversational tasks, we propose a framework for quantitatively assessing LLMs’ ability to exhibit specific features. This approach offers a pathway to characterizing LLMs for agent-based modeling and broader discourse analysis.

Large Language Model Discourse Dynamics

Effective social navigation requires individuals to infer others' intentions and adjust their movements accordingly to avoid collisions. A key aspect of this process is recursive reasoning, where individuals anticipate that others are also inferring their intentions. In this study, we quantitatively measured whether humans exhibit spontaneous mentalizing during navigation and developed a computational model to demonstrate the basic principle. Then, we introduced a novel framework for quantifying the recursive depth of human mentalizing during social navigation. Using a Doors-choosing task within a VR environment, participants navigated between two doors while avoiding a virtual human. Analyzing choice patterns, confidence levels and walking trajectories, we found that participants engaged in one or two levels of recursion, with respective probabilities of 80% and 20%. This study provides a quantitative estimation of the recursive depth of mentalizing in navigation and establishes a foundation for integrating human recursive reasoning into socially intelligent agents.

Quantifying Recursive Mentalizing Depth for Social Navigation

Humans do not just follow rules and solve problems created by others: we modify those rules, set new goals, and create new problems. In short, we can be inventors and innovators. Creating a good rule or a good problem, however, depends not just on the ideas you come up with but on how you evaluate such proposals. Here, we study invention through the lens of game design. We focus particularly on the early stages of novice, “everyday” game creation, where the stakes are low. We draw on a dataset of over 450 human-created games and conduct a model-based analysis of how people invented new games based on prior experience. We consider two different cognitive mechanisms that may be at work during the early processes of intuitive game invention: an associative proposal based on previous games one has seen, and evaluation based on simulations of play. In particular, we aim to understand two possible evaluation schemes (model-free and model-based) that a commonsense-based game creator may use to refine their initial draft proposals. We find that the generated games are best described by a model which incorporates both rapid model-free evaluations and slower, model-based estimates of game quality at a population level. Our work serves as a step towards understanding the proposal and evaluation process in human invention. See https://sites.google.com/view/gen-eval-game-creation for additional details and preprint.

Generation and Evaluation in the Human Invention Process through the Lens of Game Design

How do people determine whether non-human entities have thoughts and feelings, that is, an inner mental life? Prior work has proposed that people use compact sets of dimensions (e.g., body-heart-mind) to form beliefs about familiar kinds, but how do they generalize to novel entities? Here we investigate emerging beliefs about the mental capacities of large language models (LLMs) and how those beliefs are shaped by how LLMs are portrayed. Participants (N = 470) watched brief videos that encouraged them to view LLMs as machines, tools, or companions, then took a survey measuring mental capacity attributions. We found that, relative to the machine and tool groups, the companion group more strongly endorsed statements regarding a broad array of mental capacities that LLMs might possess, suggesting that people’s beliefs can be rapidly shaped by context. Our study highlights the need to explore the factors shaping people’s beliefs about emerging technologies to promote accurate public understanding.

Portraying Large Language Models as Machines, Tools, or Companions Affects What Mental Capacities People Attribute to Them

The cognitive processes underlying Go/No-Go performance may be explained by two plausible evidence accumulation models: Two-Boundary (2-B) and One-Boundary (1-B) decision drift models (DDMs). While both embed a Go decision, the 2-B DDM embeds a definitive No-Go decision, whereas the 1-B DDM embeds a response window for Go. Using simulations, we found that model comparison methods like leave-one-out cross-validation (LOO), coupled with Bayesian hierarchical modeling, can correctly identify the underlying model. Additionally, using the correct model reduces the risk of missing true effects or detecting spurious findings. Therefore, we recommend that researchers implement and compare both models for Go/No-Go studies to reduce misleading results. Lastly, we implemented these models to investigate race effects in the decision to shoot during police training. We found that the accumulated evidence needed to reach the Shoot decision is lower for Black suspects, which explains the heightened error rates for shooting unarmed Black suspects in the data.
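The difference between the two accumulator structures described in this abstract can be illustrated with a minimal simulation sketch. This is not the authors' implementation and involves no Bayesian hierarchical fitting or LOO comparison; the function name `simulate_trial` and all parameter values (drift, boundaries, time step, window length) are hypothetical choices made only for illustration.

```python
import math
import random


def simulate_trial(drift, upper, lower=None, max_steps=1000, dt=0.001,
                   noise=1.0, rng=random):
    """Simulate one evidence-accumulation trial (illustrative sketch).

    Evidence starts at 0 and takes noisy steps until it crosses a boundary
    or the response window (max_steps * dt seconds) elapses.

    lower is None  -> one-boundary variant: only a Go boundary exists.
    lower is a float -> two-boundary variant: crossing `lower` is an
                        explicit No-Go decision.

    Returns a (label, rt) pair, where label is "go", "nogo", or "timeout"
    and rt is the decision time in seconds (None for "timeout").
    """
    x = 0.0
    step_sd = noise * math.sqrt(dt)  # standard deviation of one noisy step
    for step in range(1, max_steps + 1):
        x += drift * dt + rng.gauss(0.0, step_sd)
        if x >= upper:
            return "go", step * dt
        if lower is not None and x <= lower:
            return "nogo", step * dt
    return "timeout", None


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    for name, lower in [("1-B", None), ("2-B", -1.0)]:
        labels = [simulate_trial(0.5, 1.0, lower, rng=rng)[0]
                  for _ in range(500)]
        summary = {lab: labels.count(lab) for lab in ("go", "nogo", "timeout")}
        print(name, summary)
```

In the one-boundary reading, a `"timeout"` outcome would be scored as a withheld (No-Go) response, whereas in the two-boundary reading a `"nogo"` outcome carries its own decision time.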
