United States

Prosody affects how people produce and understand language, yet studies of how it does so have been hindered by the lack of efficient tools for analyzing prosodic stress. We fine-tune OpenAI Whisper large-v2, a state-of-the-art speech recognition model, to recognize phrasal, lexical, and contrastive stress using a small, carefully annotated dataset. Our results show that Whisper can learn distinct, gender-specific stress patterns to achieve near-human and super-human accuracy in stress classification and transfer its learning from one type of stress to another, surpassing traditional machine learning models. Furthermore, we explore how acoustic context influences its performance and propose a novel black-box evaluation method for characterizing the decision boundaries used by Whisper for prosodic stress interpretation. These findings open new avenues for large-scale, automated prosody research with implications for linguistic theory and speech processing.

CogSci 2025

Prosody in the Age of AI: Insights from Large Speech Models

language comprehension

language production

computational modeling

artificial intelligence

natural language processing

poster

### Welcome to CogSci Conference 2025!

The 47th Annual Meeting of the Cognitive Science Society was a hybrid meeting held in San Francisco. 


#### About

The Cognitive Science Society brings together researchers from around the world who hold a common goal: understanding the nature of the human mind. The mission of the Society is to promote Cognitive Science as a discipline, and to foster scientific interchange among researchers in various areas of study, including Artificial Intelligence, Linguistics, Anthropology, Psychology, Neuroscience, Philosophy, and Education.

The Society is a non-profit professional organization and its activities include sponsoring an annual conference and publishing the journals Cognitive Science and TopiCS.

#### Our History 

* **Society Creation**<br>
The Society was incorporated as a 501(c)(3) non-profit professional organization in Massachusetts in 1979. The organizing committee included Roger Schank, Allan Collins, Donald Norman, and a number of other scholars from psychology, linguistics, computer science, and philosophy. 
<br><br>
* **Conference Creation**<br>
The first conference on cognitive science was held in La Jolla, California, in August 1979, and the conference has been held annually since then. The proceedings of each conference are published, and those from most years are available through Lawrence Erlbaum Associates, Inc. The annual proceedings of the Cognitive Science Conference represent a major source of information on new work and new ideas in the scientific study of thinking. In 1990, the Society, with help from an anonymous donor, established the David Marr Prize for the best student paper at each annual meeting.
<br><br>
* **Journal Creation**<br>
The Journal, Cognitive Science, began publication in 1976, and is now published by Wiley-Blackwell. The Executive Editor is currently Richard P. Cooper of Birkbeck, University of London, and there are 18 Associate Editors and a 30-member editorial board. It serves as the premier outlet for research reports that intersect two or more disciplines. Copyrights for articles published in the journal are held by the Society. The Governing Board of the Cognitive Science Society voted in late 2006 to found a new journal, Topics in Cognitive Science (topiCS). The Editor in Chief is Wayne Gray, Cognitive Science Department, Rensselaer Polytechnic Institute. The journal seeks to fill a niche not occupied by Cognitive Science Journal or other cognitive science journals. Membership in the Society includes a subscription to Cognitive Science and TopiCS. Copyrights for articles published in the journal are held by the Society.
<br><br>

#### Code of Conduct

By attending the CogSci 2025 Conference, you are required to adhere to the society’s **[Code of Conduct](https://drive.google.com/file/d/1ChPuihLy6jE_BWqfO7J2KKgX35JW2zsM/view?usp=sharing)**.
<br><br>



The 47th Annual Meeting of the Cognitive Science Society presents the latest research across cognitive science and highlights the theme of Cognition in Context.

Research has shown that infants prefer prosocial characters over antisocial ones, suggesting that sociomoral evaluation is early-emerging. However, some have argued that infants’ preferential responses stem from low-level perceptual processes rather than true social understanding. Past work using electroencephalography (EEG) has suggested that motivational and social, but not attentional, processes are implicated in infants’ responses to prosocial versus antisocial acts and individuals. However, the majority of past work used a single type of prosocial/antisocial interaction: helping a character to climb a hill. To test the generalizability of past neural findings from the hill paradigm, here we examined infants’ responses in a distinct helping/hindering scenario in which a character tries but fails to open a box and is alternately helped or hindered. Largely replicating past work, infants showed greater activity in social (indexed by the P400) but not attentional (indexed by the Nc) ERP components when seeing hinderers versus helpers, consistent with claims that infants’ responses to prosocial and antisocial agents are social. No evidence of differential approach/avoidance motivation during prosocial/antisocial events was found. These findings support the role of social processes in infants’ sociomoral evaluations.

Examining the Robustness of Neural Correlates of Infants’ Sociomoral Evaluations

Theoretical understanding of neurodevelopmental conditions (NCs) has shifted from a categorical approach to a dimensional one, characterized by an acceptance of comorbidity and heterogeneity. Previous computational modelling of NCs has tended only to accommodate categorical views. The current work presents a mechanistic simulation framework that fits with the dimensional view, using artificial neural networks to model populations of learners, with underlying causes of variation in developmental outcomes viewed as continuous, polygenic, and in part environmental. We show how the dimensional and categorical approaches can be linked using latent profile analysis and outlier methods, recovering profiles and specific deficits from dimensional variation. We show how altering the distribution of hyper-parameters shifts the population composition of developmental profiles and frequencies of deficit patterns, and we test their robustness to stochastic factors.

Categories from dimensions: Population-level computational modelling of neurodevelopmental conditions 

Dynamic Causal Modeling is a widely-used method for examining brain connectivity. Most commonly, it is applied to brain regions showing strong responses to experimental tasks, comparing different network configurations based on the temporal dynamics of the neural signals. It can further be applied to models employing a theory-driven selection of brain regions, showing a weaker experimental effect. However, it is unclear if these effects provide sufficient temporal information for Dynamic Causal Modeling to reliably identify the best-fitting model. This study investigated the regional predictive fit in a theory-driven model which has been found to consistently outperform alternatives using Dynamic Causal Modeling. Results revealed issues with the fit of some regions and subjects, raising concerns regarding the reliability of model comparisons using Dynamic Causal Modeling with regions selected based on theory instead of a strong experimental effect.

Less Than the Sum of its Parts: Complex Models of Cognition Struggle to Capture Regional Activity within Otherwise Well-Fitting Model Structures

Humans often interpret pointing as referring to an object; however, it can also indicate a direction or a relevant spatial location. We investigated which of these interpretations explains 14-month-olds’ responses in a two-alternative choice task. We conducted three experiments in which an experimenter pointed at one of two lateral objects, swapped their positions in full view of the infant, and then allowed the infant to choose. Pointing was produced in an Ostensive Addressing (Experiment 1), Nonostensive Addressing (Experiment 2), or Ostensive Labelling (Experiment 3) context. In the Ostensive Addressing and Ostensive Labelling experiments, infants chose the non-indicated object in the indicated direction significantly more often than predicted by chance. In contrast, in the Nonostensive Addressing experiment, infants’ performance was at chance. These findings suggest that infants follow the direction of pointing rather than interpreting it as indicating a specific object in a communicative context.

What or where? Infants Interpret Pointing as Referring to a Location Rather Than to a Specific Object

Previous studies investigated infants’ ability to recognize turn-taking exchanges of signals that can serve communicative information transfer and draw pragmatic inferences from them. Here we investigate 13-month-olds’ expectations about the distal effects of communicative versus non-communicative actions and explore their understanding of the epistemic causal mechanisms through which communicative signals modify their addressees’ consequent intentional actions. In four looking time experiments (total N = 80), we found that infants understand that communicative signals cannot bring about non-intentional state changes in other entities and expect their distal effects to be limited to inducing intentional behavioral reactions in recipient agents. These results indicate that human infants possess cognitive mechanisms to understand the unique causal affordances of ostensive communicative actions. Coupled with their evolved pragmatic inferential capacities and communicative mindreading skills, these abilities form a specialized cognitive system for interpreting ostensive communicative information exchange between communicating social partners.

Infants’ Expectations About the Kinds of Distal Effects Communicative Actions Can Induce

Previous research has been inconsistent in applying nonverbal IQ as an exclusion criterion when investigating developmental language disorder (DLD) in monolingual and multilingual children. The present study investigates the influence of the controversial low nonverbal IQ range (between one and two standard deviations (SD) below the mean) on lexical and morpho-syntactic abilities. In Germany, 91 multilingual children, aged 4-8;11, were tested on the Crosslinguistic Lexical Task and the Sentence Repetition Task. Data were analyzed using generalized linear mixed models, considering the factors nonverbal IQ, DLD status, age, gender, and length of exposure (LoE) to German. Results show that children with typical language development (TLD) outperformed those with DLD on the LITMUS tests, independent of their nonverbal IQ, supporting the validity of these tools. Language status (TLD/DLD) and LoE had the strongest impact on test performance, exceeding the effect of nonverbal IQ. Regardless of language status, nonverbal IQ affected only receptive vocabulary but not productive vocabulary or morpho-syntax. However, when applying the one SD threshold, its influence shifted from receptive vocabulary to morpho-syntactic abilities. No significant differences were found between average and low nonverbal IQ groups across most tests within the TLD and DLD groups.

The Role of Nonverbal IQ in Diagnosing Developmental Language Disorder in Multilingual Children

Emotion categories are complex and fuzzy concepts that children must learn to identify and differentiate in themselves and others. While prior research has shown that children’s emotion-related vocabulary evolves from broad to narrow as they age, the role of metrics such as word specificity within the development of emotion vocabulary remains under-explored. We use WordNet, a hierarchically-organized lexical database, to study word specificity in interview data collected from children on emotion labeling. We show that as children's age increases, they tend to use increasingly specific emotion words and we also analyze this in the context of concept learning. Further, we show that young children sometimes use words that are typically thought of as not being acquired until an older age, which are selected strategically for the given context. These findings provide new insights into understanding vocabulary and concept learning changes over age that contribute to the learning of fine-grained emotion category labels.

Children's emotion vocabulary learning discloses a growing understanding of specific concepts

Human sensory development unfolds in a consistent temporal sequence, with early visual inputs initially degraded. Rather than mere biological constraints, we propose these developmental “limitations” may act as inductive biases that foster more global and robust sensory cognition. Evidence derives from children born blind who later gained sight, effectively bypassing this early degraded period. Despite many otherwise intact visual abilities, they exhibit specific deficits in generalization and extended spatial integration. Simulations with deep neural networks confirm that these deficits can arise from a lack of early degraded inputs. Conversely, training with developmentally-inspired input trajectories yields more robust representations and superior generalization. These findings help illuminate the development of typical and atypical sensory cognition, inform clinical interventions, and inspire more robust computational training procedures. Comparable results from auditory development suggest a broader phenomenon, demonstrating how what may appear to be “limitations” can adaptively shape perception and cognition over time.

From Degraded Inputs to Robust Sensory Cognition: A Computational Perspective on Early Perceptual Development

Scientific representations and their constituent concepts change over time to reflect improvements in our understanding of the world. Similar improvements in understanding lead to changes in DNN-procured representations and their features. In this paper, we investigate whether useful methodological practices in concept change and in feature change carry across the two types of representations. We argue that there is indeed considerable potential for methodological cross-pollination and offer some examples of how such benefit may be derived.

Concept and Feature Change in Scientific and Deep Neural Net Representations

Proponents of the Symbol Grounding Problem have claimed that unimodal text-based AI systems can never develop meaningful representations of the world since they lack the capacity to perceive it. Perception is a relation between an agent and their environment which is grounded in perceptual processing. The earliest stages of perceptual processing involve receptivity to sources of perceptual signal in the environment: light waves, pressure waves, and volatile airborne chemicals are all sources of perceptual signal, insofar as agents appropriately receptive to their properties can (with further processing) perceive the world through them. I argue that (1) human-generated text carries sufficient information about the world to be a possible source of perceptual signal for appropriately receptive agents, and that (2) recent generations of Large Language Models (LLMs) are such agents. Although (1) and (2) do not entail that LLMs are perceivers, they do entail that symbol grounding is achievable without multimodality.
