
Language models have great potential as cognitive models for studying human language acquisition, but current models are far less data-efficient than human learners. Children acquire language from 100 million words or less, but large language models are trained on trillions of words. We discuss the prospects for improving language models’ developmental plausibility through a meta-analysis of results from the 2023 BabyLM Challenge. BabyLM was a competition that invited participants to train a language model on a 100 million-word corpus including transcribed speech and child-appropriate texts. Results from over 30 submissions showed that new machine learning techniques and increased training iterations yielded models that outperformed leading large language models in grammar, language understanding, and linguistic generalization, while cognitively plausible approaches such as curriculum learning were less effective. We discuss the implications of these and other findings for computational cognitive modeling and explore ideas to ensure future competitions’ contributions to cognitive science.

**Authors:**

Alex Warstadt: ETH Zurich; Aaron Mueller: Northeastern University; Leshem Choshen: IBM; Ethan Gotlieb Wilcox: ETH Zurich; Chengxu Zhuang: MIT; Adina Williams: Meta Platforms Inc.; Ryan Cotterell: Institute for Machine Learning; Tal Linzen: New York University

CogSci 2024

Insights from the first BabyLM Challenge: Training sample-efficient language models on a developmentally plausible corpus

poster

The 46th Annual Meeting of the Cognitive Science Society was an in-person meeting held in Rotterdam, The Netherlands at the Postillion Hotel & Conference Centre.

**ON-DEMAND PROGRAM ACCESS AFTER THE CONFERENCE** 
Recordings of the invited program are now available. You can access them by clicking on the 'Schedule' icon on the left. Select view by week and navigate to the day and time slot of the recording you wish to view. Click on the time slot to access the recording.

Recordings available (click on each recording to view):

**Thursday, July 25** 
0900: [Keynote Speaker Morgan Barense on Enhancing real-world event memory](https://underline.io/events/465/sessions/17997/lecture/99084-dynamics-between-minds-and-the-environment) 
1000: [Gleitman Award Winner Isabelle Dautriche on Language Foundations: Insights from acquisition, communication, cognition, and more](https://underline.io/events/465/sessions/18000/lecture/99085-gleitman-talk) 
1415: [Invited Symposium: Dynamics between minds and the environment](https://underline.io/events/465/sessions?eventSessionId=18016&searchGroup=lecture) 
1700: [Rumelhart Prize Presentation Speaker Alison Gopnik on Exploit, explore, empower: Three ages and three intelligences](https://underline.io/events/465/sessions/18028/lecture/99161-rp1-rumelhart-prize-presentation) 

**Friday, July 26** 
0900: [C.L. de Carvalho-Heineken Prize Keynote Speaker Kia Nobre on Focusing in memory](https://underline.io/events/465/sessions/18035/lecture/99162-hp1-c-l-de-carvalho-heineken-prize-keynote-address) 
1030: [Rumelhart Symposium: Childhood as exploration](https://underline.io/events/465/sessions/18038/lecture/100346-childhood-as-exploration) 
1415: [Elman Prize Symposium](https://underline.io/events/465/sessions?eventSessionId=18053&searchGroup=lecture) 
1600: [Invited Symposium: Dynamics between minds](https://underline.io/events/465/sessions?eventSessionId=18065&searchGroup=lecture) 
1745: [Keynote Speaker Andrea E. Martin on Neural dynamics encode the structure and statistics of language](https://underline.io/events/465/sessions/18077/lecture/99271-neural-dynamics-encode-the-structure-and-statistics-of-language) 

**Saturday, July 27** 
0900: [Keynote Speaker Gregor Schöner on How higher cognition emerges from the dynamics of strongly interacting neural populations](https://underline.io/events/465/sessions/18083/lecture/99273-dynamics-within-the-mind) 
1000: [Glushko Talks](https://underline.io/events/465/sessions?eventSessionId=18086&searchGroup=lecture) 
1500: [Invited Symposium: Dynamics within minds](https://underline.io/events/465/sessions?eventSessionId=18100&searchGroup=lecture) 


#### **CogSci 2024 Program Book**


**The Program Book can be downloaded [here](https://drive.google.com/file/d/1yRVp1cBqRnAbdYi4GXWV_pZeyWNsRquv/view?usp=sharing).**

#### **Letter from the President**

Welcome to the 2024 Meeting of the Cognitive Science Society in Rotterdam! CogSci 2024 brings together a large community of cognitive scientists who have traveled here from around the world, as well as a smaller group of remote presenters. I would like to extend an especially warm welcome to our first-time CogSci meeting attendees, and hope that their experience inspires them to join our future meetings for many years to come.

I want to thank this year’s conference co-chairs, Larissa K Samuelson, Stefan Frank, Mariya Toneva, Allyson Mackey, and Eliot Hazeltine, for putting together an amazing series of invited talks and symposia around the theme of Dynamics of Cognition. The co-chairs also deserve credit for handling a record-breaking number of submissions and producing a truly exciting conference program.

Beyond the invited talks and symposia dedicated to this year’s conference theme, I would like to highlight the presentations and events honoring the Rumelhart, Elman, Gleitman and C.L. de Carvalho-Heineken prize winners, including the Rumelhart reception on Thursday. As you may know, we are recording all of the invited talks and symposia, and all prize-winners’ talks, and will be making this content available after the conference.

The Cognitive Science Society relies on the volunteer efforts of our Governing Board members who work on matters related to our membership; conference policies; diversity and inclusion, international, and outreach initiatives; prizes and much more. We welcome interest from Society members who would like to become more involved in what we do. The Society is deeply grateful to Rick Dale and Andrea Bender, who serve as editors of the two Society journals, Cognitive Science and Topics in Cognitive Science respectively. We especially acknowledge our editors’ generosity in donating their stipends each year to D&I and outreach efforts.

This year’s Annual Meeting depends critically on the support of Marischal De Armond and his team at Podium Conference Specialists: Sharon Zwack, Cendrine De Vis, Sarah-Kate Burke, and Rachel Waller worked tirelessly to secure the venue and handled a multitude of logistical and organizational issues, from arranging coffee breaks and receptions to making sure that a huge number of posters fit comfortably into the available space. Finally, and importantly, I would like to acknowledge the Cognitive Science Society Executive Officer, Erica Wojcik, for managing the complex and ever expanding sphere of our Society’s activities.

I hope you enjoy this year’s meeting and the many different opportunities to engage with the vibrant community of cognitive scientists gathered here. I also hope you get a chance to explore the great city of Rotterdam with its many attractions, museums, restaurants and scenic views. If you’d like to talk more about our Society, or simply want to say hello, please stop by!

![](https://assets.underline.io/markdown_image/1/image/3c66d0e21fff30e204df1138ffd15173.jpeg)

**Anna Papafragou**  
President, Cognitive Science Society 2023-2024

**CODE OF CONDUCT** 
By attending the CogSci 2024 Conference, you are required to adhere to the society’s [Code of Conduct](https://drive.google.com/file/d/16-6KkptF0Gn3ZYGJDlpqpwTdImw45Ng0/view?usp=drive_link).

**ABOUT THE COGNITIVE SCIENCE SOCIETY** 
The Cognitive Science Society brings together researchers from around the world who hold a common goal: understanding the nature of the human mind. The mission of the Society is to promote Cognitive Science as a discipline, and to foster scientific interchange among researchers in various areas of study, including Artificial Intelligence, Linguistics, Anthropology, Psychology, Neuroscience, Philosophy, and Education.

The Society is a non-profit professional organization and its activities include sponsoring an annual conference and publishing the journals Cognitive Science and TopiCS.


The 46th Annual Meeting of the Cognitive Science Society presents the latest research across cognitive science and highlights the theme of Dynamics of Cognition.

Sleep staging serves as the foundation for sleep assessment and disease diagnosis, constituting a crucial aspect of sleep research. Related work on automatic sleep staging has achieved numerous satisfactory outcomes. However, current research predominantly focuses on using sleep information as classification features, employing time-domain or frequency-domain measures as local features and comprehensive brain network information across channels as global features, while overlooking the spontaneous regularities in brain activity. At the same time, brain microstates are considered closely linked to brain activity and can be used to investigate the regular variations in the overall brain potential. To explore the regular changes in the microstates of brain function during sleep stages based on electroencephalogram (EEG), especially the regular changes in sleep structure, we first conduct microstate clustering on the EEG data during sleep, then characterize the sleep structure of the participants based on these microstates. Subsequently, we integrate the sleep structure with traditional sleep information features and perform automatic sleep staging. Our experiments make the following contributions: (1) We are the first to introduce the use of sleep structure for automatic sleep staging. (2) We show that the model performs well when there are seven or more microstate classes. (3) We propose an automatic sleep staging model that integrates sleep structure and sleep information.

**Authors:**

Ruixiang Liao: Hangzhou Dianzi University; Li Zhu: Hangzhou Dianzi University; Wanzeng Kong: Hangzhou Dianzi University; Zhengyi Wang: Hangzhou Dianzi University

An Automated Sleep Staging Method with EEG-based Sleep Structure Computation

Theory of mind is an essential ability for complex social interaction and collaboration. Researchers in cognitive science and psychology have previously sought to integrate theory of mind capabilities into artificial intelligence (AI) agents to improve collaborative abilities (Cuzzolin, Morelli, Cirstea, & Sahakian, 2020). These approaches, however, are hampered by the need for labor-intensive hand-labeling of datasets, which prevents them from scaling up to large, real-world datasets. To address this challenge, we introduce the Recurrent Conditional Variational Autoencoder (RCVAE), a novel model designed to predict intent from human behavioral trajectories without the prerequisite of hand-labeled data. We show that in the Overcooked-AI environment, the RCVAE outperforms baseline Long Short-Term Memory (LSTM) models in predicting intent, achieving higher prediction accuracy and greater predictive stability. The implications of these results are significant; the RCVAE's proficiency in learning the relationship between basic actions and resulting contextual behaviors, without needing hand-labeled data, will be crucial for scaling from simple to complex, real-world environments.

**Authors:**

Willa Mannering: Johns Hopkins Applied Physics Laboratory; Noah Ford: Johns Hopkins University Applied Physics Lab; Justin J Harsono: Johns Hopkins University Applied Physics Laboratory; John Winder: Johns Hopkins University Applied Physics Laboratory

Generative Artificial Intelligence for Behavioral Intent Prediction

Causal reasoning is a critical aspect of both human cognition and artificial intelligence (AI), playing a prominent role in understanding the relationships between events. Causal Bayesian Networks (CBNs) have been instrumental in modeling such relationships, using directed, acyclic links between nodes in a network to depict probabilistic associations between variables. Deviations from these graphical models’ edicts would result in biased judgments. This study explores one such bias in the causal judgments of humans and Large Language Models (LLMs) by examining two structures in CBNs: Canonical Chain (A→B→C) and Common Cause (A←B→C) networks. In these structures, once the intermediate variable (B) is known, the probability of the outcome (C) is normatively independent of the initial cause (A). However, studies have shown that humans often ignore this independence. We tested the mutually exclusive predictions of three theories that could account for this bias (N=300). Using hierarchical mixed-effect models, we found that humans tend to perceive causes in Chain structures as significantly stronger, providing support for only one of the hypotheses. This increase in perceived causal power might reflect a view of intermediate causes as more reflective of reliable mechanisms, which could in turn stem from our interactions with the world or the way we communicate causality to others. LLMs are primarily trained on language data. Therefore, examining whether they exhibit similar biases in causal reasoning can help us understand the origins of canonical Chain structures’ perceived causal power, while also shedding light on whether LLMs can abstract causal principles. To investigate this, we subjected three LLMs, GPT3.5-Turbo, GPT4, and Luminous Supreme Control, to the same queries as our human subjects, adjusting a key ‘temperature’ hyperparameter. Our findings show that, particularly with higher temperatures (i.e., greater randomness), LLMs exhibit a similar boost in the perceived causal power of Chains, suggesting the bias is at least partly reflected in language use. Similar results across items suggest a degree of causal principle abstraction in the studied models. Implications for causal representation in humans and LLMs are discussed.
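The normative independence the abstract refers to (once B is known, C no longer depends on A) can be checked numerically. The following is a minimal sketch for the Chain structure only, assuming binary variables and made-up probability tables; none of the numbers or function names come from the study.

```python
from itertools import product

# Hypothetical probability tables for binary variables (illustration only,
# not taken from the study).
P_A = {1: 0.3, 0: 0.7}           # P(A=a)
P_B_GIVEN_A = {1: 0.8, 0: 0.1}   # P(B=1 | A=a)
P_C_GIVEN_B = {1: 0.9, 0: 0.2}   # P(C=1 | B=b)


def bernoulli(p_one, value):
    """Probability of `value` (0 or 1) when P(value=1) is `p_one`."""
    return p_one if value == 1 else 1.0 - p_one


def chain_joint():
    """Joint distribution P(A, B, C) for the chain A -> B -> C."""
    joint = {}
    for a, b, c in product((0, 1), repeat=3):
        joint[(a, b, c)] = (
            P_A[a]
            * bernoulli(P_B_GIVEN_A[a], b)
            * bernoulli(P_C_GIVEN_B[b], c)
        )
    return joint


def p_c_given_ab(joint, a, b):
    """P(C=1 | A=a, B=b), computed from the joint by conditioning."""
    numerator = joint[(a, b, 1)]
    denominator = joint[(a, b, 0)] + joint[(a, b, 1)]
    return numerator / denominator


joint = chain_joint()
for b in (0, 1):
    # For a fixed b, the two printed values agree regardless of a.
    print(b, p_c_given_ab(joint, 0, b), p_c_given_ab(joint, 1, b))
```

For each value of B, the conditional probability of C is the same whether A is 0 or 1, which is the screening-off property that human and LLM judgments are compared against. The Common Cause structure factorizes as P(B)·P(A|B)·P(C|B) and yields the same conditional independence.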

**Authors:**

Anita Keshmirian: Forward College; Moritz Willig: Technical University of Darmstadt; Babak Hemmatian: University of Illinois Urbana-Champaign; Kristian Kersting: TU Darmstadt; Ulrike Hahn: Birkbeck, University of London; Tobias Gerstenberg: Stanford University

Chain Versus Common Cause: Biased Causal Strength Judgments in Humans and Large Language Models

Although predicting others' behavior is a fundamental capacity of the human mind, building this intuitive psychology into machines has remained a challenge. To advance this aim, we introduce the Animate Agent World Modeling Benchmark, featuring agents engaged in a diverse repertoire of behaviors, such as goal-directed interactions with objects and multi-agent interactions, all governed by realistic physics. Humans tend to predict the future based on expected events rather than simulating step-by-step. Thus, our benchmark includes a cognitively-inspired evaluation pipeline designed to assess whether the simulated trajectories of world models capture the correct sequences of events. To perform well, models need to leverage predictive cues from the observations in order to accurately simulate the goals of animate agents over long horizons. Although recent developments have incorporated world models into state-of-the-art model-based reinforcement learning (RL) agents, we demonstrate that these models perform poorly in our evaluations. A hierarchical oracle model sets an upper bound for performance, suggesting that in order to excel, a model should scaffold its predictions with abstractions like goals that guide the simulation process towards relevant future events.

**Authors:**

Logan Matthew Cross: Stanford University; Violet Xiang: Stanford University; Nick Haber: Stanford; Daniel Yamins: Stanford University

Animate Agent World Modeling Benchmark

Despite computational algorithms outperforming humans in certain tasks, algorithmic advice is used less than human advice (algorithm aversion). Thus, algorithmic advice should be designed to avoid algorithm aversion. However, few studies have discussed the use of advice with an interval (e.g., 60.0 ± 2.0 %), a common format in algorithmic advice. This study confirmed in two behavioral experiments (N = 200) that advice use differed across advisors and that the advisor mainly influences the process by which judges decide whether to ignore advice. Therefore, this study proposed individualizing the presentation of advice so as to decrease the rate at which judges ignore it. For individualization, we focused on the distance between the advice and the initial judgment, a significant factor in advice utilization. Another behavioral experiment (N = 100) confirmed that our proposed advice design overcomes differences among advisors.

**Authors:**

Rina Kagawa: University of Tsukuba; Hidehito Honda: The University of Tokyo; Hirokazu Nosato: National Institute of Advanced Industrial Science and Technology (AIST)

Advice Design to Increase the Use of Advice with an Interval to Overcome Algorithm Aversion

Multimodal deep neural networks, exemplified by CLIP, have enabled a wide range of downstream applications owing to their excellent performance, which makes understanding CLIP's decision-making process an essential research topic. Due to its complex structure and massive pre-training data, CLIP is often regarded as a black-box model that is too difficult to understand and interpret. Concept-based models map the black-box visual representations extracted by deep neural networks onto a set of human-understandable concepts and use the concepts to make predictions, enhancing the transparency of the decision-making process. However, these methods rely on datasets labeled with fine-grained attributes by experts, which incurs high costs and introduces excessive human prior knowledge and bias. In this paper, we observe the long-tail distribution of concepts, based on which we propose a two-stage Concept Selection Model (CSM) to mine core concepts without introducing any human priors. The concept greedy rough selection algorithm is applied to extract head concepts, and then the concept mask fine selection method performs the extraction of core concepts. Experiments show that our approach achieves comparable performance to end-to-end black-box models, and human evaluation demonstrates that the concepts discovered by our method are interpretable and comprehensible for humans.

**Authors:**

Chenming Shang: Tsinghua University; Hengyuan Zhang: Tsinghua University; Hao Wen: Tsinghua University; Yujiu Yang: Institute of Data and Information

Understanding Multimodal Deep Neural Networks: A Concept Selection View

The advancement of intelligent systems requires the exploration of efficient computational architectures based on emerging electronic computing devices and the effective simulation of biomimetic functions to enhance overall intelligence. Here we design a memristor-based circuit inspired by self-awareness, which can realize bionic adaptive decision-making effectively and more intelligently by mimicking habituation learning mechanisms. Memristors serve as the underlying units of the circuit, supporting the simulation of functions akin to those of biological neurons and synapses. These contribute to the implementation of functionalities such as information filtering, integration, and synaptic plasticity with concise circuit structures and efficient computation. Experimental results indicate that our circuit is capable of supporting rapid and efficient information processing through in-memory analog computing and making more reasonable and intelligent adaptive decisions by integrating biomimetic mechanisms. Extending this work to further research on large-scale decision-making systems holds promise for application in intelligent platforms, aiming to achieve advanced cognitive capabilities.

**Authors:**

Zilu Wang: Harbin Institute of Technology

Memristor-based Bionic Decision-making Circuit Inspired by Self-awareness

The UN Security Council (UNSC) is entrusted with the responsibility of safeguarding global security. Analyzing the cognitive patterns in UNSC debates helps scholars gain insights into the intricacies of international relations and diplomatic discourse. In this study, we focus on the cognitive analysis of debates held within the UNSC. We employ metaphors and their associated concept mappings as a methodological tool to dissect the cognitive nuances present in the debates, spanning from January 1995 to December 2020. To undertake this extensive analysis of a large volume of documents, we leverage MetaPro, a state-of-the-art computational metaphor processing system, to obtain the concept mappings of metaphors. We analyze cognitive variations by temporal and geographical variables. We also demonstrate the correlation between metaphor-reflected cognition and diplomatic behavior, and their recursive influence, based on large sample research. Our findings highlight the mutual impacts of metaphorical cognition and voting behavior at the UN.

**Authors:**

Rui Mao: Nanyang Technological University; Tianwei Zhang: Nanyang Technological University; Qian Liu: School of Computer Science and Engineering; Amir Hussain: Edinburgh Napier University; Erik Cambria: NTUsg

Unveiling Diplomatic Narratives: Analyzing United Nations Security Council Debates Through Metaphorical Cognition

Perception Verbs (PVs) can have, besides their literal interpretation that 'X perceives Y', other interpretations depending on context. For example, in narratives we often find contexts where seeing something introduces a new referent, heralds a pivotal event, or compresses redundant information about characters' inner states. We computationally model the emergence of such pragmatic use in children (4-12y) with recent Language Models (LMs). Since LMs are partly trained on narrative corpora and can model coherence in narratives, we assume that a LM can infer whether using a PV is apt in contexts that humans can recognise as having a pragmatic function. We sample PV contexts from ChiSCor, a corpus of Dutch children's freely-told narratives, and use the confidence of LM predictions to identify developmental patterns in pragmatic use of PVs for children of different ages. Simultaneously, our setup allows us to identify types of pragmatic meaning that LMs still struggle with.

**Authors:**

Bram van Dijk: Leiden University; Max van Duijn: Leiden University; Li Kloostra MA: Utrecht University; Marco Spruit: Leiden University; Barend Beekhuizen: University of Toronto

Modelling Pragmatic Inference in Children's Use of Perception Verbs with Language Models

Language models (LMs) have demonstrated remarkable proficiency in generating linguistically coherent text, sparking discussions about their relevance to understanding human language learnability. However, a significant gap exists between the training data for these models and the linguistic input a child receives. LMs are typically trained on data that is orders of magnitude larger and fundamentally different from child-directed speech (Warstadt & Bowman, 2022; Warstadt et al., 2023; Frank, 2023a). Addressing this discrepancy, our research focuses on training LMs on subsets of a single child’s linguistic input. Previously, Wang, Vong, Kim, and Lake (2023) found that LMs trained in this setting can form syntactic and semantic word clusters and develop sensitivity to certain linguistic phenomena, but they only considered LSTMs and simpler neural networks trained on just one single-child-directed dataset. We build upon previous research by conducting systematic tests on five datasets, comprising single and aggregated child data and a web corpus, using six different model architectures, including Transformers, to investigate whether the results of what is learnable from single-child input observed in previous studies are consistent across different model architectures and datasets. We find that models trained on three single-child datasets demonstrate consistent results, underscoring the robustness of forming meaningful syntactic and semantic representations from a subset of linguistic input specific to an individual child.

**Authors:**

Yulu Qin: New York University; Wentao Wang: New York University; Brenden Lake: NYU
