
CogSci 2024

July 25, 2024

Rotterdam, Netherlands


Although predicting others' behavior is a fundamental capacity of the human mind, building this intuitive psychology into machines has remained a challenge. To advance this aim, we introduce the Animate Agent World Modeling Benchmark, which features agents engaged in a diverse repertoire of behaviors, such as goal-directed interactions with objects and multi-agent interactions, all governed by realistic physics. Humans tend to predict the future based on expected events rather than by simulating it step by step. Thus, our benchmark includes a cognitively inspired evaluation pipeline designed to assess whether the trajectories simulated by world models capture the correct sequences of events. To perform well, models need to leverage predictive cues from their observations in order to accurately simulate the goals of animate agents over long horizons. Although recent work has incorporated world models into state-of-the-art model-based reinforcement learning (RL) agents, we demonstrate that these models perform poorly in our evaluations. A hierarchical oracle model sets an upper bound for performance, suggesting that in order to excel, a model should scaffold its predictions with abstractions, such as goals, that guide the simulation process towards relevant future events.
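To make the idea of event-level evaluation concrete, here is a minimal sketch of how a simulated trajectory might be scored by its sequence of events rather than by step-by-step position error. It is not the benchmark's actual pipeline: the event definition (an agent coming within a fixed radius of an object), the function names `extract_events` and `event_sequence_match`, the `radius` parameter, and the toy coordinates are all assumptions made for illustration.

```python
from math import dist


def extract_events(trajectory, objects, radius=1.0):
    """Return the ordered list of objects the agent approaches.

    trajectory: list of (x, y) agent positions, one per time step.
    objects: dict mapping a hypothetical object name to a static (x, y) position.
    An event fires when the agent enters an object's radius, so lingering
    near the same object for several steps counts only once.
    """
    events = []
    near = set()  # objects the agent was near on the previous step
    for position in trajectory:
        now_near = {
            name for name, location in objects.items()
            if dist(position, location) <= radius
        }
        # Record only newly entered objects, sorted for a deterministic order.
        events.extend(sorted(now_near - near))
        near = now_near
    return events


def event_sequence_match(simulated, ground_truth, objects, radius=1.0):
    """True if both trajectories produce the same ordered event sequence."""
    return (extract_events(simulated, objects, radius)
            == extract_events(ground_truth, objects, radius))


# Toy scene with two objects (illustrative values only).
objects = {"ball": (5.0, 0.0), "box": (0.0, 5.0)}
ground_truth = [(0, 0), (2, 0), (4.5, 0), (4.5, 0.2), (2, 2), (0, 4.5)]

# Different path and length, but the same events in the same order.
simulated_good = [(0, 0), (3, 0.5), (5, 0.5), (1, 4), (0, 5)]
# Visits the same objects in the wrong order.
simulated_bad = [(0, 0), (0, 2), (0, 4.5), (2, 2), (4.5, 0)]

print(extract_events(ground_truth, objects))
print(event_sequence_match(simulated_good, ground_truth, objects))
print(event_sequence_match(simulated_bad, ground_truth, objects))
```

In this sketch, a simulation that deviates from the ground-truth positions at every step can still be counted as correct as long as it reaches the same objects in the same order, while one that visits them in the wrong order is not.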

Authors:

Logan Matthew Cross: Stanford University; Violet Xiang: Stanford University; Nick Haber: Stanford University; Daniel Yamins: Stanford University


