
Antonio Torralba
reasoning
language grounding
large language models
robotics
theory of mind
multimodal models
2 presentations
3 views
SHORT BIO
My research is in the areas of computer vision, machine learning, and human visual perception. I am interested in building systems that can perceive the world as humans do. Although my work focuses on computer vision, I am also interested in other modalities such as audition and touch. A system able to perceive the world through multiple senses might be able to learn without requiring massive curated datasets. Other interests include understanding neural networks, common-sense reasoning, computational photography, building image databases, ..., and the intersections between visual art and computation.
Presentations

MMToM-QA: Multimodal Theory of Mind Question Answering
Chuanyang Jin and 9 other authors

Skill Induction and Planning with Latent Language
Pratyusha Sharma and 2 other authors