AAAI 2026

January 23, 2026

Singapore, Singapore

We present OpenDriveVLA, a Vision-Language Action (VLA) model designed for end-to-end autonomous driving, built upon open-source large language models. OpenDriveVLA generates spatially-grounded driving actions by leveraging multimodal inputs, including both 2D and 3D instance-aware visual representations, ego vehicle states, and language commands. To bridge the modality gap between driving visual representations and language embeddings, we introduce a hierarchical vision-language alignment process, projecting both 2D and 3D structured visual tokens into a unified semantic space. Furthermore, we incorporate structured agent–environment–ego interaction modeling into the autoregressive decoding process, enabling the model to capture fine-grained spatial dependencies and behavior-aware dynamics critical for reliable trajectory planning. Extensive experiments on the nuScenes dataset demonstrate that OpenDriveVLA achieves state-of-the-art results across open-loop trajectory planning and driving-related question-answering tasks. Qualitative analyses further illustrate its superior capability to follow high-level driving commands and robustly generate trajectories under challenging scenarios, highlighting its potential for next-generation end-to-end autonomous driving.
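
As a rough illustration of what "projecting both 2D and 3D structured visual tokens into a unified semantic space" can look like, the sketch below maps two sets of visual tokens with different feature widths into the width of the text embeddings and concatenates everything into one input sequence. This is not the authors' implementation: the function names, the feature widths, and the random weights standing in for learned projectors are all hypothetical, and the hierarchical alignment training described in the abstract is not shown.

```python
import random


def make_projector(in_dim, out_dim, rng):
    """Random (out_dim x in_dim) matrix standing in for a learned linear projector."""
    return [[rng.gauss(0.0, 0.02) for _ in range(in_dim)] for _ in range(out_dim)]


def project(tokens, weight):
    """Apply the linear map to every token vector."""
    return [[sum(w * x for w, x in zip(row, tok)) for row in weight] for tok in tokens]


def build_input_sequence(tokens_2d, tokens_3d, text_embeddings, proj_2d, proj_3d):
    """Project both visual token sets to the text width, then concatenate with the text."""
    visual = project(tokens_2d, proj_2d) + project(tokens_3d, proj_3d)
    return visual + text_embeddings


rng = random.Random(0)
DIM_2D, DIM_3D, DIM_TEXT = 8, 12, 16  # hypothetical feature widths

tokens_2d = [[rng.random() for _ in range(DIM_2D)] for _ in range(3)]
tokens_3d = [[rng.random() for _ in range(DIM_3D)] for _ in range(2)]
text_embeddings = [[rng.random() for _ in range(DIM_TEXT)] for _ in range(4)]

sequence = build_input_sequence(
    tokens_2d,
    tokens_3d,
    text_embeddings,
    make_projector(DIM_2D, DIM_TEXT, rng),
    make_projector(DIM_3D, DIM_TEXT, rng),
)

# 3 + 2 visual tokens and 4 text tokens, all in the shared 16-wide space
print(len(sequence), len(sequence[0]))  # 9 16
```

Once every token shares the same width, a single autoregressive decoder can attend over visual and language tokens together, which is the general motivation for this kind of projection step.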
