AAAI 2026

January 22, 2026

Singapore, Singapore

Large Vision-Language Models (LVLMs) have demonstrated remarkable capabilities, yet their ability to ground language in complex, interactive environments such as video games remains a critical frontier. Existing benchmarks are inadequate for this purpose: real-world datasets like RefCOCO introduce a domain gap; GUI-centric benchmarks lack the complexity of modern game interfaces; and existing game-specific benchmarks are often too simplistic or narrow, failing to assess fine-grained, generalizable grounding capabilities. To address this gap, we propose GGBench, a large-scale, cross-genre benchmark designed to probe the grounding capabilities of LVLMs in diverse gaming scenarios. GGBench features unprecedented genre diversity, encompassing 10 categories including card games, first-person shooters, and role-playing games, with a total of 1335 test images. It focuses on tasks that require connecting natural language instructions to specific in-game objects and UI elements. Experimental results show that existing models perform poorly on GGBench and exhibit weak grounding abilities, especially in complex game scenarios. Because training data for gaming scenarios is limited, fine-tuning these models is also challenging. To overcome this limitation, we propose Game-R1, a novel training method centered on the Grounded Reinforcement Policy Optimization (GRPO) algorithm. GRPO makes the most of limited interaction data and enables robust few-shot generalization across games. Extensive experiments show that Game-R1 significantly outperforms existing LVLMs on GGBench, validating our approach. GGBench provides a solid and comprehensive evaluation platform for subsequent research on agents in gaming environments and is intended to promote development in this field.
