AAAI 2026

January 23, 2026

Singapore, Singapore

Would you like to see your presentation here, made available to a global audience of researchers?
Add your own presentation or have us affordably record your next conference.

Recent advances in multi-modal large language models (MLLMs) have significantly improved object-level grounding and region captioning. However, they remain limited in visual relation understanding, struggling even with binary relation detection, let alone N-ary relations involving multiple semantic roles. The core reason is the lack of modeling for structural semantic dependencies among multi-entities, leading to over-reliance on language priors (e.g., defaulting to "person drinks a milk" if a person is merely holding it). To this end, we propose Relation-R1, the first unified relation comprehension framework that explicitly integrates cognitive chain-of-thought (CoT)-guided supervised fine-tuning (SFT) and group relative policy optimization (GRPO) within a reinforcement learning (RL) paradigm. Specifically, we first establish foundational reasoning capabilities via SFT, enforcing structured outputs with thinking processes. Then, GRPO is utilized to refine these outputs via multi-rewards optimization, prioritizing visual-semantic grounding over language-induced biases, thereby improving generalization capability. Furthermore, we investigate the impact of various CoT strategies within this framework, demonstrating that a specific-to-general progressive approach in CoT guidance further improves generalization, especially in capturing synonymous N-ary relations. Extensive experiments on widely-used PSG and SWiG datasets demonstrate that Relation-R1 achieves state-of-the-art performance in both binary and N-ary relation understanding.

Downloads

SlidesPaper

Next from AAAI 2026

Introducing Visual Scenes and Reasoning: A More Realistic Benchmark for Spoken Language Understanding
poster

Introducing Visual Scenes and Reasoning: A More Realistic Benchmark for Spoken Language Understanding

AAAI 2026

+7
Bianjing . and 9 other authors

23 January 2026

Stay up to date with the latest Underline news!

Select topic of interest (you can select more than one)

PRESENTATIONS

  • All Presentations
  • For Librarians
  • Resource Center
  • Free Trial
Underline Science, Inc.
1216 Broadway, 2nd Floor, New York, NY 10001, USA

© 2025 Underline - All rights reserved