AAAI 2026

January 23, 2026

Singapore, Singapore

The proliferation of multi-modal data on the internet has intensified the need for structured event understanding across text and visual modalities. However, existing cross-modal event extraction models suffer from three major limitations: the absence of explicit event schema guidance, coarse-grained multi-modal alignment strategies, and reliance on heterogeneous, misaligned multi-modal training datasets. To address these issues, we propose a Multi-modal Schema-Guided Progressive Instruction Tuning framework (LLaVA-MS-PIT) that explicitly injects structured multi-modal event schema knowledge into the model before event extraction. Specifically, we introduce a textual event schema to establish the model’s prior knowledge of event information and strengthen its ability to reason about event structure, while a visual event schema bridges the representational gap between the textual and visual modalities at the event level, enabling unified and semantically aligned event representations across modalities. Furthermore, to alleviate the data scarcity and modality misalignment inherent in current benchmarks, we construct imSitu-MME, a high-quality multi-modal parallel dataset built via schema-guided data generation and annotation. Extensive experiments demonstrate that LLaVA-MS-PIT achieves competitive performance on multi-modal event extraction benchmarks, indicating the effectiveness and necessity of schema-guided progressive instruction tuning.
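
To make the idea of textual schema guidance concrete, the sketch below shows one way an event schema could be serialized into an instruction prompt ahead of the input text. It is an illustration only: the abstract does not specify the schema format or prompt template used by LLaVA-MS-PIT, so `EventSchema`, `build_schema_prompt`, the event types, the role names, and the output format are all hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class EventSchema:
    """Hypothetical textual event schema: an event type plus its argument roles."""
    event_type: str
    roles: Tuple[str, ...]


def build_schema_prompt(schemas: Sequence[EventSchema], sentence: str) -> str:
    """Place the schema definitions before the input so the model reads them first.

    The template is an assumption made for illustration, not the paper's format.
    """
    lines = ["Event schemas:"]
    for schema in schemas:
        lines.append(f"- {schema.event_type}: {', '.join(schema.roles)}")
    lines.append("")
    lines.append(f"Input: {sentence}")
    lines.append("Extract every event as 'type | trigger | role=argument; ...'.")
    return "\n".join(lines)


# Illustrative schemas; the event types and roles are placeholders.
schemas = [
    EventSchema("Attack", ("Attacker", "Target", "Instrument", "Place")),
    EventSchema("Transport", ("Agent", "Artifact", "Origin", "Destination")),
]
prompt = build_schema_prompt(schemas, "Protesters threw stones at the police station.")
print(prompt)
```

A visual schema would presumably pair the same roles with image regions rather than text spans, but the abstract gives no detail on that representation, so it is left out of this sketch.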
