AAAI 2026

October 05, 2026

Singapore, Singapore

The exponential growth of video content has created an urgent need for efficient multimodal video retrieval systems. However, existing approaches face three critical challenges: (1) fixed-weight fusion strategies fail under cross-modal noise and ambiguous queries, (2) temporal modeling struggles to capture coherent event sequences while penalizing unrealistic gaps, and (3) systems require manual modality selection, which reduces usability. We propose a unified multimodal video retrieval system with three key innovations. First, a cascaded dual-embedding pipeline combines BEiT-3 and SigLIP for broad retrieval, then refines the candidates with BLIP-2-based reranking to balance recall and precision. Second, a temporal-aware scoring mechanism, applied during beam search, imposes exponential decay penalties on large temporal gaps, so that the system constructs coherent event sequences rather than isolated frames. Third, LLM-guided query decomposition (GPT-4o) automatically interprets ambiguous queries, decomposes them into modality-specific sub-queries (visual/OCR/ASR), and performs adaptive score fusion, eliminating manual modality selection. Qualitative analysis demonstrates that our system effectively handles ambiguous queries, retrieves temporally coherent sequences, and dynamically adapts fusion strategies, advancing interactive video search capabilities.
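
The temporal-aware scoring idea can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the scoring formula, so the penalty form exp(-decay_rate * excess_gap), the function name temporal_beam_search, and the parameters decay_rate and max_free_gap are all assumptions made for illustration. Each event in a multi-step query has a list of candidate frames (timestamp in seconds, similarity score), and the search keeps only the best few partial sequences at each step.

```python
import math
from typing import List, Tuple

# A candidate frame: (timestamp_seconds, similarity_score). Hypothetical layout.
Candidate = Tuple[float, float]


def temporal_beam_search(
    candidates_per_event: List[List[Candidate]],
    beam_width: int = 3,
    decay_rate: float = 0.1,     # assumed penalty strength per second
    max_free_gap: float = 10.0,  # assumed gap (seconds) that incurs no penalty
) -> List[Tuple[float, List[Candidate]]]:
    """Return up to beam_width (score, path) pairs, best first.

    A path picks one candidate per event in chronological order. Each
    candidate's similarity is scaled by an exponential decay factor that
    shrinks as the gap to the previous frame exceeds max_free_gap.
    """
    beams: List[Tuple[float, List[Candidate]]] = [(0.0, [])]
    for candidates in candidates_per_event:
        expanded = []
        for score, path in beams:
            for cand in candidates:
                timestamp, similarity = cand
                if path:
                    gap = timestamp - path[-1][0]
                    if gap <= 0:
                        continue  # skip frames that break chronological order
                    excess = max(0.0, gap - max_free_gap)
                    penalty = math.exp(-decay_rate * excess)
                else:
                    penalty = 1.0  # the first event has no previous frame
                expanded.append((score + similarity * penalty, path + [cand]))
        # Keep only the highest-scoring partial sequences.
        expanded.sort(key=lambda beam: beam[0], reverse=True)
        beams = expanded[:beam_width]
    return beams


if __name__ == "__main__":
    events = [
        [(10.0, 0.9), (200.0, 0.95)],  # candidates for the first event
        [(15.0, 0.8), (500.0, 0.85)],  # candidates for the second event
    ]
    for score, path in temporal_beam_search(events):
        print(round(score, 3), [t for t, _ in path])
```

In this toy input, the frame at 200 s has the highest individual similarity for the first event, but pairing it with any second-event frame either breaks chronological order or incurs a large gap penalty, so the sequence of frames at 10 s and 15 s ranks first. That is the behavior the abstract describes as preferring coherent event sequences over isolated high-scoring frames.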
