MSTDiff: Multiscale-Aware Transformer Diffusion Network for Video Object Detection

AAAI 2026

January 24, 2026

Singapore, Singapore

Video object detection is a fundamental yet challenging task in computer vision. Recently, DETR-based methods have gained prominence in this domain owing to their powerful global modeling capabilities. However, these methods are still confronted with two key limitations: frame-agnostic initialization of object queries and scale-agnostic attention mechanisms, which hinder their capability to capture the appearance variations of dynamic objects and model the temporal consistency across frames. To alleviate these limitations, we propose a multiscale-aware transformer diffusion network (MSTDiff), a novel framework designed for the video object detection task, including two technical improvements over existing methods. First, we design a diffusion-driven adaptive query module, which models the object query distribution through a diffusion process conditioned on input frames, enabling an adaptive and content-aware initialization of object queries. Second, we develop a multiscale-aware transformer encoder module, which combines multi-head convolutional units with attention mechanisms to enhance multi-scale feature representations while preserving global dependence modeling. We conduct extensive experiments on the public ImageNet VID dataset, and the results demonstrate that our MSTDiff achieves 87.7% mAP with ResNet-101, outperforming previous state-of-the-art video object detection methods. The code will be made available.
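
The abstract does not specify how the diffusion-driven query module is parameterized, so the following is only a minimal sketch of the generic forward (noising) step that diffusion models of the DDPM family build on, applied to a single box-shaped query. The linear beta schedule, the 1000-step horizon, the (cx, cy, w, h) query layout, and all function names are assumptions for illustration and are not taken from the paper. The conditioning on input frames that the abstract describes is not modeled here.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances (an assumed schedule, not the paper's)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def noise_query(query, t, alpha_bars, rng):
    """Sample x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps."""
    signal = math.sqrt(alpha_bars[t])
    sigma = math.sqrt(1.0 - alpha_bars[t])
    return [signal * v + sigma * rng.gauss(0.0, 1.0) for v in query]


rng = random.Random(0)
alpha_bars = cumulative_alphas(linear_beta_schedule(1000))
query = [0.5, 0.5, 0.2, 0.3]  # hypothetical (cx, cy, w, h) box query
noisy_early = noise_query(query, 10, alpha_bars, rng)    # mostly signal
noisy_late = noise_query(query, 999, alpha_bars, rng)    # close to pure noise
```

In a diffusion-based detector, a learned network would reverse this process, starting from noise and denoising toward queries; per the abstract, MSTDiff conditions that process on the input frames so the initial queries become content-aware.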
