AAAI 2026

January 25, 2026

Singapore, Singapore

Efficient Multimodal Large Language Models (MLLMs) compress vision tokens to reduce resource consumption, but the resulting loss of visual information can degrade comprehension capabilities. Although Knowledge Distillation can enhance student models through teacher guidance, existing methods overlook the fundamental differences in fine-grained vision comprehension caused by the unbalanced numbers of vision tokens in teacher and student. In this paper, we propose EM-KD, a novel paradigm that enhances Efficient MLLMs with Knowledge Distillation. First, we calculate the Manhattan distance between the vision logits of the teacher and the student, and align them in the spatial dimension with the Hungarian algorithm to resolve the imbalance. After alignment, EM-KD introduces two key designs: 1) Vision-Language Affinity Distillation and 2) Vision-Semantic Distillation. Specifically, we calculate the affinity matrix between text tokens and aligned vision tokens, and minimize the smooth L1 distance between the student and teacher affinity matrices. Considering the semantic richness of vision logits in the final layer, we employ the reverse KL divergence to measure the discrepancy between the discrete probability distributions of the aligned teacher and student vision logits over the vocabulary space. Comprehensive evaluation on diverse benchmarks demonstrates that the EM-KD-trained model outperforms prior Efficient MLLMs in both accuracy and efficiency, validating its effectiveness.
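The pipeline described in the abstract can be illustrated with a minimal pure-Python sketch on toy data. It is not the authors' implementation. Several details are assumptions made only for illustration: each student vision token is matched to one distinct teacher token, the affinity is a plain dot product between hidden features, the smooth L1 threshold `beta` is 1.0, and both losses are simple means. All function and variable names are hypothetical.

```python
import math

def manhattan(p, q):
    """Manhattan (L1) distance between two equal-length vectors."""
    return sum(abs(a - b) for a, b in zip(p, q))

def hungarian(cost):
    """Minimum-cost assignment of every row to a distinct column (rows <= cols)."""
    n, m = len(cost), len(cost[0])
    u, v = [0.0] * (n + 1), [0.0] * (m + 1)   # dual potentials
    p, way = [0] * (m + 1), [0] * (m + 1)     # p[j]: row matched to column j (1-based)
    for i in range(1, n + 1):
        p[0], j0 = i, 0
        minv, used = [math.inf] * (m + 1), [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], math.inf, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:          # reached a free column
                break
        while j0:                   # flip the augmenting path
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    match = [0] * n
    for j in range(1, m + 1):
        if p[j]:
            match[p[j] - 1] = j - 1
    return match                    # match[row] = assigned column

def log_softmax(logits):
    top = max(logits)
    log_z = top + math.log(sum(math.exp(x - top) for x in logits))
    return [x - log_z for x in logits]

def reverse_kl(student_logits, teacher_logits):
    """KL(student || teacher) over the vocabulary dimension."""
    ls, lt = log_softmax(student_logits), log_softmax(teacher_logits)
    return sum(math.exp(a) * (a - b) for a, b in zip(ls, lt))

def smooth_l1(x, y, beta=1.0):
    d = abs(x - y)
    return 0.5 * d * d / beta if d < beta else d - 0.5 * beta

def affinity(text, vision):
    """Assumed affinity: dot product of each text token with each vision token."""
    return [[sum(a * b for a, b in zip(t, v)) for v in vision] for t in text]

# Toy data: 3 teacher vision tokens, 2 (compressed) student vision tokens, vocab size 3.
teacher_logits = [[2.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 2.0]]
student_logits = [[0.1, 0.0, 1.9], [1.8, 0.2, 0.0]]

# Step 1: Manhattan cost matrix and Hungarian matching (student rows, teacher columns).
cost = [[manhattan(s, t) for t in teacher_logits] for s in student_logits]
match = hungarian(cost)
aligned_teacher_logits = [teacher_logits[j] for j in match]

# Step 2: reverse KL between student and aligned teacher vision logits.
vsd_loss = sum(reverse_kl(s, t) for s, t in
               zip(student_logits, aligned_teacher_logits)) / len(match)

# Step 3: smooth L1 between text-to-vision affinity matrices (toy hidden features).
teacher_text, student_text = [[1.0, 0.0], [0.0, 1.0]], [[0.9, 0.1], [0.1, 0.9]]
teacher_vision = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
student_vision = [[0.0, 0.8], [0.9, 0.0]]
a_t = affinity(teacher_text, [teacher_vision[j] for j in match])
a_s = affinity(student_text, student_vision)
pairs = [(x, y) for rs, rt in zip(a_s, a_t) for x, y in zip(rs, rt)]
vlad_loss = sum(smooth_l1(x, y) for x, y in pairs) / len(pairs)

print(match, vsd_loss, vlad_loss)
```

On this toy input the matching pairs student token 0 with teacher token 2 and student token 1 with teacher token 0, so both losses compare tokens that carry similar content rather than tokens that merely share an index. In a real training setup the matching would more likely be delegated to a library routine such as `scipy.optimize.linear_sum_assignment`, and the losses would be computed on tensors in a deep learning framework.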
