
AAAI 2026

January 22, 2026

Singapore, Singapore


Knowledge distillation (KD) is a widely adopted technique for transferring the capabilities of large teacher models to smaller student models, thereby significantly reducing inference costs and memory consumption. However, existing KD methods are constrained by an inherently greedy optimization objective rooted in the assumption of teacher superiority: "Trust all teacher-generated outputs (TGOs)" and "Distrust any student-generated outputs (SGOs) unsupported by the teacher". We propose ASKD, a novel KD method with adaptive skewness determined by sample quality, which refines this objective to: "Learn TGOs in proportion to their quality, and distrust only low-quality unsupported SGOs". ASKD comprises three key components: (1) a reinforcement learning-style optimization formulation that mitigates the inherent approximation bias in the sample-based Kullback-Leibler (KL) divergence approximations used by previous KD methods; (2) well-designed quality supervision signals that map sample quality to adaptive skewness in the skewed KL loss, pioneering the use of sample quality to adjust learning magnitudes; (3) a gradient-clip function on high-quality SGOs, motivated by the finding that high-quality SGOs in the KL loss fail to yield positive updates and can even have adverse effects on some samples. Extensive experiments indicate that ASKD builds high-performance student models across various tasks, including instruction following, mathematical reasoning, and code generation, comprehensively outperforming state-of-the-art methods and surpassing GRPO-like approaches that use advantages as multiplicative factors. We also provide detailed mathematical proofs of properties such as Lipschitz continuity of the update coefficient and uniform convergence of the loss function, ensuring theoretical rigor for the key components of ASKD.
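The abstract's core mechanism, a skewed KL loss whose skewness is set per sample from a quality signal, can be sketched as follows. This is a minimal illustrative sketch, not the paper's actual formulation: the skewed KL here is the standard KL(p || λp + (1−λ)q) between teacher p and a teacher/student mixture, and `quality_to_skewness` (including its bounds and the direction of the mapping, where higher quality yields lower skewness and hence a stronger learning signal) is a hypothetical stand-in for the paper's quality supervision signals.

```python
import numpy as np

def skewed_kl(p, q, lam):
    """Skewed KL divergence KL(p || lam*p + (1-lam)*q).
    lam = 0 recovers the standard forward KL(p || q);
    as lam -> 1 the mixture approaches p and the loss shrinks toward 0."""
    mix = lam * p + (1.0 - lam) * q
    return float(np.sum(p * np.log(p / mix)))

def quality_to_skewness(quality, lam_min=0.0, lam_max=0.9):
    """Hypothetical mapping (an assumption, not from the paper):
    higher sample quality -> lower skewness -> closer to full KL,
    so high-quality outputs are learned with larger magnitude."""
    quality = np.clip(quality, 0.0, 1.0)
    return lam_max - quality * (lam_max - lam_min)

# Toy next-token distributions over a 3-symbol vocabulary.
p = np.array([0.7, 0.2, 0.1])  # teacher
q = np.array([0.4, 0.4, 0.2])  # student

# High-quality sample: skewness ~0, full KL; low-quality: heavily skewed, damped loss.
loss_hi = skewed_kl(p, q, quality_to_skewness(1.0))
loss_lo = skewed_kl(p, q, quality_to_skewness(0.0))
```

Under this mapping, `loss_hi` equals the plain forward KL (about 0.184 for these toy distributions) while `loss_lo` is strongly damped, illustrating how sample quality could modulate the learning magnitude per sample.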


