AAAI 2026

January 25, 2026

Singapore, Singapore

Vision Transformers (ViTs) and their variants have achieved remarkable success on a range of complex tasks, but at the price of high computational cost and significant inference latency. Token pruning reduces this burden by removing redundant or unimportant tokens, lowering both resource consumption and inference time. Existing retraining-free token pruning algorithms accelerate inference well, but their pruning strategies are often limited to locally optimal mask configurations: they do not explore the interdependencies among intra-layer mask variables from a global perspective, which limits the achievable performance of the pruned model. To address these limitations, we propose V-Pruner, a fast and globally informed token pruning framework for Vision Transformers. The framework delivers a fast, streamlined, end-to-end pruning workflow that operates without user intervention. The algorithm consists of three stages: Token Mask Search, Token Mask Rearrangement, and Token Mask Tuning. In the Token Mask Search stage, we use Fisher information to identify key and redundant tokens. In the Token Mask Rearrangement stage, we introduce a reinforcement learning algorithm to explore the global interactions among intra-layer mask variables, overcoming the limitation of traditional methods that rely only on local information and improving overall pruning performance. Finally, in the Token Mask Tuning stage, we adjust the mask variables to recover the accuracy lost during pruning. We evaluate the approach on ViT-L, DeiT-B, DeiT-S, and DeiT-T; the experimental results show that V-Pruner achieves a better balance of accuracy, speed, and FLOPs than existing pruning methods.
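The abstract does not specify how Fisher information is turned into token scores, so the following is only a minimal sketch under stated assumptions, not the V-Pruner implementation. It assumes a common formulation in which each token has a mask variable, the diagonal empirical Fisher information of that variable is the mean squared gradient of the loss over a calibration set, and the lowest-scoring tokens are pruned. The function names, the gradient values, and the top-k selection rule are all hypothetical.

```python
from typing import List


def fisher_token_scores(grads: List[List[float]]) -> List[float]:
    """Diagonal empirical Fisher information per token mask variable.

    grads[n][t] is the gradient of the loss on calibration sample n with
    respect to the mask variable of token t (assumed to be precomputed).
    The score of token t is the mean of the squared gradients over samples.
    """
    num_samples = len(grads)
    num_tokens = len(grads[0])
    return [
        sum(sample[t] ** 2 for sample in grads) / num_samples
        for t in range(num_tokens)
    ]


def keep_top_k_mask(scores: List[float], k: int) -> List[int]:
    """Binary mask that keeps the k highest-scoring tokens (1 = keep, 0 = prune)."""
    ranked = sorted(range(len(scores)), key=lambda t: scores[t], reverse=True)
    kept = set(ranked[:k])
    return [1 if t in kept else 0 for t in range(len(scores))]


# Hypothetical gradients: 3 calibration samples x 4 tokens.
grads = [
    [0.5, 0.0, -1.0, 0.1],
    [0.5, 0.1, 1.0, -0.1],
    [-0.5, 0.0, 2.0, 0.1],
]
scores = fisher_token_scores(grads)
mask = keep_top_k_mask(scores, k=2)
print(mask)  # [1, 0, 1, 0]
```

A purely score-based selection like this treats each mask variable independently, which is the kind of local decision the Token Mask Rearrangement stage is described as improving on.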
