EMNLP 2025

November 07, 2025

Suzhou, China


The Transformer architecture has long dominated the development of large language models, but its quadratic complexity in sequence length presents scalability challenges. Recent advances in State Space Models, particularly the Mamba series, offer a promising alternative with linear-time inference and competitive performance. While scaling model capacity via sparsification, exemplified by Mixture-of-Experts (MoE), has proven effective at reducing computation while expanding knowledge capacity, the integration of sparsification with Mamba remains largely unexplored. Existing attempts typically apply naive block-level stacking and fail to leverage Mamba's internal structure for fine-grained sparsification. In this work, we explore how to sparsify the parameters inside Mamba. We find that sparsification strategies applied to parameters associated with different internal mechanisms of Mamba yield significantly different effects. Our proposed Mamba-MoZ framework introduces a flexible and effective sparsification mechanism inside Mamba that independently achieves parameter scalability and delivers stronger performance.
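To make the contrast between block-level MoE stacking and fine-grained sparsification inside a block more concrete, here is a minimal PyTorch sketch of the general idea: a top-1 routed mixture of experts replaces a single internal projection of a simplified Mamba-style block, rather than routing between whole stacked blocks. The class names (MoELinear, SimplifiedMambaBlock), the top-1 routing scheme, the choice of the input projection as the sparsified parameter, and all dimensions are assumptions of this sketch only; they are not the actual Mamba-MoZ design described in the paper.

```python
# Illustrative sketch only (not the paper's Mamba-MoZ implementation):
# MoE-style routing applied to one internal projection of a simplified
# Mamba-like block, instead of block-level expert stacking.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MoELinear(nn.Module):
    """Top-1 routed mixture of linear experts replacing one dense projection."""

    def __init__(self, d_in, d_out, num_experts=4):
        super().__init__()
        self.experts = nn.ModuleList(nn.Linear(d_in, d_out) for _ in range(num_experts))
        self.router = nn.Linear(d_in, num_experts)

    def forward(self, x):                                  # x: (batch, seq, d_in)
        gates = F.softmax(self.router(x), dim=-1)          # per-token routing weights
        top_w, top_idx = gates.max(dim=-1, keepdim=True)   # top-1 expert per token
        out = torch.zeros(*x.shape[:-1], self.experts[0].out_features, device=x.device)
        for i, expert in enumerate(self.experts):
            mask = top_idx.squeeze(-1) == i                # tokens routed to expert i
            if mask.any():
                out[mask] = top_w[mask] * expert(x[mask])
        return out


class SimplifiedMambaBlock(nn.Module):
    """Toy Mamba-like block: the (assumed) input projection is the single
    parameter we swap for a routed MoE projection; the rest stays dense."""

    def __init__(self, d_model, d_inner, use_moe=False, num_experts=4):
        super().__init__()
        self.in_proj = (MoELinear(d_model, d_inner, num_experts)
                        if use_moe else nn.Linear(d_model, d_inner))
        self.mixer = nn.Conv1d(d_inner, d_inner, kernel_size=3, padding=2, groups=d_inner)
        self.out_proj = nn.Linear(d_inner, d_model)

    def forward(self, x):                                  # x: (batch, seq, d_model)
        h = F.silu(self.in_proj(x))                        # sparsified projection
        h = self.mixer(h.transpose(1, 2))[..., : x.shape[1]].transpose(1, 2)
        return x + self.out_proj(h)


if __name__ == "__main__":
    x = torch.randn(2, 16, 64)
    dense = SimplifiedMambaBlock(64, 128, use_moe=False)
    sparse = SimplifiedMambaBlock(64, 128, use_moe=True, num_experts=4)
    print(dense(x).shape, sparse(x).shape)                 # both: torch.Size([2, 16, 64])
```

In this sketch, only the routed projection's active expert runs per token, so parameter count scales with the number of experts while per-token compute stays roughly constant; which internal Mamba parameters are worth sparsifying this way is exactly the question the paper studies.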


