EMNLP 2025

November 05, 2025

Suzhou, China


Recent advances in large reasoning models (LRMs) have enabled strong multi-step reasoning capabilities. However, existing machine unlearning algorithms are tailored to standard language modeling and fail to address the unique challenges posed by LRMs. In this work, we present the first systematic study of LRM unlearning and reveal that conventional unlearning methods often overlook critical information leakage in reasoning traces, even when final answers are successfully removed. To address this, we propose Reasoning-aware Representation Misdirection for Unlearning (R²MU), a method that suppresses sensitive reasoning traces while preserving the model’s general reasoning ability. Our experiments demonstrate that R²MU significantly reduces reasoning-trace leakage and achieves strong performance across both reasoning and safety benchmarks, including WMDP, StrongReject, JBB-Behaviors, and WildJailbreak, on state-of-the-art models such as DeepSeek-R1-Distill-LLaMA-8B and DeepSeek-R1-Distill-Qwen-14B. To the best of our knowledge, R²MU is the first principled approach to both expose and mitigate reasoning-trace leakage in LRM unlearning while preserving reasoning ability.
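The abstract does not give implementation details, but the RMU family of methods (which R²MU extends by name) is generally described as steering a chosen layer's hidden states on forget-set inputs toward a scaled random control vector, while anchoring hidden states on retain-set inputs to a frozen copy of the model. The sketch below illustrates that general recipe applied so that the forget objective also covers chain-of-thought tokens, which is the gap the abstract highlights. The layer index, constants `c` and `alpha`, and the helper `get_hidden_states` are illustrative assumptions for an HF-style model, not the paper's actual code.

```python
# Hypothetical sketch of an RMU-style unlearning step extended to reasoning
# traces, as suggested by the abstract. Layer choice, constants, and helper
# names are assumptions, not the paper's implementation.
import torch
import torch.nn.functional as F

def get_hidden_states(model, input_ids, layer):
    """Hidden states at `layer` (assumes an HF-style causal LM)."""
    out = model(input_ids, output_hidden_states=True)
    return out.hidden_states[layer]

def r2mu_style_loss(model, frozen_model, forget_ids, retain_ids,
                    layer=7, c=6.5, alpha=100.0):
    # Random control vector, scaled to norm c: forget activations are
    # misdirected toward it instead of their original representation.
    hidden = get_hidden_states(model, forget_ids, layer)
    control = torch.rand(1, 1, hidden.size(-1), device=hidden.device)
    control = c * control / control.norm()

    # Forget loss over ALL tokens of the forget example -- including the
    # reasoning-trace span -- so sensitive traces are suppressed, not just
    # the final answer.
    forget_loss = F.mse_loss(hidden, control.expand_as(hidden))

    # Retain loss anchors activations on retain data to the frozen model,
    # which is what preserves general reasoning ability.
    with torch.no_grad():
        ref = get_hidden_states(frozen_model, retain_ids, layer)
    retain_loss = F.mse_loss(get_hidden_states(model, retain_ids, layer), ref)

    return forget_loss + alpha * retain_loss
```

Under this reading, the main departure from standard RMU is the scope of the forget objective: it is applied to the model's intermediate reasoning tokens as well as its final answer, while the retain term follows the usual representation-misdirection recipe.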
