EMNLP 2025

November 05, 2025

Suzhou, China

Would you like to see your presentation here, made available to a global audience of researchers?
Add your own presentation or have us affordably record your next conference.

As large language models (LLMs) are increasingly deployed for complex reasoning tasks, Long Chain-of-Thought (Long-CoT) prompting has emerged as a key paradigm for structured inference. Despite early-stage safeguards enabled by alignment techniques such as RLHF, we identify a previously underexplored vulnerability: reasoning trajectories in Long-CoT models can drift from aligned paths, resulting in content that violates safety constraints. We term this phenomenon Path Drift. Through empirical analysis, we uncover three behavioral triggers of Path Drift: (1) first-person commitments that induce goal-driven reasoning that delays refusal signals; (2) ethical evaporation, where surface-level disclaimers bypass alignment checkpoints; (3) condition chain escalation, where layered cues progressively steer models toward unsafe completions. Building on these insights, we introduce a three-stage Path Drift Induction Framework comprising cognitive load amplification, self-role priming, and condition chain hijacking. Each stage independently reduces refusal rates, while their combination further compounds the effect. To mitigate these risks, we propose a path-level defense strategy incorporating role attribution correction and metacognitive reflection (reflective safety cues). Our findings highlight the need for trajectory-level alignment oversight in long-form reasoning beyond token-level alignment.

Downloads

SlidesPaperTranscript English (automatic)

Next from EMNLP 2025

Pointing to a Llama and Call it a Camel: On the Sycophancy of Multimodal Large Language Models
poster

Pointing to a Llama and Call it a Camel: On the Sycophancy of Multimodal Large Language Models

EMNLP 2025

+4
Jiahui Gao and 6 other authors

05 November 2025

Stay up to date with the latest Underline news!

Select topic of interest (you can select more than one)

PRESENTATIONS

  • All Presentations
  • For Librarians
  • Resource Center
  • Free Trial
Underline Science, Inc.
1216 Broadway, 2nd Floor, New York, NY 10001, USA

© 2025 Underline - All rights reserved