EMNLP 2025

November 05, 2025

Suzhou, China

Large Language Models (LLMs) often generate errors in their reasoning chains, and these errors can propagate and complicate checking the correctness of intermediate claims. Current LLM-based error detection methods typically take the full reasoning chain as context and output a score for each step. However, such a model can be misled by incorrect steps in the context, and those errors propagate to its judgments of later steps. To address this problem, we draw on how humans typically check the soundness of claims in a reasoning chain and introduce Reasoning Entailment Stability (RES), a novel probabilistic framework that inductively judges each step in a reasoning chain based solely on the previously validated claims. RES achieves 72.1% F1 (+8.2 points) across four benchmarks and 90.3% F1 (+27.6 points) on our controllable dataset with long reasoning chains.
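
As a rough illustration of the inductive judging idea described in the abstract, here is a minimal Python sketch. It is not the paper's implementation: the `entails` scorer (e.g., an LLM or NLI model estimating whether a claim follows from a set of premises), the `verify_chain` helper, and the 0.5 threshold are hypothetical names and choices for exposition only.

```python
# Minimal sketch of inductive, step-wise verification: each step is judged
# against only the previously *validated* claims, so an incorrect step that
# was rejected never enters the context for later judgments.

from typing import Callable, List, Tuple

def verify_chain(
    problem: str,
    steps: List[str],
    entails: Callable[[List[str], str], float],  # hypothetical scorer: P(claim sound | premises)
    threshold: float = 0.5,
) -> List[Tuple[str, float, bool]]:
    validated: List[str] = [problem]  # begin from the problem statement alone
    results: List[Tuple[str, float, bool]] = []
    for step in steps:
        score = entails(validated, step)
        accepted = score >= threshold
        results.append((step, score, accepted))
        if accepted:
            validated.append(step)  # only accepted steps become premises
    return results
```

Because rejected steps never join the premise set, an early mistake cannot contaminate the checking of later steps, which is exactly the error-propagation failure mode of full-chain scoring that the abstract describes.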

Downloads

  • Slides
  • Paper
  • Transcript (English, automatic)
