EMNLP 2025

November 08, 2025

Suzhou, China


Natural language explanations (NLEs) are widely used to communicate model reasoning to humans, but they may also serve as effective control signals for improving model performance. In this paper, we present the first comprehensive evaluation of NLEs as prompts in in-context learning (ICL), comparing human-annotated, self-generated, and LLM-generated NLEs across five reasoning benchmarks and three instruction-tuned models (Llama 3 8B, Llama 3 70B, GPT-4o-mini). Our preliminary results show that LLM-generated explanations, especially those from GPT-4o-mini, yield the highest gains across tasks. We further plan to measure whether the faithfulness of self-explanations correlates with their utility, and whether models retain partial robustness even when rationales are randomly mismatched or adversarially swapped.
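The setup described above prepends demonstrations to the query, with each demonstration carrying an NLE alongside its answer. A minimal sketch of such prompt construction is below; the field names and the Question/Explanation/Answer template are illustrative assumptions, not the authors' exact format.

```python
# Sketch: build an in-context-learning prompt where each few-shot
# demonstration includes a natural language explanation (NLE).
# The template below is a hypothetical illustration of the general idea.

def build_nle_prompt(demonstrations, query):
    """Format (question, explanation, answer) demonstrations followed
    by the query question, leaving the explanation slot open for the
    model to fill in before answering."""
    parts = []
    for demo in demonstrations:
        parts.append(
            f"Question: {demo['question']}\n"
            f"Explanation: {demo['explanation']}\n"
            f"Answer: {demo['answer']}\n"
        )
    parts.append(f"Question: {query}\nExplanation:")
    return "\n".join(parts)

demos = [
    {
        "question": "If all squares are rectangles and this shape is a square, is it a rectangle?",
        "explanation": "A square meets the definition of a rectangle, so the property transfers.",
        "answer": "yes",
    }
]
prompt = build_nle_prompt(demos, "Is 17 a prime number?")
```

Swapping the `explanation` field per demonstration is also how the planned mismatched/adversarially swapped rationale conditions could be implemented, keeping questions and answers fixed.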
