VIDEO DOI: https://doi.org/10.48448/q8ay-4x54

poster

ACL 2024

August 14, 2024

Bangkok, Thailand

Inference to the Best Explanation in Large Language Models

keywords: natural language explanations, explanation evaluation, causal reasoning, reasoning

While Large Language Models (LLMs) have found success in real-world applications, their underlying explanatory process is still poorly understood. This paper proposes IBE-Eval, a framework inspired by philosophical accounts of Inference to the Best Explanation (IBE) to advance the interpretation and evaluation of LLMs' explanations. IBE-Eval estimates the plausibility of natural language explanations through a combination of explicit logical and linguistic features including consistency, parsimony, coherence, and uncertainty. Extensive experiments are conducted on Causal Question Answering (CQA), where IBE-Eval is tasked to select the most plausible causal explanation amongst competing ones generated by LLMs (i.e., GPT 3.5 and Llama 2). The experiments reveal that IBE-Eval can successfully identify the best explanation with up to 77% accuracy (≈27% above random), improving upon a GPT 3.5-as-a-Judge baseline (≈+17%) while being intrinsically more efficient and interpretable. Additional analyses suggest that, despite model-specific variances, LLM-generated explanations tend to conform to IBE criteria and that IBE-Eval is significantly correlated with human judgment, opening up opportunities for future development of automated explanation verification tools.
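
The selection step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes each of the four criteria has already been reduced to a number in [0, 1], that higher is better for consistency, parsimony, and coherence while lower is better for uncertainty, and that the scores are combined linearly. The `Explanation` class, the `WEIGHTS` values, and the function names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Explanation:
    """A candidate explanation with pre-computed criterion scores (assumed in [0, 1])."""
    text: str
    consistency: float  # assumed: 1.0 if the explanation is logically consistent
    parsimony: float    # assumed: higher means a simpler explanation
    coherence: float    # assumed: higher means steps follow from each other
    uncertainty: float  # assumed: higher means more hedged language


# Placeholder weights, NOT values from the paper. Uncertainty counts against
# an explanation, so its weight is negative.
WEIGHTS = {
    "consistency": 1.0,
    "parsimony": 1.0,
    "coherence": 1.0,
    "uncertainty": -1.0,
}


def plausibility(explanation: Explanation) -> float:
    """Linear combination of the four criterion scores."""
    return sum(w * getattr(explanation, name) for name, w in WEIGHTS.items())


def select_best(candidates: list[Explanation]) -> Explanation:
    """Return the candidate with the highest plausibility score."""
    return max(candidates, key=plausibility)


if __name__ == "__main__":
    a = Explanation("Rain made the road slippery.", 1.0, 0.8, 0.7, 0.1)
    b = Explanation("Perhaps the tires might have been worn.", 1.0, 0.4, 0.5, 0.6)
    print(select_best([a, b]).text)
```

With two competing explanations per question, random selection is correct half the time, which is consistent with the abstract's report that 77% accuracy is roughly 27 points above random.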


