VIDEO DOI: https://doi.org/10.48448/1hz5-0m20

poster

ACL 2024

August 13, 2024

Bangkok, Thailand

Accurate and Nuanced Open-QA Evaluation Through Textual Entailment

keywords:

textual entailment

open-domain question answering

question answering

evaluation

Open-domain question answering (Open-QA) is a common task for evaluating large language models (LLMs). However, current Open-QA evaluations are criticized for ambiguity in the questions and for evaluators that lack semantic understanding. Even complex evaluators that use foundation models or LLMs to judge semantic equivalence still deviate from human judgments by a large margin. We propose to study the entailment relations between answers to identify system answers that are more informative or more general than the reference. This yields an evaluation much closer to human judgment on both NaturalQuestions and TriviaQA while remaining learning-free. The proposed entailment-based evaluation also allows bonus or partial marks to be assigned by quantifying the inference gap between answers, enabling a nuanced ranking of answer correctness that achieves a higher AUC than current methods.
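
The idea of grading by entailment direction can be illustrated with a minimal sketch. This is not the authors' implementation: the `entails` oracle, the verdict labels, and the numeric marks in `MARKS` are all hypothetical names and values chosen for illustration, and the fixed marks stand in for the paper's quantification of the inference gap, which the abstract does not specify. In practice the oracle would presumably wrap an NLI model or an LLM prompt; here it is a toy lookup so the example runs on its own.

```python
from typing import Callable

# Hypothetical oracle type: entails(premise, hypothesis) -> bool.
# A real evaluator would back this with an NLI model or an LLM; this is a stub.
Entails = Callable[[str, str], bool]

# Illustrative marks only. These values are assumptions, not taken from the paper.
MARKS = {
    "equivalent": 1.0,
    "more_informative": 1.0,
    "more_general": 0.5,
    "unrelated": 0.0,
}


def entailment_verdict(gold: str, system: str, entails: Entails) -> str:
    """Classify a system answer by the direction of entailment with the gold answer."""
    forward = entails(system, gold)   # system answer implies the gold answer
    backward = entails(gold, system)  # gold answer implies the system answer
    if forward and backward:
        return "equivalent"
    if forward:
        return "more_informative"
    if backward:
        return "more_general"
    return "unrelated"


def score(gold: str, system: str, entails: Entails) -> float:
    """Map the verdict to an illustrative mark."""
    return MARKS[entailment_verdict(gold, system, entails)]


# Toy oracle: identical strings entail each other, plus one hand-written pair.
_TOY_PAIRS = {("Bangkok, Thailand", "Thailand")}


def toy_entails(premise: str, hypothesis: str) -> bool:
    return premise == hypothesis or (premise, hypothesis) in _TOY_PAIRS


if __name__ == "__main__":
    for gold, system in [
        ("Thailand", "Thailand"),
        ("Thailand", "Bangkok, Thailand"),
        ("Bangkok, Thailand", "Thailand"),
        ("Thailand", "Paris"),
    ]:
        print(gold, "|", system, "|", entailment_verdict(gold, system, toy_entails))
```

Checking both directions is what separates this from an equivalence test: an answer that is strictly more specific or strictly more general than the reference gets its own verdict instead of being lumped in with wrong answers.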
