VIDEO DOI: https://doi.org/10.48448/cwxf-fr06

poster

ACL 2024

August 22, 2024

Bangkok, Thailand

A Grounded Preference Model for LLM Alignment

keywords: grounded preference model, helpfulness, faithfulness, reward model

Despite recent advances, LLMs still suffer from factual inconsistency and hallucination. A common remedy is retrieval-augmented generation; however, there is no guarantee that the model will strictly adhere to the retrieved grounding. Fundamentally, LLMs need to be aligned to be more faithful to grounding, which requires high-quality preference annotations. This paper investigates whether we can create high-quality grounded preference data for model alignment without using annotations from humans or large proprietary models. We experiment with existing entailment data and propose approaches to generate synthetic grounded preference data, with which we train a Grounded Preference Model (GPM). We demonstrate through Proximal Policy Optimization (PPO) training of Mistral-7B-Instruct that our GPM can successfully align powerful LLMs to generate much better grounded responses, as judged by GPT-4. Moreover, we show that our GPM is also a strong faithfulness classifier, achieving state-of-the-art results on the dialogue sub-tasks of the TRUE faithfulness benchmark. We will release our GPM under the Apache 2.0 license.
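
The abstract does not describe how entailment data is turned into preference data, so the following is only a rough sketch under stated assumptions, not the authors' procedure. It assumes that each entailment example carries a grounding passage, a candidate response, and a binary entailment label, that an entailed response is preferred over a non-entailed one for the same grounding, and that the preference model is trained with a standard Bradley-Terry pairwise loss. The names `EntailmentExample`, `PreferencePair`, `build_preference_pairs`, and `pairwise_loss` are hypothetical.

```python
import math
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class EntailmentExample:
    grounding: str   # premise or retrieved passage (assumed field)
    response: str    # hypothesis or candidate response (assumed field)
    entailed: bool   # True if the grounding supports the response


@dataclass(frozen=True)
class PreferencePair:
    grounding: str
    chosen: str      # response assumed to be more faithful
    rejected: str    # response assumed to be less faithful


def build_preference_pairs(examples):
    """Pair every entailed response with every non-entailed response
    that shares the same grounding (an illustrative assumption)."""
    by_grounding = {}
    for ex in examples:
        supported, unsupported = by_grounding.setdefault(ex.grounding, ([], []))
        (supported if ex.entailed else unsupported).append(ex.response)

    pairs = []
    for grounding, (supported, unsupported) in by_grounding.items():
        for chosen, rejected in product(supported, unsupported):
            pairs.append(PreferencePair(grounding, chosen, rejected))
    return pairs


def pairwise_loss(score_chosen, score_rejected):
    """Bradley-Terry loss -log(sigmoid(s_chosen - s_rejected)),
    written as a numerically stable softplus of the negative margin."""
    margin = score_chosen - score_rejected
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


if __name__ == "__main__":
    data = [
        EntailmentExample("The cafe opens at 9am.", "It opens at 9am.", True),
        EntailmentExample("The cafe opens at 9am.", "It opens at noon.", False),
    ]
    for pair in build_preference_pairs(data):
        print(pair.chosen, ">", pair.rejected)
    # The scores below are made-up numbers standing in for model outputs.
    print(pairwise_loss(1.5, -0.5))
```

In a setup like this, a scoring model trained to minimize `pairwise_loss` over such pairs could then serve as the reward signal during PPO, which is the role the abstract assigns to the GPM.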
