Lecture image placeholder

Premium content

Access to this content requires a subscription. You must be a premium user to view this content.

Monthly subscription - $9.99Pay per view - $4.99Access through your institutionLogin with Underline account
Need help?
Contact us
Lecture placeholder background
VIDEO DOI: https://doi.org/10.48448/tr25-hm47

poster

ACL 2024

August 22, 2024

Bangkok, Thailand

FOFO: A Benchmark to Evaluate LLMs’ Format-Following Capability

keywords:

format following

llm

benchmark

evaluation

This paper presents FoFo, a pioneering benchmark for evaluating large language models' (LLMs) ability to follow complex, domain-specific formats, a crucial yet under-examined capability for their application as AI agents. Despite LLMs' advancements, existing benchmarks fail to assess their format-following proficiency adequately. FoFo fills this gap with a diverse range of real-world formats and instructions, developed through an AI-Human collaborative method. Our evaluation across both open-source (e.g., Llama 2, WizardLM) and closed-source (e.g., GPT-4, PALM2, Gemini) LLMs highlights three key findings: open-source models significantly lag behind closed-source ones in format adherence; LLMs' format-following performance is independent of their content generation quality; and LLMs' format proficiency varies across different domains. These insights suggest the need for specialized tuning for format-following skills and highlight FoFo's role in guiding the selection of domain-specific AI agents. FoFo will be publicly released, contributing a critical tool for advancing LLM evaluation and application.

Downloads

SlidesTranscript English (automatic)

Next from ACL 2024

Can LLMs Learn from Previous Mistakes? Investigating LLMs' Errors to Boost for Reasoning
poster

Can LLMs Learn from Previous Mistakes? Investigating LLMs' Errors to Boost for Reasoning

ACL 2024

+3Dawei Li
Yongqi Tong and 5 other authors

22 August 2024

Stay up to date with the latest Underline news!

Select topic of interest (you can select more than one)

PRESENTATIONS

  • All Lectures
  • For Librarians
  • Resource Center
  • Free Trial
Underline Science, Inc.
1216 Broadway, 2nd Floor, New York, NY 10001, USA

© 2023 Underline - All rights reserved