Lecture image placeholder

Premium content

Access to this content requires a subscription. You must be a premium user to view this content.

Monthly subscription - $9.99Pay per view - $4.99Access through your institutionLogin with Underline account
Need help?
Contact us
Lecture placeholder background
VIDEO DOI: https://doi.org/10.48448/xgpn-d402

poster

ACL 2024

August 12, 2024

Bangkok, Thailand

MathGenie: Generating Synthetic Data with Question Back-translation for Enhancing Mathematical Reasoning of LLMs

keywords:

large language model

mathematical reasoning

data augmentation

Large language models (LLMs) have exhibited great potential in mathematical reasoning. However, there remains a performance gap in this area between existing open-source models and closed-source models such as GPT-4. In this paper, we introduce MathGenie, a novel method for generating diverse and reliable math problems by leveraging the ground-truth solutions of the seed data. We augment these ground-truth solutions and use a specially finetuned model to translate these augmented solutions back into new questions. Subsequently, we generate code-integrated solutions for these questions. To ensure the correctness of the code-integrated solutions, we employ rationale-based verification for filtering. Then, we finetune various pretrained models, ranging from 7B to 70B, on the newly curated data, resulting in a family of models known as MathGenie. These models consistently outperform previous open-source models across five representative mathematical reasoning datasets, achieving state-of-the-art performance. In particular, MathGenie-InternLM2 achieves an accuracy of 87.7% on GSM8K and 55.7% on MATH, securing the best overall score.

Downloads

SlidesTranscript English (automatic)

Next from ACL 2024

On the Role of Long-tail Knowledge in Retrieval Augmented Large Language Models
poster

On the Role of Long-tail Knowledge in Retrieval Augmented Large Language Models

ACL 2024

+5
Dongyang Li and 7 other authors

12 August 2024

Stay up to date with the latest Underline news!

Select topic of interest (you can select more than one)

PRESENTATIONS

  • All Lectures
  • For Librarians
  • Resource Center
  • Free Trial
Underline Science, Inc.
1216 Broadway, 2nd Floor, New York, NY 10001, USA

© 2023 Underline - All rights reserved