EMNLP 2025

November 07, 2025

Suzhou, China

Encoder models offer efficiency for specific tasks, but their performance depends on data availability. While Large Language Models (LLMs) excel at few-shot learning, their direct application in real-world scenarios is often hindered by their high computational cost. To address this challenge, we propose a simple yet effective approach that uses LLMs for data generation and scoring to improve the performance of encoder-only models. We evaluate this framework on few-shot Multiple Choice Question Answering (MCQA), an important task where acquiring labeled data is costly. Our approach uses LLMs to create MCQA questions and choices (exploring both direct JSON and decomposed generation methods) and to assign probability scores to these choices. The generated data and the LLM scores are then used to fine-tune a smaller, more efficient DeBERTa-v3-base model with a distillation loss. Extensive experiments on the MMLU benchmark demonstrate that our method improves accuracy from 28.9% to 39.3%, a gain of over 10 percentage points compared to a baseline fine-tuned directly on 5-shot examples. This shows the effectiveness of LLM-driven data generation and knowledge distillation for few-shot MCQA.
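
The abstract does not spell out the form of the distillation loss, so the following is only a minimal sketch of one common choice: the KL divergence between the LLM's probability scores over the answer choices (teacher) and the encoder's softmax over its per-choice logits (student). The function names, the `temperature` parameter, and the example numbers are illustrative assumptions, not details taken from the paper.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of per-choice logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_probs, temperature=1.0, eps=1e-12):
    """KL(teacher || student) over the answer choices of one question.

    Assumption: teacher_probs are the LLM-assigned probabilities for each
    choice and already sum to 1; student_logits are the encoder's raw scores.
    """
    student_probs = softmax(student_logits, temperature)
    return sum(
        t * (math.log(t + eps) - math.log(s + eps))
        for t, s in zip(teacher_probs, student_probs)
        if t > 0  # terms with zero teacher mass contribute nothing
    )

# Hypothetical four-choice question: soft LLM scores vs. student logits.
teacher_probs = [0.70, 0.15, 0.10, 0.05]
student_logits = [2.0, 0.5, 0.1, -1.0]
loss = distillation_loss(student_logits, teacher_probs)

# A student that reproduces the teacher distribution exactly has ~zero loss.
matched_logits = [math.log(p) for p in teacher_probs]
matched_loss = distillation_loss(matched_logits, teacher_probs)
```

Training on soft scores like these, rather than only on a single hard label, is what lets the student use the LLM's relative confidence across all choices; in practice this per-question loss would be averaged over a batch and minimized with a standard optimizer.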
