EMNLP 2025

November 06, 2025

Suzhou, China


We propose Paired by the Teacher (PbT), a two-stage teacher–student pipeline that synthesizes accurate input–output pairs without human labeling or existing parallel data. In many low-resource natural language generation (NLG) scenarios, practitioners have only raw outputs (recaps, highlights, or questions) or only raw inputs (dialogues, articles, or paragraphs), and seldom both sides of the parallel data unless they pay for human labeling. This mismatch forces small models either to learn from very few examples or to rely on costly, broad-scope synthetic examples produced by large LLMs. In PbT, a teacher LLM first transforms each unpaired example into a concise intermediate representation (IR), and a student model learns to invert this transformation, reconstructing the original input from the IR. Each output can then be paired with an input generated for it, yielding high-quality paired data. We evaluate PbT on five benchmarks: dialogue summarization (SAMSum, DialogSum), document summarization (XSum, CNNDM), and question generation (SQuAD), plus an unpaired setting on SwitchBoard (paired with DialogSum summaries). An 8B student trained only on PbT data outperforms models trained on corpora generated by a 70B teacher, as well as other unsupervised baselines, closing the gap to human-annotated pairs to within 2 ROUGE points. Human evaluation on SwitchBoard further confirms that only PbT meets target summary lengths with concise, faithful outputs, while all baselines remain overly verbose.
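The data flow described in the abstract can be sketched roughly as follows. This is an illustrative reading, not the authors' implementation: the function names, the split into two helper functions, and the toy teacher and student are all assumptions made here so the example runs without any model. In practice the teacher would be an LLM call and the student a fine-tuned model.

```python
from typing import Callable, List, Tuple

Pair = Tuple[str, str]


def build_student_training_set(
    raw_inputs: List[str],
    teacher_to_ir: Callable[[str], str],
) -> List[Pair]:
    """Stage 1 (assumed reading): the teacher compresses each raw input to an IR.

    The resulting (IR, input) pairs are what a student would be trained on
    so that it learns to reconstruct an input from its IR.
    """
    return [(teacher_to_ir(x), x) for x in raw_inputs]


def synthesize_pairs(
    raw_outputs: List[str],
    teacher_to_ir: Callable[[str], str],
    student_from_ir: Callable[[str], str],
) -> List[Pair]:
    """Stage 2 (assumed reading): pair each raw output with a generated input.

    The teacher maps the output to an IR, and the student expands that IR
    into a synthetic input, giving one (input, output) pair per raw output.
    """
    pairs = []
    for y in raw_outputs:
        ir = teacher_to_ir(y)
        pairs.append((student_from_ir(ir), y))
    return pairs


# Toy stand-ins so the sketch is runnable; both are hypothetical.
def toy_teacher(text: str) -> str:
    """Keep lowercased words longer than three characters as a keyword IR."""
    words = [w.lower().strip(".,") for w in text.split() if len(w) > 3]
    return " | ".join(words)


def toy_student(ir: str) -> str:
    """Expand a keyword IR into a fake one-turn 'dialogue'."""
    return "A: " + ir.replace(" | ", " ") + "?"


train_set = build_student_training_set(["Alice will meet Bob at noon."], toy_teacher)
synthetic = synthesize_pairs(["Alice meets Bob."], toy_teacher, toy_student)
```

Here `train_set` holds the (IR, input) examples a student would learn from, and `synthetic` holds the (generated input, original output) pairs that would then train the downstream NLG model.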


