EMNLP 2025

November 05, 2025

Suzhou, China

Spoken language from older adults often deviates from written norms due to omission, disordered syntax, constituent errors, and redundancy, which limits the usefulness of automatic transcripts in downstream tasks. We present COAS2W, a Chinese spoken-to-written corpus of 10,004 utterances from older adults, each paired with a written version, fine-grained error labels, and a four-sentence context. Unlike existing resources, COAS2W captures cross-sentence dependencies that are crucial for resolving ambiguities and recovering missing content. Lightweight open-source models fine-tuned on COAS2W outperform larger closed-source models. A context ablation shows the value of multi-sentence input, and normalization improves performance on downstream translation tasks. COAS2W supports the development of inclusive, context-aware language technologies for older speakers.
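
As a rough illustration of the record structure the abstract describes (a spoken utterance, its written version, error labels, and a four-sentence context), a minimal Python sketch follows. All field names, the `[SEP]` separator, the `build_input` helper, and the choice to place the context before the utterance are assumptions made for illustration; the released corpus may use a different schema. The strings in the usage example are placeholders, not corpus data.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class ErrorType(Enum):
    # The four deviation types named in the abstract.
    OMISSION = "omission"
    DISORDERED_SYNTAX = "disordered_syntax"
    CONSTITUENT_ERROR = "constituent_error"
    REDUNDANCY = "redundancy"


@dataclass
class Record:
    # Hypothetical field names; not taken from the corpus release.
    spoken: str                      # automatic transcript of the utterance
    written: str                     # normalized written version
    errors: List[ErrorType] = field(default_factory=list)
    context: List[str] = field(default_factory=list)  # up to four sentences


def build_input(record: Record, n_context: int = 4, sep: str = " [SEP] ") -> str:
    """Join the last n_context context sentences with the spoken utterance.

    Lowering n_context mimics a context ablation: n_context=0 feeds the
    utterance alone.
    """
    if n_context < 0:
        raise ValueError("n_context must be non-negative")
    # Guard against context[-0:], which would return the whole list.
    kept = record.context[-n_context:] if n_context > 0 else []
    return sep.join(kept + [record.spoken])


if __name__ == "__main__":
    example = Record(
        spoken="placeholder spoken utterance",
        written="placeholder written version",
        errors=[ErrorType.OMISSION, ErrorType.REDUNDANCY],
        context=["context 1", "context 2", "context 3", "context 4"],
    )
    for n in (0, 2, 4):
        print(n, "->", build_input(example, n_context=n))
```

Varying `n_context` in this sketch is one simple way to compare single-sentence and multi-sentence input when preparing data for a spoken-to-written model.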
