EMNLP 2025

November 07, 2025

Suzhou, China

Efficient resume parsing is critical for global hiring, yet progress is hindered by the absence of dedicated benchmarks for evaluating large language models (LLMs) on multilingual, structure-rich resumes. To address this, we introduce ResumeBench, the first privacy-compliant benchmark comprising 2,500 synthetic resumes spanning 50 templates, 30 career fields, and 5 languages. The resumes are generated through a human-in-the-loop pipeline that prioritizes realism, diversity, and privacy compliance, and they are validated against real-world resumes. We evaluate 24 state-of-the-art LLMs on ResumeBench and find substantial variation in how they handle resume complexity. Specifically, top-performing models such as GPT-4o struggle with cross-lingual structural alignment, while smaller models show inconsistent scaling effects. Code-specialized LLMs underperform generalist models. JSON outputs improve schema compliance but do not resolve semantic ambiguities. Our findings underscore the need for domain-specific optimization and hybrid training strategies to improve structural and contextual reasoning in LLMs.


