EMNLP 2025

November 06, 2025

Suzhou, China

Would you like to see your presentation here, made available to a global audience of researchers?
Add your own presentation or have us affordably record your next conference.

The efficacy of text embedding models in representing and retrieving information is crucial for many NLP applications, with performance significantly advanced by Large Language Models (LLMs). Despite this progress, existing benchmarks predominantly use general-purpose datasets, inadequately addressing the nuanced requirements of specialized domains like finance. To bridge this gap, we introduce the Finance Massive Text Embedding Benchmark (FinMTEB), a comprehensive evaluation suite specifically designed for the financial domain. FinMTEB encompasses 64 datasets across 7 task types, including classification, clustering, retrieval, pair classification, reranking, summarization, and semantic textual similarity (STS) in English and Chinese. Alongside this benchmark, we introduce Fin-E5, a state-of-the-art finance-adapted embedding model, ranking first on FinMTEB. Fin-E5 is developed by fine-tuning e5-Mistral-7B-Instruct on a novel persona-based synthetic dataset tailored for diverse financial embedding tasks. Evaluating 15 prominent embedding models on FinMTEB, we derive three key findings: (1) domain-specific models, including our Fin-E5, significantly outperform general-purpose models; (2) performance on general benchmarks is a poor predictor of success on financial tasks; and (3) surprisingly, traditional Bag-of-Words (BoW) models surpass dense embedding models on financial STS tasks. This work provides a robust benchmark for financial NLP and offers actionable insights for developing future domain-adapted embedding solutions. Both FinMTEB and Fin-E5 will be open-sourced for the research community.

Downloads

SlidesPaperTranscript English (automatic)

Next from EMNLP 2025

T²: An Adaptive Test-Time Scaling Strategy for Contextual Question Answering
poster

T²: An Adaptive Test-Time Scaling Strategy for Contextual Question Answering

EMNLP 2025

+6Binyang Li
Binyang Li and 8 other authors

06 November 2025

Stay up to date with the latest Underline news!

Select topic of interest (you can select more than one)

PRESENTATIONS

  • All Presentations
  • For Librarians
  • Resource Center
  • Free Trial
Underline Science, Inc.
1216 Broadway, 2nd Floor, New York, NY 10001, USA

© 2025 Underline - All rights reserved