EMNLP 2025

November 05, 2025

Suzhou, China

Knowledge graphs (KGs) enhance pretrained language models by incorporating additional knowledge, improving their performance in specialized fields; for example, they can help models learn domain-specific relationships between documents that might otherwise be missed. In the process industry, text logs contain crucial information about daily operations, such as events, instructions, and incident reports, and are often structured as sparse KGs. This paper explores how SciNCL, a graph-aware neighborhood contrastive learning methodology originally designed for scientific publications, can be adapted to the process industry domain. We use several KGs to train graph embedding (GE) models, which we then use to generate synthetic training datasets for a domain-specific text encoder. Our experiments demonstrate that language models fine-tuned with triplets derived from the GE models outperform a state-of-the-art mE5-large text encoder by 12-13.5% (6.68-7.54 percentage points) on the proprietary Process Industry Text Embedding Benchmark (PITEB) while being 3-5 times smaller.
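
The abstract describes a pipeline of training graph embeddings on a KG, mining triplets from the embedding space, and contrastively fine-tuning a text encoder. The sketch below illustrates the general SciNCL-style recipe under stated assumptions: precomputed graph embeddings, a toy node-to-text mapping, illustrative neighborhood bands, and sentence-transformers' TripletLoss with multilingual-e5-small as the student model. None of these names or parameters come from the paper itself; this is not the authors' implementation.

```python
import numpy as np
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses

# Hypothetical inputs: `node_texts` maps KG node ids to their log text, and
# graph_embeddings.npy holds one trained GE vector per node id.
node_texts = {0: "Pump P-101 tripped on high discharge pressure.",
              1: "Operator acknowledged alarm and restarted pump.",
              2: "Scheduled maintenance logged for heat exchanger E-204."}
ge = np.load("graph_embeddings.npy")  # shape (num_nodes, dim); hypothetical file

def mine_triplets(ge, k_pos=5, neg_band=(20, 50), seed=0):
    """SciNCL-style sampling: positives come from an anchor's nearest GE
    neighbors, negatives from a farther band (hard negatives). Band sizes
    are illustrative and assume the KG has more than neg_band[1] nodes."""
    rng = np.random.default_rng(seed)
    unit = ge / np.linalg.norm(ge, axis=1, keepdims=True)
    sims = unit @ unit.T  # cosine similarity between all node embeddings
    triplets = []
    for i in range(len(ge)):
        order = np.argsort(-sims[i])           # order[0] is the anchor itself
        pos = int(rng.choice(order[1:1 + k_pos]))
        neg = int(rng.choice(order[neg_band[0]:neg_band[1]]))
        triplets.append((i, pos, neg))
    return triplets

# Turn GE-space triplets into (anchor, positive, negative) text examples.
examples = [InputExample(texts=[node_texts[a], node_texts[p], node_texts[n]])
            for a, p, n in mine_triplets(ge)
            if a in node_texts and p in node_texts and n in node_texts]

# Fine-tune a smaller encoder on the synthetic triplets (model choice is
# illustrative; the paper's encoders are 3-5x smaller than mE5-large).
model = SentenceTransformer("intfloat/multilingual-e5-small")
loader = DataLoader(examples, shuffle=True, batch_size=32)
model.fit(train_objectives=[(loader, losses.TripletLoss(model=model))], epochs=1)
```

The key design point the paper leans on is that positives and negatives are chosen by distance in graph-embedding space rather than by text similarity, so the fine-tuned encoder absorbs the KG's domain-specific document relationships.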
