EMNLP 2025

November 05, 2025

Suzhou, China

Standard language models employ unique, monolithic embeddings for each token, potentially limiting their ability to capture the multifaceted nature of word meanings. We investigate whether tokens can be more effectively represented through a compositional structure that accumulates diverse semantic facets. To explore this, we propose Aggregate Semantic Grouping (ASG), a novel approach leveraging Product Quantization (PQ). We apply ASG to standard transformer architectures (mBERT, XLM-R, mT5) and evaluate this representational scheme across diverse tasks (NLI, NER, QA). Our findings demonstrate that representing tokens compositionally via ASG yields significant savings in embedding parameters (0.4-0.5%) while maintaining more than 95% of the base model's task performance, even in generative tasks. Furthermore, ASG outperforms prior semantic grouping methods, particularly in preserving nuanced information crucial for zero-shot cross-lingual transfer. These results validate the principle that tokens can be effectively modeled as combinations of shared semantic building blocks. ASG offers a concrete method for achieving this, showcasing how compositional representations can capture linguistic richness while enabling more compact models.
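The abstract names Product Quantization as the mechanism behind ASG but does not spell out the procedure. The following is a minimal sketch of generic PQ applied to an embedding table, not the authors' implementation: each embedding is split into sub-vectors, each sub-space is clustered independently, and a token is then represented by one small integer code per sub-space. The function names (kmeans, pq_fit, pq_lookup), the random toy embeddings, and the sizes (4 sub-spaces, 16 codes each) are all assumptions chosen for illustration.

```python
import numpy as np


def kmeans(x, k, iters=10, seed=0):
    """Plain Lloyd's k-means; returns (centroids, assignments)."""
    rng = np.random.default_rng(seed)
    centroids = x[rng.choice(len(x), size=k, replace=False)].copy()
    for _ in range(iters):
        # Squared distance from every point to every centroid.
        d = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)
        assign = d.argmin(axis=1)
        for j in range(k):
            members = x[assign == j]
            if len(members):  # keep the old centroid if a cluster empties
                centroids[j] = members.mean(axis=0)
    d = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)
    return centroids, d.argmin(axis=1)


def pq_fit(emb, n_sub, n_codes):
    """Split each embedding into n_sub chunks and cluster each chunk separately."""
    vocab, dim = emb.shape
    assert dim % n_sub == 0, "embedding dim must divide evenly into sub-spaces"
    chunks = emb.reshape(vocab, n_sub, dim // n_sub)
    codebooks, codes = [], []
    for s in range(n_sub):
        centroids, assign = kmeans(chunks[:, s, :], n_codes, seed=s)
        codebooks.append(centroids)
        codes.append(assign)
    # codebooks: (n_sub, n_codes, dim // n_sub); codes: (vocab, n_sub)
    return np.stack(codebooks), np.stack(codes, axis=1)


def pq_lookup(codebooks, codes, token_ids):
    """Rebuild embeddings by concatenating the chosen centroid of each sub-space."""
    picked = codes[token_ids]  # (n_tokens, n_sub)
    parts = [codebooks[s][picked[:, s]] for s in range(codebooks.shape[0])]
    return np.concatenate(parts, axis=-1)


# Toy stand-in for a pretrained embedding table (hypothetical sizes).
rng = np.random.default_rng(0)
emb = rng.normal(size=(1000, 32)).astype(np.float32)

codebooks, codes = pq_fit(emb, n_sub=4, n_codes=16)
approx = pq_lookup(codebooks, codes, np.array([0, 5, 999]))

print("reconstructed shape:", approx.shape)
print("float parameters kept:", codebooks.size, "of", emb.size)
```

In this sketch, tokens that share a code in some sub-space share that centroid, which is one way to read the abstract's "shared semantic building blocks". The integer code table still has to be stored alongside the codebooks, and the parameter counts printed here come from the toy sizes only; they are not meant to reproduce the figures reported in the abstract.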
