EMNLP 2025

November 07, 2025

Suzhou, China


The frequency distribution of words in human-written texts roughly follows a simple mathematical form known as Zipf's law. Somewhat less well known is the related Heaps' law, which describes a sublinear power-law growth of vocabulary size with document size. We study the applicability of Zipf's and Heaps' laws to texts generated by Large Language Models (LLMs). We empirically show that Heaps' and Zipf's laws only hold for LLM-generated texts in a narrow, model-dependent temperature range. The optimal temperature is close to t = 1 for all base models except the large Llama models, is higher for instruction-finetuned models, and does not depend on model size or prompting. This independently confirms the recent discovery of phase transitions in LLM-generated texts.
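For reference, Zipf's law says the frequency of the r-th most frequent word scales as f(r) ∝ r^(−α) with α ≈ 1, and Heaps' law says vocabulary size grows sublinearly with document length, V(n) ∝ n^β with β < 1. The sketch below shows one plausible way to estimate both exponents on a token sequence. It assumes simple whitespace tokenization and log-log least-squares fits, and the file name generated_text.txt is a placeholder; none of these choices come from the abstract itself.

```python
import numpy as np
from collections import Counter

def heaps_exponent(tokens, num_points=50):
    """Fit Heaps' law V(n) ~ K * n^beta by log-log regression.

    Returns the estimated exponent beta (sublinear growth means beta < 1).
    """
    checkpoints = np.unique(np.linspace(1, len(tokens), num_points, dtype=int))
    seen, sizes = set(), []
    idx = 0
    for n in checkpoints:
        # Extend the running vocabulary with tokens up to position n.
        seen.update(tokens[idx:n])
        idx = n
        sizes.append(len(seen))
    beta, _ = np.polyfit(np.log(checkpoints), np.log(sizes), 1)
    return beta

def zipf_exponent(tokens, max_rank=1000):
    """Fit Zipf's law f(r) ~ C * r^(-alpha) on the top-ranked words."""
    freqs = np.array(sorted(Counter(tokens).values(), reverse=True), dtype=float)
    freqs = freqs[:max_rank]
    ranks = np.arange(1, len(freqs) + 1)
    slope, _ = np.polyfit(np.log(ranks), np.log(freqs), 1)
    return -slope  # slope is negative; alpha is its magnitude

# Whitespace tokenization is a simplification for illustration only.
tokens = open("generated_text.txt").read().lower().split()
print(f"Heaps beta ~ {heaps_exponent(tokens):.2f}")  # roughly 0.7-0.8 for natural text
print(f"Zipf alpha ~ {zipf_exponent(tokens):.2f}")   # roughly 1.0 for natural text
```

Exponents far from these typical values, or a poor log-log fit, would indicate the kind of departure from the two laws that the abstract reports outside the narrow temperature range.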


