EMNLP 2025

November 07, 2025

Suzhou, China


Temporal reasoning and planning are essential capabilities for large language models (LLMs), yet most existing benchmarks evaluate them in isolation and under limited forms of complexity. We introduce TCP, a new benchmark that jointly assesses both abilities by framing each instance as a temporal constraint-based planning problem. Each instance features a naturalistic dialogue around a collaborative project, where diverse and interdependent temporal constraints are explicitly or implicitly expressed. The model must infer an optimal schedule that satisfies all constraints. To construct TCP, we first generate abstract problem prototypes using a Python script that samples from predefined constraint templates and performs exhaustive search to ensure logical consistency. These prototypes are then paired with realistic scenarios from various domains and enriched into full data instances using an LLM. A human quality check is finally performed on a sampled subset to confirm the reliability of our benchmark. We evaluate state-of-the-art LLMs and find that even the strongest models struggle with TCP, highlighting its difficulty and revealing limitations in LLMs’ temporal constraint-based planning abilities. We analyze underlying failure cases and hope our findings can inspire future research.
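The prototype-generation step described above (sampling from constraint templates, then exhaustively searching for a consistent, optimal schedule) can be illustrated with a minimal sketch. This is not the authors' script: the three templates (`before`, `deadline`, `not_before`), the integer-day time model, the `horizon` parameter, and the choice of minimum makespan as the notion of "optimal" are all assumptions made for illustration.

```python
import itertools
import random

# Hypothetical constraint templates (assumed, not taken from the paper).
# Each returns a predicate over a schedule: a dict of task name -> start day.
def before(a, b, durations):
    # Task a must finish no later than task b starts.
    return lambda s: s[a] + durations[a] <= s[b]

def deadline(a, day, durations):
    # Task a must finish by the given day.
    return lambda s: s[a] + durations[a] <= day

def not_before(a, day, durations):
    # Task a may not start earlier than the given day.
    return lambda s: s[a] >= day

def sample_prototype(tasks, durations, n_constraints, horizon, rng):
    """Draw n_constraints random constraints from the templates above."""
    constraints = []
    for _ in range(n_constraints):
        kind = rng.choice(["before", "deadline", "not_before"])
        if kind == "before":
            a, b = rng.sample(tasks, 2)
            constraints.append(before(a, b, durations))
        elif kind == "deadline":
            day = rng.randint(1, horizon)
            constraints.append(deadline(rng.choice(tasks), day, durations))
        else:
            day = rng.randint(0, horizon - 1)
            constraints.append(not_before(rng.choice(tasks), day, durations))
    return constraints

def solve(tasks, durations, constraints, horizon):
    """Enumerate every assignment of start days in [0, horizon).

    Returns (makespan, schedule) for a feasible schedule with the smallest
    makespan, or None when the constraint set is inconsistent.
    """
    best = None
    for starts in itertools.product(range(horizon), repeat=len(tasks)):
        schedule = dict(zip(tasks, starts))
        if all(check(schedule) for check in constraints):
            makespan = max(schedule[t] + durations[t] for t in tasks)
            if best is None or makespan < best[0]:
                best = (makespan, schedule)
    return best

if __name__ == "__main__":
    rng = random.Random(0)
    tasks = ["design", "build", "test"]
    durations = {"design": 2, "build": 3, "test": 1}
    # Resample until the exhaustive search confirms the prototype is solvable.
    while True:
        constraints = sample_prototype(tasks, durations, 3, horizon=8, rng=rng)
        result = solve(tasks, durations, constraints, horizon=8)
        if result is not None:
            print(result)
            break
```

Exhaustive enumeration grows exponentially with the number of tasks, so a sketch like this is only practical for small prototypes; its appeal is that a returned `None` is a proof of inconsistency rather than a search failure.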


