EMNLP 2025

November 07, 2025

Suzhou, China

Would you like to see your presentation here, made available to a global audience of researchers?
Add your own presentation or have us affordably record your next conference.

Large language models (LLMs) achieve impressive results on advanced mathematics benchmarks but sometimes fail on basic arithmetic tasks, raising the question of whether they have truly grasped fundamental arithmetic rules or are merely relying on pattern matching. To unravel this issue, we systematically probe LLMs’ understanding of two-integer addition (0 to 2⁶⁴) by testing three crucial properties: commutativity (A+B=B+A), representation invariance via symbolic remapping (e.g., 7mapsto Y), and consistent accuracy scaling with operand length. Our evaluation of 12 leading LLMs reveals a stark disconnect: while models achieve high numeric accuracy (73.8–99.8\%), they systematically fail these diagnostics. Specifically, accuracy plummets to le 7.5\% with symbolic inputs, commutativity is violated in up to 20\% of cases, and accuracy scaling is non-monotonic. Interventions further expose this pattern-matching reliance: explicitly providing rules degrades performance by 29.49\%, while prompting for explanations before answering merely maintains baseline accuracy. These findings demonstrate that current LLMs address elementary addition via pattern matching, not robust rule induction, motivating new diagnostic benchmarks and innovations in model architecture and training to cultivate genuine mathematical reasoning.

Downloads

SlidesPaperTranscript English (automatic)

Next from EMNLP 2025

The Impact of Negated Text on Hallucination with Large Language Models
poster

The Impact of Negated Text on Hallucination with Large Language Models

EMNLP 2025

Jaehyung SeoHeuiseok Lim
Heuiseok Lim and 2 other authors

07 November 2025

Stay up to date with the latest Underline news!

Select topic of interest (you can select more than one)

PRESENTATIONS

  • All Presentations
  • For Librarians
  • Resource Center
  • Free Trial
Underline Science, Inc.
1216 Broadway, 2nd Floor, New York, NY 10001, USA

© 2025 Underline - All rights reserved