AAAI 2026

January 25, 2026

Singapore, Singapore


Large Language Models (LLMs) have demonstrated remarkable capabilities across a wide range of tasks through the \textbf{example-driven learning paradigm}. However, in high-stakes domains such as emergency response or industrial safety, real incidents are scarce, confidential, or both, while concise \emph{rule books} are plentiful. We formalize this underexplored setting as \textbf{rule knowledge-driven reasoning} and ask: \emph{Can an LLM reason reliably when rules are abundant but examples are almost nil?} To answer this question we introduce \textbf{RULER}, a fully automatic benchmark that derives 32K rigorously verified questions from 1K expert-curated emergency-response rules to probe three core abilities (\emph{rule memorization}, \emph{single-rule application}, and \emph{multi-rule complex reasoning}), supported by a hallucination-aware evaluation suite and novel relational metrics. A comprehensive empirical study of five open-source LLMs and five enhancement strategies shows that, while models perform reliably on rule memorization and single-rule application, multi-rule complex reasoning plateaus at 5.4 on a 10-point scale. We bridge this gap with \textbf{RAMPS}, a \textbf{R}ule-knowledge-\textbf{A}ware \textbf{M}onte-Carlo-tree-search \textbf{P}rocess-reward \textbf{S}upervision framework. RAMPS injects rule-knowledge priors into MCTS, distills 12K step-level traces without human annotation, and trains an advantage-based reward model that scores candidate reasoning paths during beam-search inference. Experiments show a notable improvement in complex reasoning, raising the score to 7.7 (+2.3). Together, RULER and RAMPS provide an automatic benchmark and a strong baseline suite for rule knowledge-driven reasoning in LLMs.
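The inference-time component described above (a step-level reward model scoring candidate reasoning paths during beam search) can be sketched as follows. This is a minimal illustrative sketch, not the RAMPS implementation: the names `expand_fn`, `reward_model`, and `beam_search_with_prm` are hypothetical, and the toy expansion and reward functions stand in for LLM step generation and the trained process-reward model.

```python
def beam_search_with_prm(root, expand_fn, reward_model, beam_width=2, depth=3):
    """Reward-guided beam search over reasoning steps (illustrative sketch).

    At each level, every partial path in the beam is expanded into
    candidate next steps, each full candidate path is scored by the
    process-reward model, and only the top `beam_width` paths survive.
    """
    beam = [[root]]
    for _ in range(depth):
        candidates = []
        for path in beam:
            for step in expand_fn(path):  # stands in for LLM step proposals
                candidates.append(path + [step])
        if not candidates:
            break
        # Score each candidate path with the step-level reward model
        candidates.sort(key=reward_model, reverse=True)
        beam = candidates[:beam_width]
    return beam[0]  # highest-scoring reasoning path


if __name__ == "__main__":
    # Toy usage: steps are integers; the "reward model" prefers larger sums.
    expand_fn = lambda path: [path[-1] + 1, path[-1] + 2]
    reward_model = lambda path: sum(path)
    best = beam_search_with_prm(0, expand_fn, reward_model, beam_width=2, depth=3)
    print(best)  # → [0, 2, 4, 6]
```

In practice the reward model would be the trained advantage-based scorer and the expansion function would sample next reasoning steps from the LLM; the search skeleton is unchanged.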

