Specializing Large Language Models for educational domains is a key frontier in creating personalized learning tools. The central challenge is not data scarcity but its abundance: efficiently selecting a curated subset from vast corpora that enhances specialized skills and fosters generalization, without degrading existing abilities. Existing data selection paradigms, which rely on superficial semantic similarity or model training dynamics, often lack a principled framework for identifying data that promotes true cognitive growth. Our work proposes a paradigm shift from such indirect proxies of learning value toward a framework that directly models the learner's cognitive state. We introduce CASS, a novel framework that implements this cognitive approach through a clear pipeline, moving from an initial diagnosis to the ultimate goal of expanding the model's cognitive frontier. First, CASS diagnoses the LLM's cognitive frontier using Multidimensional Item Response Theory. Leveraging this diagnosis, it then employs Fisher Information to select a data subset situated at the LLM's cognitive frontier that offers maximum informational gain. Finally, the model is fine-tuned on this curated data using a structured, easy-to-hard curriculum to ensure effective learning. Experiments on our new multi-subject dataset show that models trained with CASS not only achieve superior accuracy in the target domain but also exhibit enhanced generalization. CASS provides a more efficient, effective, and theoretically grounded paradigm for building expert educational LLMs.
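The selection principle behind the pipeline can be illustrated with a minimal sketch. The abstract describes a Multidimensional IRT diagnosis; the sketch below uses the simpler unidimensional 2PL model, where the Fisher information of an item at ability theta is a^2 * P * (1 - P), so the most informative items are those whose difficulty sits near the learner's estimated ability (its "frontier"). All function names and parameters here are illustrative assumptions, not CASS's actual implementation.

```python
import math

def p_correct(theta, a, b):
    # 2PL IRT: probability that a learner with ability `theta` answers
    # correctly an item with discrimination `a` and difficulty `b`
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def fisher_information(theta, a, b):
    # For the 2PL model, I(theta) = a^2 * P * (1 - P);
    # it peaks when item difficulty b equals the ability theta
    p = p_correct(theta, a, b)
    return a * a * p * (1.0 - p)

def select_frontier_items(items, theta, k):
    # Keep the k items carrying the most information at the diagnosed
    # ability, then order them easy-to-hard (by difficulty b) to mimic
    # a curriculum-style fine-tuning schedule
    ranked = sorted(
        items,
        key=lambda it: fisher_information(theta, it["a"], it["b"]),
        reverse=True,
    )
    return sorted(ranked[:k], key=lambda it: it["b"])
```

For example, with a diagnosed ability of theta = 0, an item of difficulty b = 0.1 carries far more information than one of difficulty b = 3, so it is preferred; the selected subset is then replayed in ascending difficulty. The multidimensional case replaces the scalar information with a Fisher information matrix, but the selection logic is analogous.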
