Singapore

Reward models (RMs) are a core component in the post-training of large language models (LLMs), serving as proxies for human preference evaluation and guiding model alignment. However, training reliable RMs under limited resources remains challenging due to the reliance on large-scale preference annotations and the high cost of fine-tuning LLMs. To address this, we propose SparseRM, which leverages a Sparse Autoencoder (SAE) to extract preference-relevant information encoded in model representations, enabling the construction of a lightweight and interpretable reward model. SparseRM first employs an SAE to decompose LLM representations into interpretable directions that capture preference-relevant features. The representations are then projected onto these directions to compute alignment scores, which quantify the strength of each preference feature in the representations. A simple reward head aggregates these scores to predict preference scores. Experiments on three preference modeling tasks show that SparseRM achieves superior performance over most mainstream RMs while using less than 1% of trainable parameters. Moreover, it integrates seamlessly into downstream alignment pipelines, highlighting its potential for efficient alignment.
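The scoring path described in this abstract (project a representation onto preference-relevant directions, then aggregate the alignment scores with a small reward head) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `alignment_scores` and `reward`, the toy directions, and the linear head weights are all hypothetical, and a real SAE would supply the directions.

```python
from typing import List, Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def alignment_scores(representation: Sequence[float],
                     directions: Sequence[Sequence[float]]) -> List[float]:
    """Project one representation onto each direction (one score per direction)."""
    return [dot(representation, d) for d in directions]


def reward(representation: Sequence[float],
           directions: Sequence[Sequence[float]],
           head_weights: Sequence[float],
           head_bias: float = 0.0) -> float:
    """Aggregate alignment scores with a linear head (an assumed head form)."""
    scores = alignment_scores(representation, directions)
    return dot(scores, head_weights) + head_bias


# Toy values, chosen only for illustration (not taken from the paper).
directions = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # stand-ins for SAE directions
head_weights = [0.5, -1.0]                        # the only "trainable" part here
chosen = [2.0, 0.5, 3.0]
rejected = [0.5, 2.0, 3.0]

chosen_reward = reward(chosen, directions, head_weights)
rejected_reward = reward(rejected, directions, head_weights)
```

In this sketch only `head_weights` and `head_bias` would be trained, which is one way to read the abstract's claim about a small trainable-parameter budget.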

AAAI 2026

SparseRM: A Lightweight Preference Modeling with Sparse Autoencoder

sparse autoencoder

alignment

reward model

poster

We are pleased to announce the Fortieth AAAI Conference on Artificial Intelligence (AAAI-26), which will be held in Singapore EXPO from January 20 to January 27, 2026.

The purpose of the AAAI conference series is to promote research in Artificial Intelligence (AI) and foster scientific exchange between researchers, practitioners, scientists, students, and engineers across the entirety of AI and its affiliated disciplines. AAAI-26 will feature technical paper presentations, special tracks, invited speakers, workshops, tutorials, poster sessions, senior member presentations, competitions, exhibit programs, and a range of other activities to be announced.

Modern gaze estimation models can accurately predict human gaze from facial images. However, due to privacy concerns and intricate data collection procedures, gaze estimation datasets are typically smaller and less diverse than those for other vision tasks, which directly leads to poor generalization in gaze estimation models. Common solutions, such as domain adaptation models, require additional domain-specific data, yet such data is often difficult to obtain due to privacy restrictions. Meanwhile, domain generalization models suffer from limited performance due to insufficient training data. To address these two fundamental challenges, privacy and data diversity, we explore privacy-preserving gaze data generation schemes and propose a novel data-driven generalization solution. Specifically, we develop two diffusion-based generative models, DDPM-Gaze and LDM-Gaze, for synthesizing gaze data. We demonstrate that synthetic data can significantly improve generalization performance when simply used with fine-tuning-based methods. Furthermore, we introduce the Domain Stability Adaptation (DSA) framework, a simple yet effective domain generalization approach that enhances model robustness by increasing the domain uncertainty of input samples while reducing prediction uncertainty. Extensive experiments validate the effectiveness of our synthetic data and demonstrate the superiority of our data-driven generalization solution.

Towards Privacy-Protected Generalized Gaze Estimation Using Diffusion Models and Domain Stability Adaptation Framework

Spiking Neural Networks (SNNs) have become popular due to their excellent energy efficiency, yet they remain difficult to train effectively. Recent works address this by introducing knowledge distillation (KD) techniques, with pre-trained artificial neural networks (ANNs) used as teachers and the target SNNs as students. This is commonly accomplished through a straightforward element-wise alignment of intermediate features and prediction logits from ANNs and SNNs, often neglecting the intrinsic differences between their architectures. Specifically, an ANN's outputs exhibit a continuous distribution, whereas an SNN's outputs are characterized by sparsity and discreteness. To mitigate this issue, we introduce two innovative KD strategies. Firstly, we propose the Saliency-scaled Activation Map Distillation (SAMD), which aligns the spike activation map of the student SNN with the class-aware activation map of the teacher ANN. Rather than performing KD directly on the raw features of ANN and SNN, our SAMD directs the student to learn from saliency activation maps that exhibit greater semantic and distribution consistency. Additionally, we propose a Noise-smoothed Logits Distillation (NLD), which utilizes Gaussian noise to smooth the sparse logits of the student SNN, facilitating the alignment with continuous logits from the teacher ANN. Extensive experiments on multiple datasets demonstrate the effectiveness of our methods, particularly on CIFAR100, where CKDSNN achieves an accuracy of 79.11% with just one time step, surpassing the previous best method by 2%.
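The idea behind NLD, smoothing sparse student logits with Gaussian noise before aligning them with continuous teacher logits, can be illustrated with a small sketch. The abstract does not give the loss form, so everything concrete here is an assumption: the use of a KL divergence between softmax distributions, the noise scale `sigma`, and the function names are hypothetical choices for illustration only.

```python
import math
import random
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def smooth_logits(logits: Sequence[float], sigma: float,
                  rng: random.Random) -> List[float]:
    """Add zero-mean Gaussian noise with standard deviation sigma to each logit."""
    return [x + rng.gauss(0.0, sigma) for x in logits]


def kl_divergence(p: Sequence[float], q: Sequence[float],
                  eps: float = 1e-12) -> float:
    """KL(p || q), with a small eps to avoid log(0)."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def noise_smoothed_kd_loss(student_logits: Sequence[float],
                           teacher_logits: Sequence[float],
                           sigma: float, rng: random.Random) -> float:
    """Assumed loss: KL between teacher softmax and noise-smoothed student softmax."""
    noisy_student = smooth_logits(student_logits, sigma, rng)
    return kl_divergence(softmax(teacher_logits), softmax(noisy_student))


rng = random.Random(0)
# Sparse, discrete-looking student logits vs. continuous teacher logits (toy values).
loss = noise_smoothed_kd_loss([4.0, 0.0, 0.0], [2.1, 0.3, -0.5], sigma=0.5, rng=rng)
```

With `sigma=0.0` the function reduces to an ordinary logit-distillation loss, which makes the effect of the noise term easy to ablate in a toy setting.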

A Closer Look at Knowledge Distillation in Spiking Neural Network Training

To identify objects beyond predefined categories, open-vocabulary aerial object detection (OVAD) leverages the zero-shot capabilities of visual-language models (VLMs) to generalize from base to novel categories. Existing approaches typically utilize self-learning mechanisms with weak text supervision to generate region-level pseudo-labels that align detectors with the VLMs' semantic spaces. However, text dependence induces semantic bias, restricting open-vocabulary expansion to text-specified concepts. We propose VK-Det, a Visual Knowledge-guided open-vocabulary object Detection framework without extra supervision. First, we discover and leverage the vision encoder's inherent informative region perception to attain fine-grained localization and adaptive distillation. Second, we introduce a novel prototype-aware pseudo-labeling strategy. It models inter-class decision boundaries through feature clustering and maps detection regions to latent categories via prototype matching. This enhances attention to novel objects while compensating for missing supervision. Extensive experiments show state-of-the-art performance, achieving 30.1 $\mathrm{mAP}^{N}$ on DIOR and 23.3 $\mathrm{mAP}^{N}$ on DOTA, outperforming even methods that use extra supervision.
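As a rough illustration of mapping a detection region to a latent category via prototype matching, the sketch below builds each prototype as the mean of a cluster's features and assigns a region to the most similar prototype. The cosine similarity, the `min_similarity` threshold, and all names are assumptions for the example; the abstract does not specify how VK-Det computes or matches prototypes.

```python
import math
from typing import Dict, List, Optional, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def mean_prototype(features: Sequence[Sequence[float]]) -> List[float]:
    """Prototype of a cluster, taken here as the per-dimension mean."""
    n = len(features)
    return [sum(column) / n for column in zip(*features)]


def match_prototype(region_feature: Sequence[float],
                    prototypes: Dict[str, Sequence[float]],
                    min_similarity: float = 0.5) -> Optional[str]:
    """Return the id of the most similar prototype, or None if none is close enough."""
    best_id, best_sim = None, min_similarity
    for proto_id, proto in prototypes.items():
        sim = cosine(region_feature, proto)
        if sim >= best_sim:
            best_id, best_sim = proto_id, sim
    return best_id


# Toy clusters standing in for the output of feature clustering (hypothetical data).
prototypes = {
    "cluster_0": mean_prototype([[1.0, 0.0], [0.8, 0.2]]),
    "cluster_1": mean_prototype([[0.0, 1.0], [0.1, 0.9]]),
}
pseudo_label = match_prototype([0.9, 0.1], prototypes)
```

Regions that match no prototype return `None`, so a pipeline built on this sketch could simply skip them rather than force a pseudo-label.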

VK-Det: Visual Knowledge Guided Prototype Learning for Open-Vocabulary Aerial Object Detection

Membership Inference Attack (MIA) aims to determine whether a data sample was used in the training dataset of a target model. Traditional MIA obtains features of the target model via shadow models and uses these features to train an attack model, but the scale and complexity of the training or fine-tuning data for large language model (LLM)-based recommendation systems make shadow models difficult to construct. Knowledge distillation, as a method for extracting knowledge, helps construct a stronger reference model. It enables separate distillation for member and non-member data, enhancing the model's ability to discriminate between the two in MIA. This paper proposes a knowledge distillation-based MIA paradigm to improve the performance of membership inference attacks on LLM-based recommendation systems. Our paradigm introduces knowledge distillation to obtain a reference model with an enhanced ability to distinguish between member and non-member data. We obtain individual features from the reference model and train our attack model with the fused features. Our paradigm improves attack performance compared to shadow model-based attacks.

Membership Inference Attack Against Large Language Model-Based Recommendation Systems: A New Distillation-Based Paradigm

Large Language Models (LLMs) often falter at complex planning tasks that require exploration and self-correction, as their linear reasoning process struggles to recover from early mistakes. While search algorithms like Monte Carlo Tree Search (MCTS) can explore alternatives, they are often ineffective when guided by sparse rewards and fail to leverage the rich semantic capabilities of LLMs. We introduce SPIRAL (Symbolic LLM Planning via Grounded and Reflective Search), a novel framework that embeds a cognitive architecture of three specialized LLM agents into an MCTS loop. SPIRAL's key contribution is its integrated planning pipeline where a Planner proposes creative next steps, a Simulator grounds the search by predicting realistic outcomes, and a Critic provides dense reward signals through reflection. This synergy transforms MCTS from a brute-force search into a guided, self-correcting reasoning process. On the DailyLifeAPIs and HuggingFace datasets, SPIRAL consistently outperforms the default Chain-of-Thought planning method and substantially surpasses other state-of-the-art agents; for example, SPIRAL achieves 83.6% overall accuracy on DailyLifeAPIs, an improvement of over 16 percentage points against the next-best search framework, while also demonstrating superior token efficiency. Our work demonstrates that structuring LLM reasoning as a guided, reflective, and grounded search process yields more robust and efficient autonomous planners. The source code for all experiments is available in the supplemental materials for reproducibility.

SPIRAL: Symbolic LLM Planning via Grounded and Reflective Search

Unlike traditional object detection, moving infrared small target detection is highly challenging due to tiny target sizes and limited labeled samples. Most existing methods focus on pure-vision features, usually learned in a fully-supervised manner, and rely heavily on extensive, high-cost manual annotations. Moreover, they have largely overlooked the potential of multi-modal (e.g., vision and text) learning. To address these issues, inspired by prevalent vision-language models, we propose the first semi-supervised vision-language (SeViL) framework with adaptive text prompt guiding. Going beyond the traditional pure-vision modality, it takes text prompts as prior knowledge to adaptively enhance target regions and then filter the low-quality pseudo-labels generated on unlabeled data. Meanwhile, we employ an adaptive cross-modal masking strategy to align text and vision features, promoting deep cross-modal interactions. Our extensive experiments on three public datasets (DAUB, ITSDT-15K and IRDST) verify that our new scheme outperforms other semi-supervised ones, and even achieves performance comparable to fully-supervised state-of-the-art (SOTA) methods, with only 10% labeled training samples. Source code will be made publicly available after acceptance.

SeViL: Semi-supervised Vision-Language Learning with Text Prompt Guiding for Moving Infrared Small Target Detection

High-quality datasets are critical for training reliable machine learning models, yet data faults caused by insufficient annotation expertise or malicious poisoning attacks remain prevalent. Traditional classifier-based methods rely on manually curated subsets for fault detection, but their limited scale frequently leads to model overfitting. While methods based on multimodal large language models (MLLMs) offer promising detection capabilities, their few-shot learning limitations hinder generalization in domain-specific tasks. To address these challenges, we propose MLLM Guided Iterative Sample Filtering (MISF), a novel framework that combines the strengths of MLLM-based initialization and iterative data refinement. Our framework initializes the detection model with MLLM-generated synthetic images and a curated clean subset, then iteratively refines it by progressively selecting high-certainty clean samples, improving both domain adaptation and detection accuracy. Extensive experiments on the RESISC45 and Oxford-IIIT Pets datasets demonstrate that MISF effectively identifies data faults, outperforming existing approaches. MISF provides a robust, scalable solution for improving dataset quality in specialized domains.
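The iterative refinement loop described in this abstract (fit a detector on a clean subset, promote the samples it is most certain are clean, refit, repeat) can be sketched generically. This is a simplified illustration, not MISF itself: the `fit` and `confidence` callables, the fixed `threshold`, and the one-dimensional toy data are all hypothetical, and the MLLM-based initialization step is not modeled.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

Sample = TypeVar("Sample")
Model = TypeVar("Model")


def iterative_filter(clean_seed: Sequence[Sample],
                     pool: Sequence[Sample],
                     fit: Callable[[List[Sample]], Model],
                     confidence: Callable[[Model, Sample], float],
                     threshold: float = 0.9,
                     max_rounds: int = 5) -> Tuple[List[Sample], List[Sample]]:
    """Grow the clean set by repeatedly accepting high-confidence samples."""
    clean = list(clean_seed)
    remaining = list(pool)
    for _ in range(max_rounds):
        model = fit(clean)  # refit the detector on the current clean set
        accepted = [s for s in remaining if confidence(model, s) >= threshold]
        if not accepted:    # nothing new is certain enough: stop early
            break
        remaining = [s for s in remaining if confidence(model, s) < threshold]
        clean.extend(accepted)
    return clean, remaining


# Toy stand-ins (assumptions): the "model" is a mean, confidence decays with distance.
def fit_mean(samples: List[float]) -> float:
    return sum(samples) / len(samples)


def closeness(model: float, sample: float) -> float:
    return 1.0 / (1.0 + abs(sample - model))


clean, suspected_faults = iterative_filter(
    clean_seed=[1.0, 1.2], pool=[1.1, 1.5, 9.0],
    fit=fit_mean, confidence=closeness, threshold=0.7)
```

In this toy run the outlier `9.0` is never promoted to the clean set, so it ends up in `suspected_faults`.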

MISF: MLLM Guided Iterative Sample Filtering for Data Fault Detection

Appearance-based gaze estimation, which aims to predict accurate 3D gaze direction from a single facial image, has made promising progress in recent years. However, most methods suffer from heavy performance degradation in cross-domain evaluation due to interference from gaze-irrelevant factors, such as expressions, wearables, and image quality. To alleviate this problem, we present a novel Hybrid-domain Adaptative Representation Learning (HARL) framework that exploits multi-source hybrid datasets to learn robust gaze representations. More specifically, we propose to disentangle gaze-relevant representations from low-quality facial images by aligning them with features extracted from high-quality near-eye images in an unsupervised domain-adaptation manner, which adds almost no computational or inference cost. Additionally, we analyze the effect of head pose and design a simple yet efficient sparse graph fusion module to explore the inner geometric constraint between gaze direction and head pose, which leads to dense and robust gaze representations. Extensive experiments on the EyeDiap, MPIIFaceGaze, and Gaze360 datasets demonstrate that our approach achieves state-of-the-art accuracy of 5.02°, 3.36°, and 9.26°, respectively, and delivers competitive performance in cross-dataset evaluation.

Hybrid-Domain Adaptative Representation Learning for Gaze Estimation

Sparse-view 3D Gaussian splatting seeks to render high-quality novel views of 3D scenes from a limited set of input images. While recent pose-free feed-forward methods leveraging pre-trained 3D priors have achieved impressive results, most of them rely on full fine-tuning of large Vision Transformer (ViT) backbones and incur substantial GPU costs. In this work, we introduce MuSASplat, a novel framework that dramatically reduces the computational burden of training pose-free feed-forward 3D Gaussian splatting models with little compromise in rendering quality. Central to our approach is a lightweight Multi-Scale Adapter that enables efficient fine-tuning of ViT-based architectures with only a small fraction of training parameters. This design avoids the prohibitive GPU overhead associated with previous full-model adaptation techniques while maintaining high fidelity in novel view synthesis, even with very sparse input views. In addition, we introduce a Feature Fusion Aggregator that integrates features across input views effectively and efficiently. Unlike widely adopted memory banks, the Feature Fusion Aggregator ensures consistent geometric integration across input views while significantly reducing memory usage, training complexity, and computational cost. Extensive experiments across diverse datasets show that MuSASplat achieves state-of-the-art rendering quality while significantly reducing parameter counts and training resource requirements compared with existing methods.

MuSASplat: Efficient Sparse-View 3D Gaussian Splats via Lightweight Multi-Scale Adaptation

Understanding multi-page documents poses a significant challenge for multimodal large language models (MLLMs), as it requires fine-grained visual comprehension and multi-hop reasoning across pages. While prior work has explored reinforcement learning (RL) for enhancing advanced reasoning in MLLMs, its application to multi-page document understanding remains underexplored. In this paper, we introduce DocR1, an MLLM trained with a novel RL framework, Evidence Page-Guided GRPO (EviGRPO). EviGRPO incorporates an evidence-aware reward mechanism that promotes a coarse-to-fine reasoning strategy, guiding the model to first retrieve relevant pages before generating answers. To support this, we design a rigorous two-stage annotation pipeline and a curriculum learning strategy that enables effective training with limited supervision. Using this pipeline, we construct two datasets: EviBench, a high-quality training set with 4.8k examples, and ArxivFullQA, a benchmark with 8.6k QA examples over full scientific papers. Extensive experiments across a wide range of benchmarks demonstrate that DocR1 achieves state-of-the-art performance on multi-page tasks while maintaining strong results on single-page benchmarks.
