Singapore

Machine unlearning (MU) aims to remove the influence of specific data from trained models, addressing privacy concerns and ensuring compliance with regulations such as the "right to be forgotten." Evaluating strong unlearning, where the unlearned model is indistinguishable from one retrained without the forgetting data, remains a significant challenge in deep neural networks (DNNs). Common black-box metrics, such as variants of membership inference attacks and accuracy comparisons, primarily assess model outputs but often fail to capture residual information in intermediate layers. To bridge this gap, we introduce the Information Difference Index (IDI), a novel white-box metric inspired by information theory. IDI quantifies retained information in intermediate features by measuring mutual information between those features and the labels to be forgotten, offering a more comprehensive assessment of unlearning efficacy. Our experiments demonstrate that IDI effectively measures the degree of unlearning across various datasets and architectures, providing a reliable tool for evaluating strong unlearning in DNNs.
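The abstract does not give IDI's formula or the estimator it uses, so the following is only a minimal sketch of the underlying quantity: a plug-in estimate of the mutual information between a discretized feature and a label. The function name `mutual_information` and the toy data are assumptions made for illustration; a real evaluation on continuous DNN features would need a different estimator.

```python
import math
from collections import Counter


def mutual_information(features, labels):
    """Plug-in estimate of I(Z; Y) in bits for two equal-length discrete sequences."""
    if len(features) != len(labels):
        raise ValueError("features and labels must have the same length")
    n = len(features)
    joint = Counter(zip(features, labels))  # counts of (z, y) pairs
    pz = Counter(features)                  # marginal counts of z
    py = Counter(labels)                    # marginal counts of y
    mi = 0.0
    for (z, y), c in joint.items():
        # p(z, y) * log2( p(z, y) / (p(z) * p(y)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (pz[z] * py[y]))
    return mi


# Toy data (hypothetical): label 1 marks samples from the class to be forgotten.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
leaky_features = [0, 0, 0, 0, 1, 1, 1, 1]     # feature fully reveals the label
scrubbed_features = [0, 1, 0, 1, 0, 1, 0, 1]  # feature is independent of the label

mi_leaky = mutual_information(leaky_features, labels)
mi_scrubbed = mutual_information(scrubbed_features, labels)
print(mi_leaky, mi_scrubbed)
```

In this toy setting the leaky feature carries one full bit about the label and the scrubbed feature carries none, which is the kind of gap a white-box metric can expose even when model outputs look alike.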

AAAI 2026

An Information Theoretic Evaluation Metric for Strong Unlearning

ml: information theory

machine unlearning

unlearning

ml: privacy

ml: classification and regression

privacy

poster

We are pleased to announce the Fortieth AAAI Conference on Artificial Intelligence (AAAI-26), which will be held in Singapore EXPO from January 20 to January 27, 2026.

The purpose of the AAAI conference series is to promote research in Artificial Intelligence (AI) and foster scientific exchange between researchers, practitioners, scientists, students, and engineers across the entirety of AI and its affiliated disciplines. AAAI-26 will feature technical paper presentations, special tracks, invited speakers, workshops, tutorials, poster sessions, senior member presentations, competitions, exhibit programs, and a range of other activities to be announced.

We recommend reading [**the registration information**](https://aaai.org/conference/aaai/aaai-26/registration/) first.

**Online Registration Form**: https://aaai.getregistered.net/conference-2026 

Tree search-based methods have made significant progress in enhancing the code generation capabilities of large language models. However, because intermediate algorithmic steps are difficult to evaluate and erroneous steps cannot be located and corrected promptly, these methods often generate incorrect code and incur increased computational costs. To tackle these problems, we propose RPM-MCTS, an effective method that utilizes Knowledge-Retrieval as Process Reward Model based on Monte Carlo Tree Search to evaluate intermediate algorithmic steps. By utilizing knowledge base retrieval probabilities, RPM-MCTS avoids the complex process of training process reward models. During the expansion phase, similarity filtering is employed to remove redundant nodes, ensuring diversity in reasoning paths. Furthermore, our method utilizes sandbox execution feedback to locate erroneous algorithmic steps during generation, enabling timely and targeted corrections. Extensive experiments on four public code generation benchmarks demonstrate that RPM-MCTS outperforms current state-of-the-art methods while achieving an approximately 15% reduction in token consumption. Furthermore, full fine-tuning of the base model on data constructed by RPM-MCTS significantly enhances its code capabilities.
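The abstract does not say which similarity measure or threshold the expansion-phase filter uses. As a rough sketch of the idea only, the snippet below drops candidate steps that are too similar to one already kept, using word-level Jaccard similarity and a threshold of 0.8; both choices, and the names `jaccard` and `filter_redundant`, are assumptions for illustration.

```python
def jaccard(a, b):
    """Jaccard similarity between two strings, compared as sets of lowercase words."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa and not sb:
        return 1.0  # two empty strings are treated as identical
    return len(sa & sb) / len(sa | sb)


def filter_redundant(candidates, threshold=0.8):
    """Keep candidates in order, dropping any that is too similar to one already kept."""
    kept = []
    for cand in candidates:
        if all(jaccard(cand, k) < threshold for k in kept):
            kept.append(cand)
    return kept


# Hypothetical candidate steps proposed when expanding one tree node.
steps = [
    "sort the array then scan for adjacent duplicates",
    "sort the array then scan for adjacent duplicates quickly",
    "use a hash set to track seen values",
]
unique_steps = filter_redundant(steps)
print(unique_steps)
```

Here the second candidate is a near-duplicate of the first and is dropped, while the hash-set approach survives as a genuinely different reasoning path.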

RPM-MCTS: Knowledge-Retrieval as Process Reward Model with Monte Carlo Tree Search for Code Generation

Efficient Multimodal Large Language Models (MLLMs) compress vision tokens to reduce resource consumption, but the loss of visual information can degrade comprehension capabilities.
While Knowledge Distillation could enhance student models through teacher guidance, existing methods overlook the fundamental differences in fine-grained vision comprehension caused by unbalanced vision tokens.
In this paper, we propose EM-KD, a novel paradigm that enhances the Efficient MLLM with Knowledge Distillation.
Firstly, we calculate the Manhattan distance between the vision logits of teacher and student, and align them in the spatial dimension with the Hungarian algorithm to solve the imbalance issue.
After alignment, EM-KD introduces two key designs: 1) Vision-Language Affinity Distillation and 2) Vision-Semantic Distillation.
Specifically, we calculate the affinity matrix between text tokens and aligned vision tokens, and minimize the smooth L1 distance of the student and the teacher affinity matrices.
Considering the semantic richness of vision logits in the final layer, we employ the reverse KL divergence to measure the discrete probability distributions of the aligned vision logits over the vocabulary space.
Comprehensive evaluation on diverse benchmarks demonstrates that the EM-KD-trained model outperforms prior Efficient MLLMs in both accuracy and efficiency, validating its effectiveness.
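The alignment step described above can be illustrated on a toy scale. This is a sketch under stated assumptions, not the authors' implementation: it uses small probability vectors in place of real vision logits, replaces the Hungarian algorithm with a brute-force search that finds the same minimum-cost matching (practical only for tiny inputs), and takes "reverse KL" to mean KL(student || teacher). The names `align` and `reverse_kl` are hypothetical.

```python
import math
from itertools import permutations


def manhattan(u, v):
    """L1 (Manhattan) distance between two equal-length vectors."""
    return sum(abs(a - b) for a, b in zip(u, v))


def align(student, teacher):
    """Match each student token to a distinct teacher token with minimum total L1 cost.

    Brute force over all assignments; it reaches the same optimum the Hungarian
    algorithm would, but its cost grows factorially with the number of tokens.
    """
    if len(student) > len(teacher):
        raise ValueError("expected at most as many student tokens as teacher tokens")
    cost = [[manhattan(s, t) for t in teacher] for s in student]
    best_cost, best = float("inf"), None
    for perm in permutations(range(len(teacher)), len(student)):
        total = sum(cost[i][j] for i, j in enumerate(perm))
        if total < best_cost:
            best_cost, best = total, perm
    return list(best), best_cost


def reverse_kl(student_probs, teacher_probs, eps=1e-12):
    """KL(student || teacher) for two discrete distributions (direction is an assumption)."""
    return sum(
        q * math.log((q + eps) / (p + eps))
        for q, p in zip(student_probs, teacher_probs)
        if q > 0
    )


# Toy per-token distributions over a 3-word vocabulary (hypothetical values).
teacher = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.2, 0.2, 0.6]]
student = [[0.2, 0.3, 0.5], [0.6, 0.3, 0.1]]

matching, total_cost = align(student, teacher)
loss = sum(reverse_kl(student[i], teacher[j]) for i, j in enumerate(matching)) / len(matching)
print(matching, total_cost, loss)
```

The student has fewer tokens than the teacher, so one teacher token is left unmatched; the distillation loss is then computed only over the matched pairs.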

EM-KD: Distilling Efficient Multimodal Large Language Model with Unbalanced Vision Tokens

Floor plan recognition requires accurate segmentation and classification of entrance doors, outer contours (walls and windows), and inner contours (various room types), despite strong spatial dependencies and large stylistic differences between datasets. To overcome these challenges, we propose FloorPlanFormer, a multi-task learning network divided into three phases: the first phase introduces a Swin Transformer backbone with a pixel decoder to extract fine-grained pixel-level semantics; the second phase employs a prompt encoder and a mask decoder, together with a novel Global Contextual Attention Module (GCAM) designed to generate clear, high-quality outer contour masks; the third phase uses a mask transformer decoder to recognize targets and a Masked Feature Refinement Module (MFRM) to accurately delineate the inner contours by modeling the relationship between the local inner and outer contours. Finally, we construct FloorPlan8K, a dataset containing 8200 images and 77434 instances, on which our model was trained and evaluated; the results greatly outperform state-of-the-art general segmentation methods and specialized methods.

FloorPlanFormer: Multi-Task Transformer Network for Floor Plan Recognition with Outer-to-Inner Feature Refinement

Bird's Eye View (BEV) representation has become pivotal for autonomous driving, yet existing polar coordinate-based approaches face two critical limitations: (1) distant semantic misprojection caused by radial resolution decay, and (2) region-specific geometric distortions from non-uniform polar discretization. To address these issues, we propose a novel framework built on three key innovations. First, we present a bilateral heterogeneous network that constructs multi-granularity BEV spaces, efficiently exploiting dual-resolution visual information for distant detail preservation. Second, we employ an align-fusion strategy for multi-granularity feature aggregation. Specifically, the Mamba-Based Cross-Resolution Alignment module establishes semantic consistency for perspective features through shared state-space optimization. In the later stage, the Adaptive BEV Space Selector dynamically aggregates multi-granularity BEV features. Third, we introduce a Mixture of Radial-Angular Decoupled Experts, which employs polar-aware expert routing to disentangle radial compression and angular shear distortions through specialized geometric refinement. Comprehensive experiments on nuScenes and Lyft L5 demonstrate the state-of-the-art performance of our model across various resolution settings, visibility filtering, and perception ranges.

Seeing in Double: Dual-Granularity BEV Segmentation via Mamba-Driven Alignment and Polar-Decoupled Experts

Multimodal large language models (MLLMs) demonstrate strong capabilities in multimodal understanding, reasoning, and interaction but still face the fundamental limitation of hallucinations, where they generate erroneous or fabricated information. 
Most existing research induces hallucinations by manually perturbing visual or instruction inputs, then uses output differences or model-generated descriptions as references to mitigate hallucinations and improve response-visual consistency. However, these methods are constrained by model capabilities and prone to hallucination propagation.
We propose Visual Clue Guided Decoding (VCGD), a novel decoding strategy that introduces an auxiliary captioning model to generate precise visual clues during decoding for guiding model generation. It further incorporates image confidence constraints to suppress hallucination propagation during generation, thereby significantly improving content reliability and visual consistency. Specifically, VCGD leverages high-quality visual descriptions to guide MLLMs in correcting perceptual biases while generating answers. Furthermore, we introduce a Reinforcement Learning (RL)-based training paradigm for the Caption Model, in which a Reward Agent provides feedback on the quality of visual clues, further enhancing the accuracy of visual information.
Extensive experiments across multiple benchmark datasets and state-of-the-art MLLMs demonstrate that VCGD significantly reduces hallucination rates and substantially improves cross-modal consistency. Our method exhibits strong generalizability and scalability, offering an effective decoding enhancement strategy that can be seamlessly integrated into existing multimodal frameworks.

VCGD: Visual Clue Guided Decoding with Caption Model for Mitigating Hallucination in Multimodal Large Language Models

Detecting Schelling points (salient 3D mesh landmarks that serve as natural reference points for shape analysis) is a challenging problem in geometry processing. While existing CNN-based methods struggle with limited receptive fields and poor geometric context modeling, this paper proposes SchellingFormer, a novel Laplacian matrix-guided Geometric Transformer that effectively captures long-range dependencies and discriminative geometric features for robust Schelling point prediction. Our framework consists of two key components: (i) a hybrid geometric feature embedding module that integrates handcrafted descriptors (coordinates, Gaussian curvature, and curvature differences) to encode local geometry, and (ii) a Laplacian-driven vector attention mechanism, where spatial relationships encoded by the Laplacian matrix guide feature aggregation with the Transformer. This approach enables adaptive, geometry-aware message passing and contextual representation learning. Extensive experiments demonstrate that SchellingFormer outperforms state-of-the-art methods across multiple evaluation metrics. Our work bridges the gap between spectral mesh analysis and Transformer-based learning, offering a powerful tool for 3D shape understanding tasks such as shape matching and saliency detection.
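The abstract does not specify which mesh Laplacian guides the attention (cotangent-weighted Laplacians are common in geometry processing). As a minimal sketch of the general object, the snippet below builds the simpler combinatorial Laplacian L = D - A from a triangle list; the function name `graph_laplacian` and the two-triangle mesh are assumptions for illustration.

```python
def graph_laplacian(num_vertices, faces):
    """Combinatorial Laplacian L = D - A of the edge graph of a triangle mesh."""
    adj = [[0] * num_vertices for _ in range(num_vertices)]
    for a, b, c in faces:
        # Each triangle contributes its three undirected edges.
        for i, j in ((a, b), (b, c), (c, a)):
            adj[i][j] = adj[j][i] = 1
    lap = [[-adj[i][j] for j in range(num_vertices)] for i in range(num_vertices)]
    for i in range(num_vertices):
        lap[i][i] = sum(adj[i])  # vertex degree on the diagonal
    return lap


# Hypothetical mesh: a quad split into two triangles sharing the edge (0, 2).
faces = [(0, 1, 2), (0, 2, 3)]
L = graph_laplacian(4, faces)
for row in L:
    print(row)
```

Every row of this matrix sums to zero, and the off-diagonal entries mark which vertices are direct neighbors, which is the spatial relationship an attention mechanism could use to weight feature aggregation.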

SchellingFormer: Laplacian Matrix-guided Geometric Transformer for Robust Schelling Point Detection

Deep learning-based methods have achieved a breakthrough in image anomaly detection, but their complexity introduces a considerable challenge to understanding why an instance is predicted to be anomalous. We introduce a novel explanation method that generates multiple alternative modifications for each anomaly, capturing diverse concepts of anomalousness. Each modification is trained to be perceived as normal by the anomaly detector. The method provides a semantic explanation of the mechanism that triggered the detector, allowing users to explore "what-if scenarios." Qualitative and quantitative analyses across various image datasets demonstrate that applying this method to state-of-the-art detectors provides high-quality semantic explanations.

Reimagining Anomalies: What If Anomalies Were Normal?

We study how the design of testing institutions, encompassing both the tests themselves and the procedures used to administer them, shapes selection outcomes in environments with multiple criteria and strategic agents.
We model the testing agency as either a set of independent bureaucracies (each test administered separately) or a joint bureaucracy (where test order and personalization can be coordinated). Our mechanism design analysis shows that under a joint bureaucracy, fixed-order sequential mechanisms with stringent tests are optimal for maximizing the probability mass of qualified candidates selected. Furthermore, we demonstrate that personalizing tests through upfront communication, now increasingly feasible via AI and automation, can select all qualified candidates.
Finally, we compare institutional settings and quantify the value of controlling test order, showing that the benefit depends critically on the distribution of testees and the stringency of optimal tests. Our results contribute to the design of robust, efficient, and fair testing systems in both human and AI-mediated environments.

Testing Under Strategic Manipulation: Mechanism Design for Human and AI Institutions

Future superhuman models will surpass human abilities, and humans will only be able to weakly supervise them.
To alleviate the issue of lacking high-quality data for model alignment, some works on weak-to-strong generalization (W2SG) finetune a strong pretrained model with a weak supervisor so that it can generalize beyond weak supervision.
However, the invariable use of weak supervision in existing methods exposes issues in robustness, with a proportion of weak labels proving harmful to models.
In this paper, we propose a selective W2SG framework to avoid using weak supervision when unnecessary.
We train a binary classifier P(IK) to identify questions that a strong model can answer and use its self-generated labels for alignment.
We further refine weak labels with a graph smoothing method.
Extensive experiments on three benchmarks show that our method consistently outperforms competitive baselines.
Further analyses show that P(IK) can generalize across tasks and difficulties, which indicates selective W2SG can help superalignment.
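The selection rule described above can be sketched as follows. The abstract gives no decision threshold or data format, so the 0.5 cutoff, the function name `select_labels`, and the toy values are assumptions for illustration; the graph-smoothing refinement of weak labels is not shown.

```python
def select_labels(p_ik, strong_labels, weak_labels, threshold=0.5):
    """Use the strong model's own label where P(IK) is high, else the weak label.

    Returns the chosen labels and, for inspection, the source of each one.
    """
    if not (len(p_ik) == len(strong_labels) == len(weak_labels)):
        raise ValueError("all inputs must have the same length")
    chosen, sources = [], []
    for p, strong, weak in zip(p_ik, strong_labels, weak_labels):
        if p >= threshold:
            chosen.append(strong)   # the strong model likely knows the answer
            sources.append("strong")
        else:
            chosen.append(weak)     # fall back to weak supervision
            sources.append("weak")
    return chosen, sources


# Hypothetical classifier scores and labels for four questions.
p_ik = [0.9, 0.2, 0.6, 0.4]
strong_labels = ["A", "B", "C", "D"]
weak_labels = ["A", "C", "B", "D"]

labels, sources = select_labels(p_ik, strong_labels, weak_labels)
print(labels, sources)
```

Only the second and fourth questions fall back to the weak supervisor here, which is the sense in which weak supervision is avoided when it is unnecessary.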

Selective Weak-to-Strong Generalization

While deep generative models have significantly advanced representation learning, they may inherit or amplify biases and fairness issues by encoding sensitive attributes alongside predictive features. Enforcing strict independence in disentanglement is often unrealistic when target and sensitive factors are naturally correlated. To address this challenge, we propose CAD-VAE (Correlation-Aware Disentangled VAE), which introduces a correlated latent code to capture the information shared between the target and sensitive attributes. Given this correlated latent, our method effectively separates overlapping factors without extra domain knowledge by directly minimizing the conditional mutual information between target and sensitive codes. A relevance-driven optimization strategy refines the correlated code by efficiently capturing essential correlated features and eliminating redundancy. Extensive experiments on benchmark datasets demonstrate that CAD-VAE produces fairer representations, realistic counterfactuals, and improved fairness-aware image editing.
