Mixture-of-Experts (MoE) architectures have become a cornerstone for scaling large language models (LLMs) efficiently, yet how their sparse structure shapes knowledge acquisition during pre-training remains poorly understood. Existing interpretability methods predominantly focus on post-hoc analysis of dense models, overlooking the dynamic, architectural differences that define MoE. To bridge this gap, we introduce Gated-LPI, a neuron-level attribution metric that decomposes log-probability increase across neurons. We present the first time-resolved comparison of knowledge acquisition dynamics in MoE versus dense architectures by tracking checkpoints across 1.2M training steps ($\approx 5.2$T tokens). Our analysis reveals three key phenomena: (1) Early consolidation. The MoE model locks into a stable importance profile within $<$100K steps, whereas the dense model remains volatile throughout training. (2) Low-entropy backbone. The top $\approx 1\%$ of MoE neurons consistently receive $>$45\% of positive updates, creating a persistent, high-utility core absent in the dense baseline. (3) Functional robustness. Masking the ten most important MoE attention heads reduces relational HIT@10 by $<$10\%, compared with $>$50\% for the dense model, showing that sparsity fosters distributed, rather than brittle, knowledge storage. These phenomena collectively demonstrate that sparsity fosters an intrinsically stable and distributed computational backbone from early in training. Together, these findings bridge the gap between sparse architectures and training-time interpretability, offering actionable insights for expert-pruning and routing-strategy design in next-generation MoE models.
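The abstract defines Gated-LPI only at a high level: a neuron-level attribution that decomposes the log-probability increase between checkpoints, with expert routing gates determining which neurons can contribute. The sketch below is a first-order, illustrative toy of that idea, not the paper's actual implementation; the function name `gated_lpi` and all array names are hypothetical, and a real metric would operate on full model activations rather than the random vectors used here.

```python
import numpy as np

def gated_lpi(acts_old, acts_new, gates_old, gates_new, readout):
    """Toy sketch of a gated neuron-level attribution (hypothetical).

    acts_old, acts_new   : (n,) neuron activations at two checkpoints
    gates_old, gates_new : (n,) router gate values (all ones for a dense model;
                           mostly zero under sparse top-k routing)
    readout              : (n,) projection of each neuron onto the target
                           token's logit direction

    Returns a per-neuron vector whose entries attribute the change in the
    target log-probability to individual neurons; entries for neurons the
    router never activates are exactly zero.
    """
    contrib_old = gates_old * acts_old * readout
    contrib_new = gates_new * acts_new * readout
    return contrib_new - contrib_old

# Toy usage: 8 neurons, sparse routing keeps only the first 2 active.
rng = np.random.default_rng(0)
n = 8
acts0, acts1 = rng.normal(size=n), rng.normal(size=n)
gates = np.where(np.arange(n) < 2, 1.0, 0.0)  # top-2 routing mask
readout = rng.normal(size=n)

lpi = gated_lpi(acts0, acts1, gates, gates, readout)
ranking = np.argsort(-np.abs(lpi))  # rank neurons by attributed update magnitude
```

Under this kind of decomposition, ranking neurons by cumulative positive attribution over training would surface the "low-entropy backbone" the abstract describes: a small set of neurons receiving most of the positive updates.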