The Mixture-of-Experts (MoE) architecture has emerged as a promising paradigm for scaling large language models (LLMs) by activating only a sparse subset of experts per input. However, its massive parameter count remains a major obstacle to efficient deployment. Existing pruning methods often ignore two key aspects: the intricate structural dependencies among experts and the heterogeneous importance of different layers. To tackle these issues, we propose C-GNN-PRUNE, a unified and structure-aware compression framework tailored for MoE models. Our method introduces an Entropy-Guided Allocation Module that dynamically assigns pruning budgets by leveraging expert activation entropy, enabling adaptive handling of inter-layer heterogeneity. To preserve structural collaboration patterns, we construct an expert interaction graph that fuses functional similarity and routing behavior, and employ a GNN-Based Embedding Module to learn structure-aware expert representations. These embeddings, along with co-activation patterns, are fed into a Community Detection Module to identify expert clusters for structured pruning. Finally, an Activation-Aware Selection Module retains the most critical experts in each community, balancing sparsity and expressiveness. Experiments on multiple open-source MoE models demonstrate that C-GNN-PRUNE consistently outperforms prior methods under various pruning ratios, achieving better trade-offs between compression and accuracy. This framework provides a modular and effective solution for structure-preserving compression of large-scale MoE models.
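To illustrate the entropy-guided allocation idea described above, the sketch below computes the Shannon entropy of each layer's mean expert-routing distribution and splits a global expert-keep budget across layers in proportion to it. This is a minimal illustration of the general principle, not the authors' implementation: the function names (`activation_entropy`, `allocate_budgets`) and the proportional allocation rule are assumptions for exposition.

```python
import math

def activation_entropy(routing_probs):
    """Shannon entropy of the mean expert-routing distribution in one layer.

    routing_probs: list of per-token expert probability vectors
    (e.g. router softmax outputs, each summing to ~1).
    """
    num_experts = len(routing_probs[0])
    mean = [sum(tok[e] for tok in routing_probs) / len(routing_probs)
            for e in range(num_experts)]
    total = sum(mean)
    mean = [p / total for p in mean]  # renormalize against numeric drift
    return -sum(p * math.log(p + 1e-12) for p in mean)

def allocate_budgets(layer_probs, total_keep):
    """Hypothetical allocation rule: split a global expert-keep budget
    across layers proportionally to activation entropy, so layers whose
    experts are used more evenly (higher entropy) retain more experts,
    while low-entropy layers are pruned more aggressively.
    """
    entropies = [activation_entropy(p) for p in layer_probs]
    z = sum(entropies)
    return [max(1, round(total_keep * h / z)) for h in entropies]
```

A layer whose router spreads probability uniformly over its experts attains the maximum entropy `log(num_experts)` and therefore receives the largest share of the keep budget under this rule.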