Singapore

Hybrid action spaces, which combine discrete choices with continuous parameters, are prevalent in domains such as robot control and game AI. However, efficiently modeling and optimizing a hybrid discrete-continuous action space remains a fundamental challenge, mainly due to limited policy expressiveness and poor scalability in high-dimensional settings.
To address this challenge, we view the hybrid action space problem as a fully cooperative game and propose the Cooperative Hybrid Diffusion Policies (CHDP) framework to solve it.
CHDP employs two cooperative agents that leverage a discrete and a continuous diffusion policy, respectively.
The continuous policy is conditioned on the discrete action's representation, explicitly modeling the dependency between them.
This cooperative design allows the diffusion policies to leverage their expressiveness to capture complex distributions in their respective action spaces.
To mitigate the update conflicts arising from simultaneous policy updates in this cooperative setting, we employ a sequential update scheme that fosters co-adaptation.
Moreover, to improve scalability when learning in a high-dimensional discrete action space, we construct a codebook that embeds the action space into a low-dimensional latent space.
This mapping enables the discrete policy to learn in a compact, structured space.
Finally, we design a Q-function-based guidance mechanism to align the codebook's embeddings with the discrete policy's representation during training.
On challenging hybrid action benchmarks, CHDP outperforms state-of-the-art methods by up to $19.3\%$ in success rate.
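
The codebook idea in the abstract can be illustrated with a small sketch. The abstract does not say how a latent vector is decoded back into a discrete action, so the nearest-neighbor lookup below is an assumption, and the names `nearest_code`, `codebook`, and `latent` are hypothetical. It only shows how a low-dimensional latent output could select a discrete action whose embedding then conditions the continuous policy.

```python
import math
from typing import List, Sequence


def nearest_code(latent: Sequence[float], codebook: List[Sequence[float]]) -> int:
    """Return the index of the codebook embedding closest to `latent`.

    Assumption: squared Euclidean distance; the paper may decode differently.
    """
    best_index, best_dist = 0, math.inf
    for index, embedding in enumerate(codebook):
        dist = sum((a - b) ** 2 for a, b in zip(latent, embedding))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index


# Hypothetical codebook: 4 discrete actions embedded in a 2-D latent space.
codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]

# Stand-in for the output of the discrete policy (here just a fixed vector).
latent = (0.9, 0.2)

action = nearest_code(latent, codebook)  # discrete action index
condition = codebook[action]             # embedding passed to the continuous policy
```

In this toy setup the discrete policy works in two dimensions regardless of how many discrete actions the codebook holds, which is the scalability argument the abstract makes.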

AAAI 2026

CHDP: Cooperative Hybrid Diffusion Policies for Reinforcement Learning in Parameterized Action Space

reinforcement learning

poster

We are pleased to announce the Fortieth AAAI Conference on Artificial Intelligence (AAAI-26), which will be held in Singapore EXPO from January 20 to January 27, 2026.

The purpose of the AAAI conference series is to promote research in Artificial Intelligence (AI) and foster scientific exchange between researchers, practitioners, scientists, students, and engineers across the entirety of AI and its affiliated disciplines. AAAI-26 will feature technical paper presentations, special tracks, invited speakers, workshops, tutorials, poster sessions, senior member presentations, competitions, exhibit programs, and a range of other activities to be announced.

In real-world time-series modelling, graph structures are widely adopted because they explicitly encode node topology and capture complex network dynamics. In practice, however, a complete graph is often partitioned across multiple parties; each party can access only its local sub-graph and, owing to privacy regulations, cannot share topology or data, creating pervasive data silos. Federated Graph Learning (FGL) offers a privacy-preserving collaborative-learning paradigm, yet current methods still face two key challenges: (1) they implicitly capture inter-edge information, making it difficult to accurately reconstruct the global structure and consequently degrading model performance; (2) explicitly exchanging inter-edge information may leak graph-topology privacy. To overcome these obstacles, we propose FedSkeleton, a privacy-preserving framework for time-series prediction that comprises a Skeleton Construction Module and a Dual-stream Forecasting Module, enabling global dependency capture without revealing the topology. Extensive experiments show that FedSkeleton consistently outperforms existing baselines and even surpasses centralised models with full-graph access. In addition, we conduct comprehensive security analysis, communication-cost evaluation and scalability experiments, demonstrating that FedSkeleton effectively resists common attacks, keeps communication overhead manageable and remains robust with respect to key hyper-parameters and the number of participating parties.

FedSkeleton: Secure Multi-Party Graph Skeleton Construction for Privacy-Preserving Federated Time-Series Forecasting

Although deep learning has substantially advanced speech separation in recent years, most existing studies continue to prioritize separation quality while overlooking computational efficiency, an essential factor for low-latency speech processing in real-time applications. In this paper, we propose SepPrune, the first structured pruning framework specifically designed to compress deep speech separation models and reduce their computational cost. SepPrune begins by analyzing the computational structure of a given model to identify layers with the highest computational burden. It then introduces a differentiable masking strategy to enable gradient-driven channel selection. Based on the learned masks, SepPrune prunes redundant channels and fine-tunes the remaining parameters to recover performance. Extensive experiments demonstrate that this learnable pruning paradigm yields substantial advantages for channel pruning in speech separation models, outperforming existing methods. Notably, a model pruned with SepPrune can recover 85% of the performance of a pre-trained model (trained over hundreds of epochs) with only one epoch of fine-tuning, and achieves convergence 36x faster than training from scratch.

SepPrune: Structured Pruning for Efficient Deep Speech Separation

The Mixture-of-Experts (MoE) architecture has emerged as a promising paradigm for scaling large language models (LLMs) by activating only a sparse subset of experts per input. However, its massive parameter size remains a major obstacle to efficient deployment. Existing pruning methods often ignore two key aspects: the intricate structural dependencies among experts and the heterogeneous importance of different layers. To tackle these issues, we propose C-GNN-PRUNE, a unified and structure-aware compression framework tailored for MoE models. Our method introduces an Entropy-Guided Allocation Module that dynamically assigns pruning budgets by leveraging expert activation entropy, enabling adaptive handling of inter-layer heterogeneity. To preserve structural collaboration patterns, we construct an expert interaction graph that fuses functional similarity and routing behavior, and employ a GNN-Based Embedding Module to learn structure-aware expert representations. These embeddings, along with co-activation patterns, are fed into a Community Detection Module to identify expert clusters for structured pruning. Finally, an Activation-Aware Selection Module retains the most critical experts in each community, balancing sparsity and expressiveness. Experiments on multiple open-source MoE models demonstrate that C-GNN-PRUNE consistently outperforms prior methods under various pruning ratios, achieving better trade-offs between compression and accuracy. This framework provides a modular and effective solution for structure-preserving compression of large-scale MoE models.

C-GNN-PRUNE: A Unified Graph-Based Framework for Structure-Aware Pruning of Mixture-of-Experts Models
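
The C-GNN-PRUNE abstract mentions assigning pruning budgets from expert activation entropy but gives no formula. The sketch below is therefore only an illustration under stated assumptions: it computes the Shannon entropy of each layer's routing counts and, as a hypothetical rule, keeps more experts in layers whose activations are spread more evenly. The function names and the proportional rule are not taken from the paper.

```python
import math
from typing import List


def activation_entropy(counts: List[int]) -> float:
    """Shannon entropy (in nats) of how often each expert in a layer was activated."""
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in probs)


def allocate_keep_budgets(layer_counts: List[List[int]], total_keep: int) -> List[int]:
    """Split a keep budget across layers in proportion to activation entropy.

    Assumption: higher entropy means more experts are kept. Each layer keeps
    at least one expert and at most all of them, so after clamping the
    budgets may not sum exactly to `total_keep`.
    """
    entropies = [activation_entropy(c) for c in layer_counts]
    total_entropy = sum(entropies)
    budgets = []
    for counts, entropy in zip(layer_counts, entropies):
        share = entropy / total_entropy if total_entropy > 0 else 1 / len(layer_counts)
        keep = round(total_keep * share)
        budgets.append(max(1, min(len(counts), keep)))
    return budgets


# Hypothetical routing counts: layer 0 is balanced, layer 1 is dominated by one expert.
layer_counts = [[25, 25, 25, 25], [97, 1, 1, 1]]
budgets = allocate_keep_budgets(layer_counts, total_keep=4)
```

With these toy counts the balanced layer keeps all four experts while the skewed layer is clamped to one, which is one way to read the "inter-layer heterogeneity" the abstract refers to.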

Tabular data synthesis is a key technique for protecting data privacy and addressing class imbalance, yet existing generative models struggle to capture the complex intrinsic structure of the data. To overcome this limitation, we propose TabGeoFlow, a novel geometric flow matching model for tabular data synthesis. The core innovation of TabGeoFlow is the injection of an explicit geometric inductive bias into the conditional flow matching framework. We decompose the learned vector field into local tangent and normal components of the data manifold. By dynamically suppressing the predicted normal component via a controlling loss function, we constrain the generative path to follow the data's intrinsic structure. Implemented with a shared backbone for parameter efficiency, TabGeoFlow outperforms existing baseline methods across multiple benchmark datasets, demonstrating superior data quality in terms of both statistical similarity and downstream machine learning utility.

TabGeoFlow: A Geometric Flow Matching Model for Tabular Data Synthesis
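
The tangent/normal decomposition described in the TabGeoFlow abstract can be sketched with plain linear algebra. The abstract does not say how the tangent space is estimated or how the loss is weighted, so the example assumes an orthonormal tangent basis is already given and uses the squared norm of the normal component as a stand-in penalty. The names `split_tangent_normal` and `normal_penalty` are hypothetical.

```python
from typing import List, Sequence, Tuple


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    return sum(a * b for a, b in zip(u, v))


def split_tangent_normal(
    vector: Sequence[float], tangent_basis: List[Sequence[float]]
) -> Tuple[List[float], List[float]]:
    """Project `vector` onto an orthonormal tangent basis; the remainder is the normal part."""
    tangent = [0.0] * len(vector)
    for basis_vec in tangent_basis:
        coeff = dot(vector, basis_vec)
        tangent = [t + coeff * b for t, b in zip(tangent, basis_vec)]
    normal = [v - t for v, t in zip(vector, tangent)]
    return tangent, normal


def normal_penalty(vector: Sequence[float], tangent_basis: List[Sequence[float]]) -> float:
    """Squared norm of the normal component (assumed form of the suppression term)."""
    _, normal = split_tangent_normal(vector, tangent_basis)
    return dot(normal, normal)


# Toy case: the data manifold is locally the x-y plane inside 3-D space.
basis = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
tangent, normal = split_tangent_normal((3.0, 4.0, 5.0), basis)
penalty = normal_penalty((3.0, 4.0, 5.0), basis)
```

Driving such a penalty toward zero leaves only the tangent part of the predicted vector, so the generated path stays close to the local plane of the data.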

Antibody design is critically important in biomedical and therapeutic contexts but remains extremely challenging due to the complexity of antibody sequence–structure relationships and stringent antigen specificity requirements. Traditional computational approaches rely on multi-stage pipelines and often overlook all-atom details (e.g., side-chain conformations) as well as fine-grained geometric features, resulting in limited effectiveness. To overcome these limitations, we propose Dynamic Geometric Equivariant Network (DGENet), an end-to-end all-atom antibody design model that integrates a geometric-kinematic equivariant dynamic optimization module (GK-EDO) with an all-atom E(3)-equivariant message-passing architecture. This framework enables iterative optimization of antibody structures under explicit geometric and kinematic constraints, generating complete antibody structures (including backbone and side chains) while jointly optimizing the sequences and 3D structures of the complementarity-determining regions (CDRs). DGENet also introduces a novel virtual anchor docking mechanism that employs an adaptive PNet-Kabsch module to explicitly guide antibody–antigen binding and achieve precise bound conformations. Evaluations on multiple benchmark datasets demonstrate that DGENet exhibits outstanding performance in antibody structure and sequence generation as well as in designing high-affinity antibodies, underscoring its reliability as an advanced antibody design model.

Dynamic Geometric Equivariant Network for Full-Atom Antibody Design

Conformal Prediction (CP) is a popular method for uncertainty quantification that converts a pretrained model's point prediction into a prediction set, with the set size reflecting the model's confidence. Although existing CP methods are guaranteed to achieve marginal coverage, they often exhibit imbalanced coverage across classes under long-tailed label distributions, tending to over-cover the head classes at the expense of under-covering the remaining tail classes. This under-coverage is particularly concerning, as it undermines the reliability of the prediction sets for minority classes, even when coverage is ensured on average. In this paper, we propose the Tail-Aware Conformal Prediction (TACP) method, which mitigates the under-coverage of the tail classes by utilizing the long-tailed structure and narrowing the head-tail coverage gap. Theoretical analysis shows that it consistently achieves a smaller head-tail coverage gap than standard methods. To further improve coverage balance across all classes, we introduce an extension of TACP, soft TACP (sTACP), based on a reweighting mechanism. The proposed framework can be combined with various non-conformity scores, and experiments on multiple long-tailed benchmark datasets demonstrate the effectiveness of our methods.

Conformal Prediction Meets Long-tail Classification
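
To make the head-tail coverage gap from the conformal prediction abstract concrete, here is a minimal sketch of standard split conformal prediction with a per-class coverage check. It is the baseline procedure, not TACP or sTACP, whose details the abstract does not give. The non-conformity score (one minus the probability of the label) and all data values are illustrative assumptions.

```python
import math
from typing import Dict, List, Sequence


def conformal_threshold(cal_scores: Sequence[float], alpha: float) -> float:
    """Split-conformal threshold: the ceil((n + 1) * (1 - alpha))-th smallest score."""
    n = len(cal_scores)
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        return math.inf  # too few calibration points: include every label
    return sorted(cal_scores)[rank - 1]


def prediction_set(probs: Sequence[float], threshold: float) -> List[int]:
    """Labels whose score 1 - p(label) does not exceed the threshold."""
    return [label for label, p in enumerate(probs) if 1.0 - p <= threshold]


def classwise_coverage(
    prob_rows: Sequence[Sequence[float]], labels: Sequence[int], threshold: float
) -> Dict[int, float]:
    """Fraction of test points of each true class whose prediction set contains the label."""
    hits: Dict[int, int] = {}
    counts: Dict[int, int] = {}
    for probs, y in zip(prob_rows, labels):
        counts[y] = counts.get(y, 0) + 1
        hits[y] = hits.get(y, 0) + (y in prediction_set(probs, threshold))
    return {y: hits[y] / counts[y] for y in counts}


# Illustrative calibration scores and two test points (class 0 = head, class 2 = tail).
cal_scores = [0.05, 0.1, 0.2, 0.3, 0.4, 0.6, 0.9]
threshold = conformal_threshold(cal_scores, alpha=0.25)
test_probs = [[0.7, 0.25, 0.05], [0.6, 0.3, 0.1]]
test_labels = [0, 2]
coverage = classwise_coverage(test_probs, test_labels, threshold)
```

In this toy example the single threshold covers the head-class point but misses the tail-class point, which is the imbalance the abstract sets out to reduce.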

Traditional time-series forecasting often focuses only on minimizing prediction errors, ignoring the specific requirements of the real-world applications that use the forecasts. This paper presents a new training methodology that allows a forecasting model to dynamically adjust its focus based on the importance of forecast ranges specified by the end application. Unlike previous methods that fix these ranges beforehand, our training approach breaks down predictions over the entire signal range into smaller segments, which are then dynamically weighted and combined to produce accurate forecasts within a region of interest. We tested our method on standard datasets, including a new wireless communication dataset, and found that it not only improves prediction accuracy but also enhances the performance of the end application employing the forecasting model. This research provides a basis for creating forecasting systems that better connect prediction and decision-making in various practical applications.

Goal-Oriented Time-Series Forecasting: Foundation Framework Design

LLM-based approaches have recently achieved impressive results in zero-shot stance detection. However, they still struggle in complex real-world scenarios, where stance understanding requires dynamic background knowledge, target definitions involve compound entities or events that must be explicitly linked to stance labels, and rhetorical devices such as irony often obscure the author’s actual intent. To address these challenges, we propose MSME, a **M**ulti-**S**tage, **M**ulti-**E**xpert framework for zero-shot stance detection. MSME consists of three stages: (1) *Knowledge Preparation*, where relevant background knowledge is retrieved and stance labels are clarified; (2) *Expert Reasoning*, involving three specialized modules—Knowledge Expert distills salient facts and reasons from a knowledge perspective, Label Expert refines stance labels and reasons accordingly, and Pragmatic Expert detects rhetorical cues such as irony to infer intent from a pragmatic angle; (3) *Decision Aggregation*, where a Meta-Judge integrates all expert analyses to produce the final stance prediction. Experiments on three public datasets show that MSME achieves state-of-the-art performance across the board.

MSME: A Multi-Stage Multi-Expert Framework for Zero-Shot Stance Detection

Ensuring consistently high-quality training data is essential for developing reliable machine learning systems. Recent research demonstrates that incorporating human supervision into training set debugging effectively improves model performance, especially for text classification tasks.
However, such methods often prove inapplicable to image understanding tasks, where inherently unstructured pixel data presents challenges in understanding and correcting biases.
Inspired by human-AI alignment, we introduce AACA (Attribution Analysis-based Concept Alignment), a human-in-the-loop framework that mitigates bias in the training set by aligning the concepts used by humans and AI during the decision-making process. 
Specifically, AACA comprises two primary stages: interpretable data bug discovery and targeted data augmentation.
During the data bug discovery stage, AACA identifies confounded and valid concepts to explain why a prediction failure occurs and which concept the model should focus on, using interpretability methods and human annotation.
In the targeted data augmentation stage, AACA adopts these concept-level attributions as clues to synthesize debugging instances via a text-to-image generative model.
The initial model is then retrained on the augmented set to correct prediction failures.
Comparative experiments conducted on crowdsourced annotations and real-world datasets demonstrate that AACA can accurately identify data bugs and effectively repair prediction failures, thereby significantly improving prediction performance.

Attribution Analysis-based Concept Alignment: A Human-in-the-loop Data Debugging Framework

Multi-label learning is a practical machine learning paradigm dealing with instances associated with multiple labels simultaneously. Most existing multi-label learning studies are designed under the closed-world assumption, i.e., a label space of fixed size. However, such methods encounter significant difficulties in open-set scenarios, where test data may contain unknown labels, absent from the training set, that must be recognized. Existing methods typically tackle this challenging problem through sub-labeling approximations and prototype-based comparisons, which often overlook the implicit information carried by unknown labels. To address this, we propose a novel framework, CREM (Classifier-induced REciprocal point for Multi-label open-set recognition), which rethinks the above problem from the reciprocal point perspective. Specifically, reciprocal points are formulated by explicitly constraining the opposition feature space to a learnable bounded margin. The reciprocal points can then be induced through the classifier with the instance-wise bias eliminated. Subsequently, a unified optimization framework is introduced to jointly facilitate classifier learning and reciprocal point induction. Extensive experiments demonstrate the effectiveness and superiority of the proposed CREM approach in the multi-label open-set recognition paradigm.
