Singapore

Contextual linear dueling bandits have recently garnered significant attention due to their widespread applications in important domains such as recommender systems and large language models. Classical dueling bandit algorithms are typically only applicable to a single agent. However, many applications of dueling bandits involve multiple agents who wish to collaborate for improved performance yet are unwilling to share their data. This motivates us to draw inspirations from $\textit{federated learning}$, which involves multiple agents aiming to collaboratively train their neural networks via gradient descent (GD) without sharing their raw data. Previous works have developed federated linear bandit algorithms which rely on closed-form updates of the bandit parameters (e.g., the linear function parameters) to achieve collaboration. However, in linear dueling bandits, the linear function parameters lack a closed-form expression and their estimation requires minimizing a loss function. This renders these previous methods inapplicable. In this work, we overcome this challenge through an innovative and principled combination of online gradient descent (OGD, for minimizing the loss function to estimate the linear function parameters) and federated learning, hence introducing our $\textit{federated linear dueling bandit with OGD} \texttt{(FLDB-OGD)}$ algorithm. Through rigorous theoretical analysis, we prove that $\textit{FLDB-OGD}$ enjoys a sub-linear upper bound on its cumulative regret and demonstrate a theoretical trade-off between regret and communication complexity. We conduct empirical experiments to demonstrate the effectiveness of $\textit{FLDB-OGD}$ and reveal valuable insights, such as the benefit of a larger number of agents, the regret-communication trade-off, among others.

AAAI 2026

Federated Linear Dueling Bandits

online learning & bandits;learning preferences or rankings;learning theory

poster

We are pleased to announce the Fortieth AAAI Conference on Artificial Intelligence (AAAI-26), which will be held in Singapore EXPO from January 20 to January 27, 2026.

The purpose of the AAAI conference series is to promote research in Artificial Intelligence (AI) and foster scientific exchange between researchers, practitioners, scientists, students, and engineers across the entirety of AI and its affiliated disciplines. AAAI-26 will feature technical paper presentations, special tracks, invited speakers, workshops, tutorials, poster sessions, senior member presentations, competitions, and exhibit programs, and a range of other activities to be announced.<br><br>

To access this event page, you need to log in with the **email address you registered with**. <br>Access credentials will be sent to your email from Underline -  subject line "Welcome to AAAI 2026". Please be sure to check your spam email folder if you do not see an email confirmation right away.

Please log in

To access this event page, you are required to register.
Please complete your registration to continue.

We recommend reading [**the registration information**](https://aaai.org/conference/aaai/aaai-26/registration/) first.

**Online Registration Form**: https://aaai.getregistered.net/conference-2026 

Registration Required

We are pleased to announce the Fortieth AAAI Conference on Artificial Intelligence (AAAI-26), which will be held in Singapore EXPO from January 20 to January 27, 2026.

Machine unlearning, which selectively removes harmful knowledge from a pre-trained model without retraining from scratch, is crucial for addressing privacy, regulatory compliance, and ethical concerns in Large Language Models (LLMs). However, existing unlearning methods often struggle to thoroughly remove harmful knowledge, leaving residual harmful knowledge that can be easily recovered. To address these limitations, we propose Knowledge Density-Guided Unlearning via Blocks Reinsertion (KUnBR), a novel approach that first identifies layers with rich harmful knowledge and then thoroughly eliminates the harmful knowledge via re-insertion strategy. Our method introduces knowledge density estimation to quantify and locate layers containing the most harmful knowledge, enabling precise unlearning. Additionally, we design a layer re-insertion strategy that extracts and re-inserts harmful knowledge-rich layers into the original LLM, bypassing gradient obstruction caused by cover layers and ensuring effective gradient propagation during unlearning. Extensive experiments conducted on several unlearning and general capability benchmarks demonstrate that KUnBR achieves state-of-the-art forgetting performance while maintaining model utility.

Beyond Superficial Forgetting: Thorough Unlearning Through Knowledge Density Estimation and Block Re-Insertion

Graph neural networks (GNNs) have revolutionized recommender systems by effectively modeling complex user-item interactions, yet data sparsity and the item cold-start problem significantly impair performance, particularly for new items with limited or no interaction history. While multimodal content offers a promising solution, existing methods result in suboptimal representations for new items due to noise and entanglement in sparse data. To address this, we transform multimodal recommendation into discrete semantic tokenization. We present Sparse-Regularized Multimodal Tokenization for Cold-Start Recommender Systems (MoToRec), a framework centered on a sparsely-regularized Residual Quantized Variational Autoencoder (RQ-VAE) that generates a compositional semantic code of discrete, interpretable tokens, promoting disentangled representations. MoToRec’s architecture is enhanced by three synergistic components: (1) a sparsely regularized RQ-VAE that promotes disentangled representations, (2) a novel adaptive rarity amplification that promotes prioritized learning for cold-start items, and (3) a hierarchical multi-source graph encoder for robust signal fusion with collaborative signals. Extensive experiments on three large-scale datasets demonstrate MoToRec’s superiority over state-of the-art methods in both overall and cold-start scenarios. Our work validates that discrete tokenization provides an effective and scalable alternative for mitigating the long-standing cold-start challenge.

MoToRec: Sparse-Regularized Multimodal Tokenization for Cold-Start Recommender

Recursive transformer (RT) is a promising parameter-sharing technique for reducing computational burden of large-scale model. While RT has been successfully applied to large language models (LLMs), its effectiveness in automatic speech recognition (ASR) remains limited, despite the parallel trend of model scaling in the speech domain. In this paper, we reveal that conventional RT designs for LLMs are suboptimal for speech recognition, primarily because they do not fully consider the layer-wise specialization inherent in the ASR architecture, where lower layers focus on phonetic features and upper layers capture linguistic localization. To address this, we propose BiCycle, a novel RT scheme tailored for ASR. In particular, we firstly analyze attention patterns in a pre-trained ASR model to divide its layers into phonetic and linguistic groups. BiCycle then constructs an efficient RT model by transferring the pre-trained model’s weights in a step-wise manner and applies recursion separately to the phonetic and linguistic groups, preventing conflicts between their roles. Extensive experimental results confirm that the proposed method not only preserves the original ASR mechanism but also outperforms conventional RT approaches.

BiCycle: Group-wise Recursive Transformer Based on ASR Mechanism

Multimodal recommender systems have emerged as a pivotal paradigm for harnessing diverse data modalities to deliver personalized services. Contemporary research predominantly focuses on integrating heterogeneous modality information through graph learning. However, these approaches face two key challenges: (1) the inherent complexity of modalities, characterized by entangled redundant signals and noise; and (2) the challenge of effectively integrating multimodal representations, each of which may exert varying degrees of influence on users' preferences. To address these challenges, we propose a novel Collaboration-Guided $\underline{M}$ultimodal $\underline{D}$isentanglement and $\underline{H}$ierarchical Fusion for $\underline{Rec}$ommendation (DHMRec), which simultaneously achieves intra-modal denoising disentanglement and inter-modal hierarchical fusion. Specifically, we introduce a collaboration-related modality disentanglement module to distinguish between modality-common and modality-specific features. Then, through multi-view graph learning to capture both item-item dependencies and user-item interaction patterns. Additionally, we implement hierarchical fusion between the disentangled multimodal features and ID embeddings using a positive-negative attention-aware fusion module and an interaction distribution-based alignment module. Extensive experiments on three benchmarks demonstrate that our DHMRec surpasses various state-of-the-art baselines, highlighting its effectiveness in intra-modal disentanglement and multimodal features fusion.

DHMRec: Collaboration-Guided Multimodal Disentanglement and Hierarchical Fusion for Recommendation

Data augmentation is an effective technique for regularizing deep networks, which helps to enhance the generalizability and robustness of the model. However, in the field of medical imaging, traditional data augmentation techniques such as cropping, rotation, and degradation may inadvertently alter the critical characteristics of pathological lesions. Conventional semantic augmentation methods, such as altering the color and contrast of the object background, may also affect the structural features of medical images in uncontrolled semantic directions. Such operational conditions compromise the model's diagnostic reliability in medical contexts. To address this issue, we propose a surprisingly efficient implicit augmentation-invariant learning strategy (AILS) via variational Bayesian inference on differentially constrained feature manifolds. Parameterizing probability measures over tangent space through deep networks enables precise estimation of semantic direction distributions. Subsequently, geodesic-aware semantic features are sampled from the reparameterized variational posterior using exponential mapping, achieving semantic-consistent augmentation. Simultaneously, to mine augmentation distribution invariance, we design the AiHLoss, which constrains the augmentation distribution to facilitate the network to learn augmentation invariance. Extensive experiments demonstrate that AILS exhibits high performance on public medical image datasets, outperforming existing augmentation methods.

Augmentation-invariant Learning Strategy via Data Augmentation for Improving Model Generalization

Recent progress in artificial intelligence has encouraged numerous attempts to understand and decode human visual perception and cognition from brain signals. These prior works typically align neural activity independently with semantic and perceptual features extracted from images using pre-trained vision models. However, they fail to account for two key challenges: (1) the modality gap arising from the natural difference in the information level of representation between brain signals and images, and (2) the fact that semantic and perceptual features are highly entangled within neural activity. To address these issues, we utilize hyperbolic space, which is well-suited for considering differences in amount of information and has the geometric property that geodesics between two points naturally bend toward the origin, where the representational capacity is lower. Leveraging these properties, we propose a novel framework, Hyperbolic Feature Interpolation (HyFI), which interpolates between semantic and perceptual visual features along hyperbolic geodesics. This enables both the fusion and compression of perceptual and semantic information, effectively reflecting the limited expressiveness of brain signals and the entangled nature of these features. As a result, it facilitates better alignment between brain and visual features. We demonstrate that HyFI achieves state-of-the-art performance in zero-shot brain-to-image retrieval, improving Top-1 accuracy by up to +16.8% on THINGS-EEG and +9.1% on THINGS-MEG.

HyFI: Hyperbolic Feature Interpolation for Brain-Vision Alignment

We present a novel framework for automated interior design that combines large language models (LLMs) with grid-based integer programming to jointly optimize room layout and furniture placement. Given a textual prompt, the LLM-driven agent workflow extracts structured design constraints related to room configurations and furniture arrangements. These constraints are encoded into a unified grid-based representation inspired by ``Modulor". Our formulation accounts for key design requirements, including corridor connectivity, room accessibility, spatial exclusivity, and user-specified preferences. To improve computational efficiency, we adopt a coarse-to-fine optimization strategy that begins with a low-resolution grid to solve a simplified problem and guides the solution at the full resolution. Experimental results across diverse scenarios demonstrate that our joint optimization approach significantly outperforms existing two-stage design pipelines in solution quality, and achieves notable computational efficiency through the coarse-to-fine strategy.

Co-Layout: LLM-driven Co-optimization for Interior Layout

Hallucination remains a critical challenge in large language models (LLMs), hindering the development of reliable multimodal LLMs (MLLMs). Existing solutions often rely on human intervention or underutilize the agent’s ability to autonomously mitigate hallucination.
To address these limitations, we draw inspiration from how humans make reliable decisions in the real world. They begin with introspective reasoning to reduce uncertainty and form an initial judgment, then rely on external verification from diverse perspectives to reach a final decision. Motivated by this cognitive paradigm, we propose InEx, a training-free, multi-agent framework designed to autonomously mitigate hallucination. InEx introduces internal introspective reasoning, guided by entropy-based uncertainty estimation, to improve the reliability of the decision agent’s reasoning process. The agent first generates a response, which is then iteratively verified and refined through external cross-modal multi-agent collaboration with the editing agent and self-reflection agents, further enhancing reliability and mitigating hallucination. Extensive experiments show that InEx consistently outperforms existing methods, achieving 4%–27% gains on general and hallucination benchmarks, and demonstrating strong robustness. Our code will be released upon acceptance.

InEx: Hallucination Mitigation via Introspection and Cross-Modal Multi-Agent Collaboration

The scarcity of paired data severely limits the performance and generalization of learning-based underwater image enhancement (UIE) methods. This challenge is particularly prominent in scenes with complex degradations. Semi-supervised learning has emerged as a promising solution by enabling the utilization of large-scale unlabeled data. However, its effectiveness is limited by the use of static, model-agnostic metrics for pseudo-label reliability assessment. To address this, we propose SEA-PACE, a novel semi-supervised framework that integrates model-aware uncertainty modeling and self-paced consistency learning to fully exploit unlabeled data for UIE. Specifically, we design a Model-Aware Reliability Estimator (MARE) that quantifies the uncertainty of the teacher model's predictions through Gaussian Process Regression in latent feature space. The resulting uncertainty is then transformed into reliability weights via a rank-based mapping. Additionally, we apply the Self-Paced Consistency Learning (SPCL) strategy that employs a loss-aware schedule to dynamically prioritize high-confidence pseudo-labels, gradually incorporating more challenging samples during training. Extensive experiments on several public UIE benchmarks demonstrate that SEA-PACE consistently surpasses state-of-the-art methods in both visual quality and generalization capability. The code is publicly available at: \url{https://github.com/haigouboss/SEA-PACE}

SEA-PACE: Semi-Supervised Underwater Image Enhancement via Gaussian Process–Assisted Self-Paced Learning

Modeling continuous-time dynamics from sparse and irregularly-sampled time series remains a fundamental challenge. Neural controlled differential equations provide a principled framework for such tasks, yet their performance is highly sensitive to the choice of control path constructed from discrete observations. Existing methods commonly employ fixed interpolation schemes, which impose simplistic geometric assumptions that often misrepresent the underlying data manifold, particularly under high missingness. We propose FlowPath, a novel approach that learns the geometry of the control path via an invertible neural flow. Rather than merely connecting observations, FlowPath constructs a continuous and data-adaptive manifold, guided by invertibility constraints that enforce information-preserving and well-behaved transformations. This inductive bias distinguishes FlowPath from prior unconstrained learnable path models. Empirical evaluations on 18 benchmark datasets and a real-world case study demonstrate that FlowPath consistently achieves statistically significant improvements in classification accuracy over baselines using fixed interpolants or non-invertible architectures. These results highlight the importance of modeling not only the dynamics along the path but also the geometry of the path itself, offering a robust and generalizable solution for learning from irregular time series.

Content not yet available

Next from AAAI 2026

Beyond Superficial Forgetting: Thorough Unlearning Through Knowledge Density Estimation and Block Re-Insertion

Stay up to date with the latest Underline news!

PRESENTATIONS

CONFERENCES

COMPANY

RESOURCES