Questionnaire data serve as a valuable resource across numerous scientific domains, offering insights into human behavior, health, and social trends. Traditional downsampling-based representation learning methods, such as standardization and one-hot encoding, reformat these data into tabular structures that inherently discard semantic richness and obscure inter-sample and inter-feature relationships. Consequently, advanced deep learning models often underperform compared to simpler approaches like gradient-boosted decision trees (GBDT), due to their limited capacity to extract meaningful representations from semantically sparse inputs. To address this limitation, we introduce SemantiQ, a novel upsampling-based representation learning framework that embeds questionnaire responses into a unified semantic space. Leveraging Retrieval-Augmented Generation (RAG) in conjunction with large language models (LLMs), SemantiQ transforms question text, option text, and external knowledge into semantically enriched natural language statements. These statements are then encoded into semantic embeddings, which are further refined through a three-stage training mechanism and test-time training (TTT), enabling the model to capture complex sample- and feature-wise dependencies. Extensive experiments on multiple real-world datasets demonstrate that SemantiQ consistently outperforms state-of-the-art baselines.
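To make the "upsampling" idea concrete, the sketch below illustrates the verbalization step described in the abstract: rather than one-hot encoding a response, a (question, selected option) pair is rendered as a natural-language statement and augmented with retrieved external knowledge, producing text that an encoder can later embed. This is a minimal illustrative sketch, not the paper's implementation; the function name `verbalize_response`, the `KNOWLEDGE` dictionary (a stand-in for a real RAG retriever), and the statement template are all assumptions.

```python
# Illustrative stand-in for external knowledge retrieved via RAG.
# In SemantiQ this would come from an actual retrieval pipeline, not a dict.
KNOWLEDGE = {
    "smoking_status": "Smoking is a known risk factor for cardiovascular disease.",
}

def verbalize_response(question: str, option: str, feature: str) -> str:
    """Render a questionnaire response as an enriched natural-language statement.

    Combines the question text and the selected option text, then appends any
    retrieved knowledge associated with the feature (hypothetical schema).
    """
    statement = f'To the question "{question}", the respondent answered "{option}".'
    context = KNOWLEDGE.get(feature)
    return f"{statement} {context}" if context else statement

# Example: a single tabular cell (smoking_status = "Yes, daily") becomes a
# semantically rich sentence instead of a sparse one-hot vector.
enriched = verbalize_response("Do you currently smoke?", "Yes, daily", "smoking_status")
print(enriched)
```

The resulting statements would then be passed to a text encoder to obtain the semantic embeddings that the three-stage training mechanism and TTT refine.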