Singapore

Task-specific data selection, which aims to identify the most relevant training instances from a large corpus to optimize performance on a target task, is a critical challenge in modern AI. Prevailing methods typically rely on either representation clustering or gradient-based influence estimation, and both have notable limitations. Representation-based methods rely on static features; they measure semantic proximity but are agnostic to the learning process. Influence-based methods capture optimization directions, but they often focus narrowly on aligning with the validation loss, which may not fully correlate with the desired capabilities. To address these issues, we propose TRACE, a novel algorithm that considers data consistency in both the optimization direction and the representation space, and performs TRajectory-based Activation Change Estimation to select instruction data. Specifically, TRACE first performs a targeted weight update using the validation set. It then captures the optimization trajectory by calculating the change in neuron activations for each sample before and after this update. By selecting data whose activation changes are most similar to those of the validation set, TRACE ensures alignment in both the representational and optimization domains. Our experiments demonstrate that TRACE outperforms baseline methods across various tasks, particularly in complex, data-scarce scenarios.
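The three steps described in the abstract (validation-set weight update, per-sample activation change, similarity-based selection) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the single linear layer, the squared-error loss, the learning rate, the use of cosine similarity against the mean validation change, and every function name are assumptions made for the example.

```python
import math

def matvec(W, x):
    """Activations of a single linear layer (assumed toy model)."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0

def validation_step(W, val_set, lr=0.1):
    """One gradient step on mean squared error over the validation set (assumed loss)."""
    rows, cols = len(W), len(W[0])
    grad = [[0.0] * cols for _ in range(rows)]
    for x, y in val_set:
        pred = matvec(W, x)
        for i in range(rows):
            err = pred[i] - y[i]
            for j in range(cols):
                grad[i][j] += 2.0 * err * x[j] / len(val_set)
    return [[W[i][j] - lr * grad[i][j] for j in range(cols)] for i in range(rows)]

def activation_change(W_before, W_after, x):
    """Change in activations for input x caused by the weight update."""
    before, after = matvec(W_before, x), matvec(W_after, x)
    return [a - b for a, b in zip(after, before)]

def select_by_activation_change(W, val_set, candidates, k):
    """Return indices of the k candidates whose activation change best matches the validation set."""
    W_new = validation_step(W, val_set)
    val_changes = [activation_change(W, W_new, x) for x, _ in val_set]
    dim = len(val_changes[0])
    target = [sum(c[i] for c in val_changes) / len(val_changes) for i in range(dim)]
    scored = [(cosine(activation_change(W, W_new, x), target), idx)
              for idx, x in enumerate(candidates)]
    scored.sort(reverse=True)
    return [idx for _, idx in scored[:k]]
```

Calling `select_by_activation_change(W, val_set, candidates, k)` ranks candidate inputs by how closely their activation change follows the validation set's; a real system would presumably use a deep network and its own similarity measure.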

AAAI 2026

TRACE: Trajectory-based Activation Change Estimation for Task-specific Data Selection

interpretability, analysis, and evaluation of NLP models

poster

We are pleased to announce the Fortieth AAAI Conference on Artificial Intelligence (AAAI-26), which will be held in Singapore EXPO from January 20 to January 27, 2026.

The purpose of the AAAI conference series is to promote research in Artificial Intelligence (AI) and foster scientific exchange between researchers, practitioners, scientists, students, and engineers across the entirety of AI and its affiliated disciplines. AAAI-26 will feature technical paper presentations, special tracks, invited speakers, workshops, tutorials, poster sessions, senior member presentations, competitions, exhibit programs, and a range of other activities to be announced.



Understanding how localized changes in one variable affect others in multivariate time series is essential for diagnostics and decision-making in complex systems. Existing models often fail to capture realistic inter-feature dynamics when simulating "what-if" scenarios, leading to inaccurate or uncorrelated reconstructions. We propose CFORVAE, a variational autoencoder framework that explicitly addresses this limitation by combining temporal decomposition with frequency-domain feature correlation modeling. Our architecture uses a dual-path encoding of trend and seasonal components, each projected into attention-pooled latent spaces, and applies Fourier Neural Operators (FNO) to capture cross-feature dependencies in the spectral domain. This decomposition-correlation design enables component-specific latent manipulation and ensures that local modifications propagate realistically across correlated variables. Through extensive experiments, we show that CFORVAE outperforms state-of-the-art baselines in preserving temporal and feature-level dependencies, especially under adjustment-based reconstructions, making it a powerful tool for interpretable "what-if" analysis and diagnostics.
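The trend/seasonal split that feeds the dual-path encoder can be illustrated with a simple moving-average decomposition. The abstract does not say which decomposition CFORVAE uses, so the moving-average choice, the edge padding, and the function names below are assumptions for illustration only.

```python
def moving_average(series, window):
    """Centered moving average with edge padding; window must be odd."""
    if window % 2 == 0:
        raise ValueError("window must be odd")
    half = window // 2
    padded = [series[0]] * half + list(series) + [series[-1]] * half
    return [sum(padded[i:i + window]) / window for i in range(len(series))]

def decompose(series, window=3):
    """Split a series into a smooth trend and the remaining (seasonal) component."""
    trend = moving_average(series, window)
    seasonal = [x - t for x, t in zip(series, trend)]
    return trend, seasonal
```

Because the two components sum back to the original series, each can be edited separately (for example, shifting only the trend) before reconstruction, which is the kind of component-specific manipulation the abstract refers to.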

Intervention-Aware Time Series Modeling: Capturing and Evaluating Feature Dependencies

Metasurfaces are ultrathin, engineered materials composed of nanostructures that manipulate light in ways unattainable by natural materials. Recent advances have leveraged computational optimization, machine learning, and deep learning to automate their design. However, existing approaches exhibit two fundamental limitations: (1) they often restrict the model to generating only a subset of design parameters, and (2) they rely on heavily downsampled spectral targets, which compromises both the novelty and accuracy of the resulting structures. The core challenge lies in developing a generative model capable of exploring a large, unconstrained design space while precisely capturing the intricate physical relationships between material parameters and their high-resolution spectral responses. In this paper, we introduce MetaDiT, a novel framework for high-fidelity metasurface design that addresses these limitations. Our approach leverages a robust spectrum encoder pretrained with contrastive learning, providing strong conditional guidance to a Diffusion Transformer-based backbone. Experiments demonstrate that MetaDiT outperforms existing baselines in spectral accuracy, and we further validate our method through extensive ablation studies. Our code and model weights will be open-sourced to facilitate future research.

MetaDiT: Enabling Fine-grained Constraints in High-degree-of Freedom Metasurface Design

Recent advances in sequential recommendation have highlighted the potential of Large Language Models (LLMs) for enhancing item embeddings and improving user understanding. However, existing approaches face three major limitations: 1) insufficient understanding of the reasons behind users' purchase decisions, 2) poor compatibility between the high-dimensional embeddings produced directly by LLMs and traditional low-dimensional ID embeddings, and 3) reliance on additional fine-tuning and high inference overhead to adapt LLMs to the recommendation task. In this paper, we propose MoMoREC, a simple yet effective user-understanding-based recommendation strategy. This method combines the intrinsic comprehension capabilities of LLMs with residual semantic IDs to better understand users. Specifically, starting from common user purchasing behaviors and incorporating item characteristics, we employ a multi-agent framework in which LLMs analyze user shopping motivations and extract high-dimensional dense embeddings. These embeddings are then transformed into low-dimensional IDs using a residual semantic ID approach based on clustering and residual dimensionality reduction, and the resulting IDs can be fed into the recommendation model. MoMoREC integrates the understanding power of LLMs with the strengths of recommendation systems, preserving rich semantic language embeddings while reducing or eliminating the need for auxiliary trainable modules. As a result, it adapts to any sequential recommendation framework. Experiments on three benchmark datasets show that MoMoREC significantly improves traditional recommendation models, demonstrating its effectiveness and flexibility.
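The idea of turning a dense embedding into a short sequence of discrete IDs can be sketched with generic residual quantization: at each level the nearest codeword is chosen and subtracted, and the remainder is passed to the next level. This is an assumed illustration, not MoMoREC's actual procedure; the codebooks are taken as given (for example, cluster centers from k-means), and the function names are hypothetical.

```python
def nearest(codebook, v):
    """Index of the codeword closest to v in squared Euclidean distance."""
    def sqdist(c):
        return sum((a - b) ** 2 for a, b in zip(c, v))
    return min(range(len(codebook)), key=lambda i: sqdist(codebook[i]))

def residual_semantic_id(embedding, codebooks):
    """Quantize an embedding level by level, returning the ID tuple and the final residual."""
    residual = list(embedding)
    ids = []
    for codebook in codebooks:
        idx = nearest(codebook, residual)
        ids.append(idx)
        # Subtract the chosen codeword so the next level encodes what is left.
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return ids, residual
```

Each ID in the returned list indexes a small embedding table, so a downstream recommender can consume the LLM-derived semantics through ordinary low-dimensional ID embeddings.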

MoMoREC: A Multi-agent Motivation Generation Framework for Residual Semantic ID-Aware Recommendation

While Multimodal Large Language Models (MLLMs) demonstrate remarkable capabilities across diverse domains, their application to specialized anomaly detection (AD) remains constrained by domain adaptation challenges. Existing Group Relative Policy Optimization (GRPO) based approaches suffer from two critical limitations: inadequate training data utilization when models produce uniform responses, and insufficient supervision over reasoning processes that encourage immediate binary decisions without deliberative analysis.
We propose a comprehensive framework addressing these limitations through two synergistic innovations. First, we introduce a multi-stage deliberative reasoning process that guides models from region identification to focused examination, generating diverse response patterns essential for GRPO optimization while enabling structured supervision over analytical workflows. Second, we develop a fine-grained reward mechanism incorporating classification accuracy and localization supervision, transforming binary feedback into continuous signals that distinguish genuine analytical insight from spurious correctness.
Comprehensive evaluation across multiple industrial datasets demonstrates substantial performance improvements in adapting general vision-language models to specialized anomaly detection. Our method achieves superior accuracy with efficient adaptation of existing annotations, effectively bridging the gap between general-purpose MLLM capabilities and the fine-grained visual discrimination required for detecting subtle manufacturing defects and structural irregularities.
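The first limitation, wasted training data when all sampled responses receive the same reward, follows directly from how group-relative advantages are normalized. The sketch below uses the commonly cited GRPO normalization (reward minus group mean, divided by group standard deviation); the blended reward with its weights and the IoU-style localization score are hypothetical stand-ins for the paper's fine-grained reward.

```python
import statistics

def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one sampled group of responses."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]

def fine_grained_reward(correct, localization_score, w_cls=0.5, w_loc=0.5):
    """Hypothetical continuous reward: classification correctness plus localization quality in [0, 1]."""
    return w_cls * float(correct) + w_loc * localization_score

# All responses are correct, so a binary reward gives every response the same value
# and every advantage is zero: the group contributes no gradient signal.
binary = group_relative_advantages([1.0, 1.0, 1.0, 1.0])

# With a localization term, equally "correct" responses are separated.
graded = group_relative_advantages([
    fine_grained_reward(True, 0.9),
    fine_grained_reward(True, 0.2),
    fine_grained_reward(True, 0.6),
    fine_grained_reward(True, 0.5),
])
```

Here `binary` is all zeros, while `graded` assigns a positive advantage to the well-localized response and a negative one to the poorly localized response.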

AD-FM: Multimodal LLMs for Anomaly Detection via Multi-Stage Reasoning and Fine-Grained Reward Optimization

Molecular evolution is the process of simulating the natural evolution of molecules in chemical space to explore potential molecular structures and properties. The relationships between similar molecules are often described through transformations such as adding, deleting, and modifying atoms and chemical bonds, reflecting specific evolutionary paths. Existing molecular representation methods mainly focus on mining data, such as atomic-level structures and chemical bonds directly from the molecules, often overlooking their evolutionary history. Consequently, we aim to explore the possibility of enhancing molecular representations by simulating the evolutionary process. We extract and analyze the changes in the evolutionary pathway and explore combining it with existing molecular representations. Therefore, this paper proposes the molecular evolutionary network (MEvoN) for molecular representations. First, we construct the MEvoN using molecules with a small number of atoms and generate evolutionary paths utilizing similarity calculations. Then, by modeling the atomic-level changes, MEvoN reveals their impact on molecular properties. Experimental results show that the MEvoN-based molecular property prediction method significantly improves the performance of traditional end-to-end algorithms by approximately 33% on both the QM7 and QM9 datasets. The code and implementation details are provided in the appendix.

Can Molecular Evolution Mechanism Enhance Molecular Representation?

In multimodal sentiment analysis, modality missingness and quality degradation are common. Existing methods often rely on batch-level modality generation but neglect sample-level missingness, which severely limits their flexibility in real-world scenarios. To address this, Sample-specific Modality Diagnosis and Cross-modal Enhancement for Incomplete Multimodal Representations (SMCIR) is proposed. Specifically, the Dynamic Multi-feature Fusion Detector (DMFD) is presented, which detects missingness and its severity at the sample level using indicators such as information entropy, modality similarity, and mutual information. Unlike batch-based methods, the DMFD provides fine-grained detection and adaptive responses, improving sensitivity to modality disturbances. Meanwhile, the Context-aware Modality Completion Generator (CMCG) is developed to restore missing modalities through context-guided reconstruction using multiscale feature fusion and cross-modal attention. In CMCG, the text modality serves as a stable guide to improve context consistency. In this way, CMCG avoids redundancy and inconsistency, making the fused representation more consistent and more discriminative. Experiments on the CMU-MOSI and CMU-MOSEI datasets show that SMCIR outperforms existing full-modal and non-recovery-based methods, validating its efficacy in multimodal learning.

Sample-specific Modality Diagnosis and Cross-modal Enhancement for Incomplete Multimodal Representations

Multi-relational graph clustering aims to uncover complex node interactions by leveraging multiple relational views, yet existing methods often suffer from two key limitations: they assume equal importance across views and decouple representation learning from clustering, both of which hinder overall performance. To address these issues, we propose OMC-DVM, a novel end-to-end Online Multi-Relational Graph Clustering with Dominant View Mining framework.
OMC-DVM introduces two core innovations: (1) An unsupervised dominant view mining module that dynamically identifies the dominant view using Maximum Mean Discrepancy (MMD) and adaptively aligns other views to it, mitigating view imbalance.
(2) An online multi-relational clustering process that unifies representation learning and clustering into a single stage.
By performing clustering-level contrastive learning, OMC-DVM directly generates cluster assignments in an end-to-end manner.
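For readers unfamiliar with Maximum Mean Discrepancy, the following sketch computes the standard biased estimate of squared MMD between two sets of embeddings with an RBF kernel. How OMC-DVM turns such distances into a dominant-view decision is not stated in the abstract, so this only illustrates the distance itself; the kernel choice, `gamma`, and the function names are assumptions.

```python
import math

def rbf(x, y, gamma=1.0):
    """RBF kernel between two vectors."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def mmd2(X, Y, gamma=1.0):
    """Biased estimate of squared MMD between samples X and Y."""
    kxx = sum(rbf(a, b, gamma) for a in X for b in X) / (len(X) ** 2)
    kyy = sum(rbf(a, b, gamma) for a in Y for b in Y) / (len(Y) ** 2)
    kxy = sum(rbf(a, b, gamma) for a in X for b in Y) / (len(X) * len(Y))
    return kxx + kyy - 2.0 * kxy
```

A small `mmd2` between two views' node embeddings indicates similar distributions; one plausible use (an assumption here) is to compare each view against the others and treat the view with the smallest total discrepancy as the reference for alignment.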

Online Multi-Relational Clustering with Dominant View Mining

Goal-conditioned Hierarchical Reinforcement Learning (GCHRL) has demonstrated effectiveness in addressing complicated decision-making tasks by providing "temporal abstraction", which decomposes tasks into smaller and more manageable "subgoals". This enables agents to plan over a longer time scale. However, balancing exploration and exploitation remains a challenge, especially in long-horizon or sparse-reward scenarios. In this paper, we introduce Active exploration and hierarchical Self-Imitation (ASI), an effective scheme to enhance exploration and exploitation based on subgoal representation learning. The key idea of ASI is to use temporal adjacency information in the representation space. We construct and dynamically update an adjacency graph that captures the relationships between subgoals. Based on the adjacency information provided by the graph, we design two mechanisms: (1) active "frontier-reaching" exploration that expands the explored area faster by targeting boundary regions, and (2) hierarchical self-imitation learning that leverages historical experience to facilitate both frontier reaching and policy training. Experimental results show that our method accelerates exploration and outperforms existing baselines in challenging long-horizon continuous control tasks.

Enhancing Exploration and Exploitation in Hierarchical Reinforcement Learning with Subgoal Graph Learning

Multi-objective combinatorial optimization problems (MOCOP) frequently arise in practical applications that require the simultaneous optimization of conflicting objectives. Although traditional evolutionary algorithms can be effective, they typically depend on domain knowledge and repeated parameter tuning, limiting flexibility when applied to unseen MOCOP instances. Recently, the integration of Large Language Models (LLMs) into evolutionary computation has opened new avenues for automatic heuristic generation, using their advanced language understanding and code synthesis capabilities. Nevertheless, most existing approaches focus on single-objective tasks, often neglecting key considerations such as runtime efficiency and heuristic diversity in multi-objective settings. To bridge this gap, we introduce Multi-heuristics for MOCOP via Pareto-Grid-guided Evolution of LLMs (MPaGE), a novel enhancement of the Simple Evolutionary Multiobjective Optimization (SEMO) framework that leverages LLMs and the Pareto Front Grid (PFG) technique. By partitioning the objective space into grids and retaining top-performing candidates to guide heuristic generation, MPaGE uses LLMs to prioritize heuristics with semantically distinct logical structures during variation, thus promoting diversity and mitigating redundancy within the population. In extensive evaluations, MPaGE outperforms existing LLM-based frameworks and achieves results competitive with traditional multi-objective evolutionary algorithms (MOEAs), with significantly faster runtime.
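The grid idea can be illustrated with a small sketch: filter candidates to the non-dominated set, partition the objective space into equal-width cells, and keep one representative per cell. The abstract does not give the exact PFG rule, so minimization of all objectives, the equal-width binning, the "smallest objective sum" tie-break, and all names below are assumptions.

```python
def dominates(a, b):
    """True if a is no worse than b on every objective and strictly better on one (minimization)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(points):
    """Candidates not dominated by any other candidate."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

def grid_cell(point, lows, highs, bins):
    """Map an objective vector to the index of its grid cell."""
    cell = []
    for v, lo, hi in zip(point, lows, highs):
        width = (hi - lo) / bins or 1.0  # avoid zero width on a degenerate axis
        cell.append(min(int((v - lo) / width), bins - 1))
    return tuple(cell)

def grid_representatives(points, bins=2):
    """Keep one candidate per occupied cell (assumed rule: smallest objective sum)."""
    dim = len(points[0])
    lows = [min(p[i] for p in points) for i in range(dim)]
    highs = [max(p[i] for p in points) for i in range(dim)]
    best = {}
    for p in points:
        cell = grid_cell(p, lows, highs, bins)
        if cell not in best or sum(p) < sum(best[cell]):
            best[cell] = p
    return best
```

Keeping representatives from different cells spreads the retained heuristics across the trade-off surface, which is one way to encourage the diversity the abstract emphasizes.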

Pareto-Grid-Guided Large Language Models for Fast and High-Quality Heuristics Design in Multi-Objective Combinatorial Optimization

Joint multilingual instruction tuning is a widely adopted approach to improve the multilingual instruction-following ability and downstream performance of large language models (LLMs), but the resulting multilingual capability remains highly sensitive to the composition and selection of the training data.
Existing selection methods, often based on features like text quality, diversity, or task relevance, typically overlook the intrinsic linguistic structure of multilingual data.
In this paper, we propose LangGPS, a lightweight two-stage pre-selection framework guided by language separability—a signal that quantifies how well samples in different languages can be distinguished in the model’s representation space. LangGPS first filters training data based on separability scores and then refines the subset using existing selection methods.
Extensive experiments across six benchmarks and 22 languages demonstrate that applying LangGPS on top of existing selection methods improves their effectiveness and generalizability in multilingual training, especially for understanding tasks and low-resource languages.
Further analysis reveals that highly separable samples facilitate the formation of clearer language boundaries and support faster adaptation, while low-separability samples tend to function as bridges for cross-lingual alignment.
We also find that language separability can serve as an effective signal for multilingual curriculum learning, where interleaving samples with diverse separability levels yields stable and generalizable gains.
We hope our work offers a new perspective on data utility in multilingual contexts and supports the development of more linguistically informed LLMs.
