Fine-tuning large language models (LLMs) improves task performance but introduces critical safety vulnerabilities: even a small amount of harmful data can severely compromise safety alignment. We observe that perturbations orthogonal to the alignment direction, defined by the weight difference between an aligned (safe) model and its unaligned counterpart, rapidly degrade model safety, whereas updates along the alignment direction largely preserve it. This reveals a "narrow safety basin" in parameter space. To address this, we propose SECURE (Safety Enforcement Constraint Using Regularized Orthogonality), which maintains safety by explicitly constraining update directions during fine-tuning. By penalizing the components of updates orthogonal to the alignment direction, SECURE keeps the model within the narrow safety basin and thus preserves its inherent safety. Extensive experiments on multiple datasets and models show that SECURE reduces harmful behaviors by up to 7.60%, improves task performance by 3.44%, and consistently outperforms existing methods across multiple tasks. Code and datasets are available at: https://anonymous.4open.science/r/69F7-ED36/.
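The core idea lends itself to a short sketch. The toy example below illustrates, under our own assumptions, what a SECURE-style penalty could look like in PyTorch: precompute a unit alignment direction per parameter tensor from the aligned and unaligned checkpoints, then during fine-tuning penalize the component of each weight update that is orthogonal to that direction. The small linear model, the `lambda_reg` hyperparameter, and all names here are illustrative, not the authors' released implementation.

```python
import torch
import torch.nn as nn

# Toy stand-ins for the aligned (safe) and unaligned checkpoints.
# In practice these would be full LLM state dicts; everything here
# is a hypothetical sketch, not the paper's code.
torch.manual_seed(0)
model = nn.Linear(8, 8)  # model being fine-tuned, initialized at the aligned weights
aligned_sd = {k: v.detach().clone() for k, v in model.state_dict().items()}
unaligned_sd = {k: v + 0.1 * torch.randn_like(v) for k, v in aligned_sd.items()}

# Unit alignment directions: normalized weight difference between the
# aligned and unaligned models, one direction per parameter tensor.
eps = 1e-8
align_dir = {}
for k in aligned_sd:
    d = (aligned_sd[k] - unaligned_sd[k]).flatten()
    align_dir[k] = d / (d.norm() + eps)

def orthogonality_penalty(model):
    """Squared norm of the update components orthogonal to the alignment
    direction, summed over parameter tensors."""
    penalty = torch.zeros(())
    for k, p in model.named_parameters():
        delta = (p - aligned_sd[k]).flatten()         # update relative to aligned weights
        proj = (delta @ align_dir[k]) * align_dir[k]  # component along alignment direction
        penalty = penalty + (delta - proj).pow(2).sum()
    return penalty

# One fine-tuning step: task loss plus the orthogonality penalty.
# lambda_reg is a hypothetical hyperparameter; the paper's actual
# weighting and schedule may differ.
lambda_reg = 1.0
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)
x, y = torch.randn(4, 8), torch.randn(4, 8)
task_loss = nn.functional.mse_loss(model(x), y)
loss = task_loss + lambda_reg * orthogonality_penalty(model)
loss.backward()
optimizer.step()
```

The penalty leaves motion along the alignment direction unpunished while shrinking motion orthogonal to it, which is one plausible way to keep fine-tuning inside the narrow safety basin the abstract describes.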
