Code models have become integral to modern software development, yet they remain vulnerable to backdoor attacks through poisoned training data. Current code backdoor attacks struggle with a critical trade-off. Static triggers using fixed code patterns achieve high transferability across different settings, but are easily detected by defenses. Conversely, dynamic triggers that adapt to code context evade detection effectively but exhibit poor cross-dataset transferability. Moreover, existing dynamic approaches unrealistically assume attackers have access to victims' training data, limiting their practical applicability. To overcome these limitations, we introduce the Sharpness-aware Transferable Adversarial Backdoor (STAB), a novel attack that achieves both transferability and stealthiness without accessing victim data. Our key insight is that adversarial perturbations discovered in flat regions of the loss landscape transfer more effectively across datasets than those found in sharp minima. STAB exploits this by training a surrogate model with Sharpness-Aware Minimization (SAM) to guide model parameters toward these flat regions. We then employ a Gumbel-Softmax based optimization to transform the discrete search for trigger tokens into a differentiable process, generating context-aware adversarial triggers. Experiments on three datasets and two code models demonstrate the superiority of STAB. Compared to static triggers, STAB significantly improves stealthiness, maintaining a 73.2% average attack success rate after defense (ASR-D) versus near-zero for static approaches. In cross-dataset scenarios, STAB also outperforms the state-of-the-art dynamic attack, AFRAIDOOR, with a 12.4% higher ASR-D, while preserving model performance on clean inputs.
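To make the SAM idea concrete, here is a minimal sketch of a sharpness-aware update on a toy quadratic loss. The loss, learning rate `lr`, and neighborhood radius `rho` are illustrative placeholders, not the paper's actual surrogate-model training setup: SAM first ascends to the worst-case point within a small neighborhood of the current weights, then descends using the gradient taken there, which biases optimization toward flat minima.

```python
import numpy as np

# Illustrative sketch of a Sharpness-Aware Minimization (SAM) step.
# The toy quadratic loss and hyperparameters below are assumptions
# for demonstration, not the paper's surrogate-model configuration.

def loss(w):
    return float(np.sum(w ** 2))

def grad(w):
    return 2 * w

def sam_step(w, rho=0.05, lr=0.1):
    g = grad(w)
    # Ascent step: move to the approximate worst-case point in a
    # rho-ball around the current weights.
    eps = rho * g / (np.linalg.norm(g) + 1e-12)
    # Descent step: apply the gradient computed at the perturbed
    # weights, i.e. the sharpness-aware gradient.
    return w - lr * grad(w + eps)

w = np.array([1.0, -2.0])
for _ in range(100):
    w = sam_step(w)
```

After 100 such steps on this toy loss, `w` has been driven close to the minimum at the origin; in the paper's setting, the same two-step update would be applied to the surrogate code model's parameters during training.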
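The Gumbel-Softmax relaxation mentioned above can be sketched as follows. The three-token logits and the temperature `tau` are hypothetical; the point is that adding Gumbel noise to unnormalized scores and passing them through a temperature-scaled softmax yields a differentiable, near-one-hot distribution over candidate trigger tokens, so the discrete token choice can be optimized by gradient descent.

```python
import numpy as np

# Illustrative Gumbel-Softmax sample over a tiny candidate-token
# vocabulary; logits and temperature are assumed values, not the
# paper's actual trigger-generation parameters.

rng = np.random.default_rng(0)

def gumbel_softmax(logits, tau=0.5):
    # Gumbel(0, 1) noise via the inverse-CDF trick; the lower bound
    # avoids log(0).
    u = rng.uniform(low=1e-12, high=1.0, size=logits.shape)
    g = -np.log(-np.log(u))
    # Temperature-scaled softmax of the perturbed logits
    # (numerically stabilized by subtracting the max).
    y = (logits + g) / tau
    y = np.exp(y - y.max())
    return y / y.sum()

logits = np.array([2.0, 0.5, -1.0])  # scores for 3 candidate tokens
probs = gumbel_softmax(logits)
```

Lowering `tau` pushes `probs` closer to a one-hot vector (a hard token choice), while keeping the sampling step differentiable with respect to the logits.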