Singapore

Large language models (LLMs) have seen remarkable growth in recent years. To use convenient LLM cloud services, users must inevitably upload their prompts. Additionally, tasks such as translation, reading comprehension, and summarization inherently require associated files or context, whether or not these contain private user information. Despite the rapid progress in LLM capabilities, research on preserving user privacy during inference has been relatively scarce. To this end, this paper conducts exploratory research in this domain. First, we show that (1) the embedding space of tokens is highly sparse, and (2) LLMs primarily function in the orthogonal subspace of the embedding space; together, these two factors make privacy extremely vulnerable. Then, we analyze the structural characteristics of LLMs and design a distributed privacy-preserving inference paradigm that can effectively resist privacy attacks. Finally, we perform a thorough evaluation of the defended models on mainstream tasks and find that low-bit quantization techniques can be effectively combined with our inference paradigm, achieving a balance between privacy, utility, and runtime memory efficiency.
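To make the sparsity claim concrete, here is a toy sketch of why widely separated token embeddings are easy to invert: a nearest-neighbor lookup recovers each token even from a perturbed vector. The vocabulary, dimension, noise level, and the `nearest_token` helper are all illustrative assumptions; this is not the attack or the defense evaluated in the paper.

```python
import math
import random


def nearest_token(vector, embedding_table):
    """Return the token whose embedding is closest (Euclidean) to `vector`."""
    return min(embedding_table, key=lambda tok: math.dist(vector, embedding_table[tok]))


random.seed(0)
dim = 16
vocab = ["my", "password", "is", "hunter2", "hello", "world"]

# Hypothetical toy embedding table: random points in 16 dimensions,
# which end up far apart relative to the noise added below.
embedding_table = {tok: [random.gauss(0, 1) for _ in range(dim)] for tok in vocab}

prompt = ["my", "password", "is", "hunter2"]

# What an untrusted party might observe: embeddings plus a small perturbation.
observed = [[x + random.gauss(0, 0.05) for x in embedding_table[tok]] for tok in prompt]

# Nearest-neighbor lookup maps each observed vector back to a token.
recovered = [nearest_token(v, embedding_table) for v in observed]
print(recovered == prompt)
```

Because the gaps between embeddings dwarf the perturbation, small noise alone does not hide the prompt in this toy setting, which is the intuition behind needing a stronger defense.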

AAAI 2026

Reconstruction Attack-Resistant Inference Paradigm for LLM Cloud Services

nlp: question answering

peai: privacy & security

nlp: (large) language models

poster

We are pleased to announce the Fortieth AAAI Conference on Artificial Intelligence (AAAI-26), which will be held at Singapore EXPO from January 20 to January 27, 2026.

The purpose of the AAAI conference series is to promote research in Artificial Intelligence (AI) and foster scientific exchange between researchers, practitioners, scientists, students, and engineers across the entirety of AI and its affiliated disciplines. AAAI-26 will feature technical paper presentations, special tracks, invited speakers, workshops, tutorials, poster sessions, senior member presentations, competitions, exhibit programs, and a range of other activities to be announced.




Adapting large pre-trained language models to downstream tasks often entails fine-tuning millions of parameters or deploying costly dense weight updates, which hinders their use in resource-constrained environments. Low-rank Adaptation (LoRA) reduces trainable parameters by factorizing weight updates, yet the underlying dense weights still impose high storage and computation costs. Magnitude-based pruning can yield sparse models but typically degrades LoRA’s performance when applied naively. In this paper, we introduce SALR (Sparsity-Aware Low-Rank Representation), a novel fine-tuning paradigm that unifies low-rank adaptation with sparse pruning under a rigorous mean-squared-error framework. We prove that statically pruning only the frozen base weights minimizes the pruning error bound, and we recover the discarded residual information via a truncated-SVD low-rank adapter, which provably reduces per-entry MSE by a factor of $(1 - r/\min(d,k))$. To maximize hardware efficiency, we fuse multiple low-rank adapters into a single concatenated GEMM, and we adopt a bitmap-based encoding with a two-stage pipelined decoding+GEMM design to achieve true model compression and speedup. Empirically, SALR attains 50% sparsity on various LLMs while matching the performance of LoRA on GSM8K and MMLU, reduces model size by $2\times$, and delivers up to a $1.7\times$ inference speedup.
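As a rough illustration of the two storage-side ideas named above, magnitude-based pruning and a bitmap encoding of the surviving weights, the following sketch prunes a flat weight list to 50% sparsity and round-trips it through a bitmap plus a packed value list. The function names and the flat-list layout are assumptions made for this example; the abstract does not specify SALR's actual format, and the low-rank adapter and pipelined decoding+GEMM are not reproduced here.

```python
def magnitude_prune(weights, sparsity):
    """Zero the smallest-magnitude entries so that a `sparsity` fraction is zero."""
    n_prune = int(len(weights) * sparsity)
    # Indices ordered by absolute value, smallest first.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned


def bitmap_encode(weights):
    """Store one bit per position (1 = nonzero) plus the nonzero values in order."""
    bitmap = 0
    values = []
    for i, w in enumerate(weights):
        if w != 0.0:
            bitmap |= 1 << i
            values.append(w)
    return bitmap, values, len(weights)


def bitmap_decode(bitmap, values, length):
    """Rebuild the dense list by walking the bitmap and consuming values in order."""
    it = iter(values)
    return [next(it) if (bitmap >> i) & 1 else 0.0 for i in range(length)]


weights = [0.9, -0.1, 0.05, -1.2, 0.3, 0.02, -0.7, 0.15]
pruned = magnitude_prune(weights, sparsity=0.5)
bitmap, values, length = bitmap_encode(pruned)
restored = bitmap_decode(bitmap, values, length)

print(pruned)
print(bin(bitmap), values)
print(restored == pruned)
```

At 50% sparsity this layout stores half the values plus one bit per position, which is where the compression in such schemes comes from.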

SALR: Sparsity-Aware Low-Rank Representation for Efficient Fine-Tuning of Large Language Models

Confidence alone is often misleading in hyperspectral image classification, as models tend to mistake high predictive scores for correctness while lacking awareness of uncertainty. This leads to confirmation bias, especially under sparse annotations or class imbalance, where models overfit confident errors and fail to generalize. We propose CABIN (Cognitive-Aware Behavior-Informed learNing), a semi-supervised framework that addresses this limitation through a closed-loop learning process of perception, action, and correction. CABIN first develops perceptual awareness by estimating epistemic uncertainty, identifying ambiguous regions where errors are likely to occur. It then acts by adopting an Uncertainty-Guided Dual Sampling Strategy, selecting uncertain samples for exploration while anchoring confident ones as stable pseudo-labels to reduce bias. To correct noisy supervision, CABIN introduces a Fine-Grained Dynamic Assignment Strategy that categorizes pseudo-labeled data into reliable, ambiguous, and noisy subsets, applying tailored losses to enhance generalization. Experimental results show that a wide range of state-of-the-art methods benefit from the integration of CABIN, with improved labeling efficiency and performance.

Perceive, Act and Correct: Confidence Is Not Enough for Hyperspectral Classification

Backdoor attacks embed malicious behaviors into Large Language Models (LLMs), enabling adversaries to trigger harmful outputs or bypass safety controls. However, the persistence of implanted backdoors under user-driven post-deployment continual fine-tuning has rarely been examined. Most prior works evaluate the effectiveness and generalization of implanted backdoors only at release time, and empirical evidence shows that the persistence of naively injected backdoors degrades after updates. In this work, we study whether and how implanted backdoors persist through multi-stage post-deployment fine-tuning. We propose P-Trojan, a trigger-based attack algorithm that explicitly optimizes for backdoor persistence across repeated updates. By aligning poisoned gradients with those of clean tasks on token embeddings, the implanted backdoor mapping is less likely to be suppressed or forgotten during subsequent updates. Theoretical analysis shows the feasibility of such persistent backdoor attacks after continual fine-tuning. Experiments on the Qwen2.5 and LLaMA3 families of LLMs, as well as diverse task sequences, demonstrate that P-Trojan achieves over 99% persistence while preserving clean-task accuracy. Our findings highlight the need for persistence-aware evaluation and stronger defenses in realistic model adaptation pipelines.

Persistent Backdoor Attacks Under Continual Fine-Tuning of LLMs

Zero-shot object navigation tasks agents with locating target objects in unseen environments, a core capability of embodied intelligence. While recent vision-language navigation methods leverage Large Language Models (LLMs) for multimodal reasoning, they suffer from two key limitations: (1) semantic misalignment between language-grounded maps and real-world layouts, and (2) inefficiency due to LLMs’ lack of specialization for navigation-specific tasks. To address these challenges, we propose Chain-of-Search (CoS), a novel parameter-efficient framework that enables human-like decision-making via iterative semantic reasoning. First, CoS replaces traditional global maps with an optimal-benefit multi-map construction that continuously balances expected gain and cost throughout the navigation process. Second, we introduce a Parameter-Efficient Intent Aligner (PEIA), trained via a prompt-guided paradigm to align directional decisions with navigation intent. PEIA injects semantic cues into benefit-aware maps, enabling more rational and goal-consistent exploration. Finally, a Reflection-Guided Destination Verifier (RDV) confirms whether the target is reached via language-driven reasoning and corrects potential errors through self-reflection. CoS achieves state-of-the-art performance on HM3D (+2.8% SR) and MP3D (+1.2% SR) without relying on LLMs, demonstrating the effectiveness of lightweight, reasoning-centered navigation. All data and code will be publicly released.

Chain-of-Search: Parameter-Efficient Reasoning for Zero-Shot Object Navigation

In e-commerce logistics, accurate geospatial clustering is essential for optimizing resource allocation, manpower planning, and delivery network design. However, existing density-based clustering approaches, particularly their reliance on heuristic parameter tuning, have been underexplored in datasets with significant density variations, limiting robustness and scalability. This study presents an unsupervised framework that extends DBSCAN by leveraging Gaussian Mixture Models (GMM). First, we propose a method that systematically identifies suitable clustering scales through statistical modeling. Second, the approach iteratively applies DBSCAN to extract clusters from dense to sparse regions, overcoming single-parameter limitations. Finally, we validate the method through large-scale offline experiments using data from over 200 last-mile dispatch centers (LMDC). The results demonstrate the framework’s effectiveness in identifying heterogeneous geographic demand patterns and supporting workforce planning and operational benchmarking. This framework provides a scalable solution to a critical challenge in e-commerce logistics, offering a valuable reference for strategic and operational decision-making.
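The dense-to-sparse extraction step can be sketched as follows: run DBSCAN with the smallest radius first, set aside the clusters it finds, then rerun on the leftover points with a larger radius. This is a minimal stdlib sketch under stated assumptions: `dbscan` and `multi_scale_dbscan` are hypothetical helpers, and `eps_schedule` is hand-picked here, whereas the abstract describes deriving the clustering scales from a Gaussian Mixture Model, which is not reproduced.

```python
import math


def dbscan(points, eps, min_pts):
    """Minimal DBSCAN. Returns clusters as lists of point indices; noise is omitted."""
    def neighbors(i):
        return [j for j in range(len(points)) if math.dist(points[i], points[j]) <= eps]

    labels = {}    # point index -> cluster id, or -1 for noise
    clusters = []
    for i in range(len(points)):
        if i in labels:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1              # provisional noise, may become a border point
            continue
        cid = len(clusters)
        clusters.append([i])
        labels[i] = cid
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels.get(j) == -1:     # former noise becomes a border point
                labels[j] = cid
                clusters[cid].append(j)
                continue
            if j in labels:             # already assigned to some cluster
                continue
            labels[j] = cid
            clusters[cid].append(j)
            nbrs = neighbors(j)
            if len(nbrs) >= min_pts:    # only core points expand the cluster
                queue.extend(nbrs)
    return clusters


def multi_scale_dbscan(points, eps_schedule, min_pts):
    """Apply DBSCAN from the densest scale (smallest eps) to the sparsest."""
    remaining = list(range(len(points)))
    found = []
    for eps in sorted(eps_schedule):
        subset = [points[i] for i in remaining]
        clustered = set()
        for cluster in dbscan(subset, eps, min_pts):
            found.append((eps, sorted(remaining[k] for k in cluster)))
            clustered.update(cluster)
        # Points not yet clustered are retried at the next, larger scale.
        remaining = [idx for k, idx in enumerate(remaining) if k not in clustered]
    return found, remaining


points = [(0, 0), (0.1, 0), (0, 0.1), (0.1, 0.1),      # dense group
          (10, 10), (11, 10), (10, 11), (11, 11),      # sparse group
          (50, 50)]                                    # isolated point
found, noise = multi_scale_dbscan(points, eps_schedule=[0.2, 1.5], min_pts=3)
print(found)
print(noise)
```

A single `eps` of 0.2 would miss the sparse group, while a single `eps` of 1.5 would still work on this toy data but tends to merge nearby dense groups on real data, which is the single-parameter limitation the abstract refers to.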

A Unified Geospatial Clustering Framework to Identify Varying Density Clusters in E-Commerce Logistics

Automated interpretation and reporting of chest X-rays (CXRs) hold significant promise in reducing diagnostic errors and supporting radiologists under heavy clinical workloads. However, existing methods typically rely on global visual features and token-level supervision, limiting their sensitivity to subtle abnormalities and reducing their clinical reliability. 
To address these challenges, we present Reflective X-ray Network (RefleXNet), which systematically integrates multi-scale visual feature fusion and anatomical relational reasoning with a targeted self-reflective learning strategy. 
RefleXNet first constructs multi-scale visual representations and captures anatomical context through graph-based relational modeling. 
Building upon these representations, we introduce a targeted self-reflection strategy that uses clinically guided feedback from generated reports to selectively refine abnormality predictions and their associated region-level visual features. 
Extensive experiments on MIMIC-CXR demonstrate that RefleXNet consistently outperforms state-of-the-art baselines across clinical factual correctness metrics. Notably, our compact 3B-parameter model surpasses several recent models with over twice the parameter count. Additionally, RefleXNet exhibits strong generalization performance in zero-shot evaluations on IU-Xray compared with leading multimodal language models, highlighting its robustness and clinical effectiveness.

RefleXNet: Targeted Self-Reflection for Accurate Chest X-ray Reporting

Despite recent progress in adapting State Space Models such as Mamba to vision tasks, their intrinsic 1D scanning mechanism imposes limitations when applied to inherently 2D-structured data like images. Existing adaptations, including VMamba and 2DMamba, either suffer from inconsistency between scanning order and spatial locality or restrict inter-patch communication to singular paths, hindering effective information propagation. In this paper, we propose 2D-CrossScan, a novel 2D-compatible scan framework that enables spatially consistent, multi-path hidden state propagation by integrating modified state equations over two-dimensional neighborhoods. Furthermore, we mitigate redundant information accumulation due to overlapping paths via cross-directional subtraction. To fully align with the 2D spatial structure, we introduce a multi-directional scanning strategy that starts simultaneously from all four corners of the image, enabling diverse propagation paths and better feature integration. Our approach maintains efficiency, requiring only minimal architectural changes to existing Mamba variants. Experimental results demonstrate substantial improvements in multiple visual tasks, including object detection and semantic segmentation on PANDA and COCO datasets. Compared to baseline SSM-based methods, 2D-CrossScan consistently yields better spatial representations, as confirmed by extensive effective receptive field visualizations and attention analyses. These results highlight the importance of geometry-aware state propagation and validate 2D-CrossScan as a simple yet powerful extension to SSMs for vision.
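One way to see why overlapping propagation paths cause redundant accumulation, and how a subtraction term can cancel it, is a scalar toy recurrence on a grid. In the sketch below each cell receives the decayed state from its upper and left neighbors; since both of those already contain the upper-left diagonal state, it is subtracted once (inclusion-exclusion). This is an illustrative analogy under strong assumptions (scalar state, one fixed decay `a`, a single top-left scan direction); it is not the paper's state equations or its four-corner scanning strategy, and `scan_2d` is a hypothetical name.

```python
def scan_2d(x, a, subtract_overlap=True):
    """Propagate a scalar state over a 2D grid from the top-left corner."""
    rows, cols = len(x), len(x[0])
    h = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            up = h[i - 1][j] if i > 0 else 0.0
            left = h[i][j - 1] if j > 0 else 0.0
            diag = h[i - 1][j - 1] if i > 0 and j > 0 else 0.0
            h[i][j] = x[i][j] + a * up + a * left
            if subtract_overlap:
                # Both `up` and `left` already include the diagonal state,
                # so remove the doubly counted contribution once.
                h[i][j] -= a * a * diag
    return h


def direct_sum(x, a, i, j):
    """Each earlier cell contributes exactly once, decayed by Manhattan distance."""
    return sum(a ** ((i - p) + (j - q)) * x[p][q]
               for p in range(i + 1) for q in range(j + 1))


a = 0.5
x = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
h = scan_2d(x, a)
matches = all(abs(h[i][j] - direct_sum(x, a, i, j)) < 1e-12
              for i in range(3) for j in range(3))
print(matches)

# A single impulse at the corner shows the overcounting without subtraction.
impulse = [[1.0, 0.0], [0.0, 0.0]]
print(scan_2d(impulse, a)[1][1], scan_2d(impulse, a, subtract_overlap=False)[1][1])
```

With the subtraction, the impulse reaches the opposite cell with weight `a**2`; without it, the two paths each deliver `a**2` and the contribution is doubled.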

2D-CrossScan Mamba: Enhancing State Space Models with Spatially Consistent Multi-Path 2D Information Propagation

The US Centers for Disease Control and Prevention (CDC), in 2019, designated Methicillin-resistant Staphylococcus aureus (MRSA) as a serious antimicrobial resistance threat. The risk of acquiring MRSA and suffering life-threatening consequences from it remains especially high for hospitalized patients due to a unique combination of factors, including co-morbid conditions, immunosuppression, antibiotic use, and risk of contact with contaminated hospital workers and equipment. In this paper, we present a novel generative probabilistic model, GenHAI, for modeling sequences of MRSA test result outcomes for patients during a single hospitalization. This model can be used to answer many important questions from the perspective of hospital administrators seeking to mitigate the risk of MRSA infections. Our model is based on the probabilistic programming paradigm, and can be used to approximately answer a variety of predictive, causal, and counterfactual questions. We demonstrate the efficacy of our model by comparing it against discriminative and generative machine learning models using two real-world datasets.

Prediction of Hospital Associated Infections During Continuous Hospital Stays

Previous methods evaluate reward models by testing them on a fixed pairwise ranking test set, but they typically do not provide performance information on each preference dimension. In this work, we address the evaluation challenge of reward models by probing preference representations. To confirm the effectiveness of this evaluation method, we construct a Multi-dimensional Reward Model Benchmark (MRMBench), a collection of six probing tasks for different preference dimensions. We design it to favor and encourage reward models that better capture preferences across different dimensions. Furthermore, we introduce an analysis method, inference-time probing, which identifies the dimensions used during the reward prediction and enhances its interpretability. Through extensive experiments, we find that MRMBench strongly correlates with LLM alignment performance, supporting it as a reliable reference for developing advanced reward models. By analyzing the evaluation results on MRMBench, we reveal that reward models struggle to simultaneously capture preferences across multiple dimensions, highlighting the potential of multi-objective optimization in reward modeling. Furthermore, our results demonstrate that the proposed inference-time probing method provides a reliable metric for assessing the confidence of reward predictions, leading to improved alignment of large language models.

Probing Preference Representations: A Multi-Dimensional Evaluation and Analysis Method for Reward Models

Multi-modal knowledge graph completion (MMKGC) aims to infer missing entities of triples by leveraging heterogeneous information in knowledge graphs (KGs). However, existing approaches often struggle with inconsistent modality alignment, limited reasoning depth, and insufficient negative sample quality. In this work, we propose HFR-MKGC, a novel framework that integrates hierarchical modal fusion and Multimodal Large Language Model (MLLM) reasoning for robust and expressive MMKGC. Specifically, we introduce a relation-guided hierarchical modal fusion module, which conducts fine-grained intra-visual fusion and relation-guided cross-modal integration to yield rich entity representations. HFR-MKGC employs a fine-tuned MLLM to perform instruction-based triple reasoning, producing candidate entities for completion. Then, it constructs hard negative samples through textual perturbation by the MLLM and visual feature augmentation with rotation and noise. HFR-MKGC optimizes the model via adversarial training. Extensive experiments on three MMKGC benchmarks demonstrate that our method outperforms state-of-the-art methods, validating its effectiveness in MMKGC.
