Singapore

While advancements in the reasoning abilities of LLMs have significantly enhanced their performance in solving mathematical problems, coding tasks, and general puzzles, their effectiveness in accurately adhering to instructions remains inconsistent, particularly with more complex directives. Our investigation identifies lazy reasoning during the thinking stage as the primary factor contributing to poor instruction adherence. To mitigate this issue, we propose a comprehensive framework designed to enable rigorous reasoning processes involving preview and self-checking, essential for satisfying strict instruction constraints. 
Specifically, we first generate instructions with complex constraints and apply a filtering process to obtain valid prompts, resulting in three distinct prompt datasets categorized as hard, easy, and pass. Then, we employ rejection sampling on the pass prompts to curate a small yet high-quality dataset, enabling a cold-start initialization of the model and facilitating its adaptation to effective reasoning patterns.
Subsequently, we employ an entropy-preserving supervised fine-tuning (Entropy-SFT) strategy coupled with token-wise entropy-adaptive reinforcement learning (TEA-RL) guided by rule-based dense rewards. This approach encourages the model to transform its reasoning mechanism, ultimately fostering generalizable reasoning abilities that encompass preview and self-checking. Extensive experiments conducted on instruction-following benchmarks demonstrate remarkable performance improvements across various model scales. Notably, our Light-IF-32B model surpasses both larger open-source models such as DeepSeek-R1 and closed-source models like Doubao-1.6.
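As a rough illustration of what a rule-based dense reward could look like, the sketch below scores a response by the fraction of verifiable constraints it satisfies. The abstract does not specify the reward design, so the checkers (`max_words`, `must_include`, `ends_with`) and the `dense_reward` function are hypothetical names chosen for this example only.

```python
from typing import Callable, List

# A constraint is any rule that can be checked mechanically on the response text.
Constraint = Callable[[str], bool]


def max_words(limit: int) -> Constraint:
    """Hypothetical checker: the response has at most `limit` words."""
    return lambda text: len(text.split()) <= limit


def must_include(keyword: str) -> Constraint:
    """Hypothetical checker: the response mentions `keyword` (case-insensitive)."""
    return lambda text: keyword.lower() in text.lower()


def ends_with(suffix: str) -> Constraint:
    """Hypothetical checker: the response ends with `suffix`."""
    return lambda text: text.rstrip().endswith(suffix)


def dense_reward(response: str, constraints: List[Constraint]) -> float:
    """Fraction of constraints satisfied (an assumed reward shape, not the paper's)."""
    if not constraints:
        return 0.0
    return sum(check(response) for check in constraints) / len(constraints)


constraints = [max_words(10), must_include("entropy"), ends_with("!")]
reward = dense_reward("Entropy matters here.", constraints)  # two of three rules pass
```

Compared with an all-or-nothing reward, a per-constraint score of this kind gives partial credit, which is one plausible reading of "dense" in this context.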

AAAI 2026

Light-IF: Endowing LLMs with Generalizable Reasoning via Preview and Self-Checking for Complex Instruction Following

nlp: conversational ai/dialog systems

nlp: (large) language models

nlp: applications

technical paper

We are pleased to announce the Fortieth AAAI Conference on Artificial Intelligence (AAAI-26), which will be held in Singapore EXPO from January 20 to January 27, 2026.

The purpose of the AAAI conference series is to promote research in Artificial Intelligence (AI) and foster scientific exchange between researchers, practitioners, scientists, students, and engineers across the entirety of AI and its affiliated disciplines. AAAI-26 will feature technical paper presentations, special tracks, invited speakers, workshops, tutorials, poster sessions, senior member presentations, competitions, exhibit programs, and a range of other activities to be announced.



We introduce LLA, an effective intellectual property (IP) protection scheme for generative AI models. LLA leverages the synergy between hardware and software to defend against various supply chain threats, including model theft, model corruption, and information leakage. On the software side, it embeds key bits into neurons that can trigger outliers to degrade performance and applies invariance transformations to obscure the key values. On the hardware side, it integrates a lightweight locking module into the AI accelerator while maintaining compatibility with various dataflow patterns and toolchains. An accelerator with a pre-stored secret key acts as a license to access the model services provided by the IP owner. The evaluation results show that LLA can withstand a broad range of oracle-guided key optimization attacks, while incurring a minimal computational overhead of less than 0.1% for 7,168 key bits.

LLA: Enhancing Security and Privacy for Generative Models with Logic-Locked Accelerators

Locating vertebral landmarks on anteroposterior (AP) X-ray images is challenging due to tissue overlap. Despite the great progress of heatmap-based methods, they often predict missing or false points, which are intolerable in downstream applications such as scoliosis assessment. In this paper, we instead modernize the classic point-regression scheme and propose a novel model, termed RouterNet, to locate the 68 vertebral landmarks completely and accurately. RouterNet starts from an initial root point and then gradually routes it onto more and more points with finer and finer semantics. RouterNet naturally couples this point routing process with its hierarchical, multi-scale feature learning. That is, lower-scale feature maps are used to regress points with coarser semantics, and the regressed points guide a more focused local feature extraction on the next higher-scale map to route onto their subsequent positions with finer semantics. With this divide-and-conquer strategy, RouterNet reduces the task difficulty and can localize robustly by routing from the whole spinal center to 17 vertebral centers, and further to their 68 corner points. Extensive experiments on both public and private datasets demonstrate superior performance over other state-of-the-art methods, decreasing NMSE by 73.8% for landmark localization and SMAPE by 14.8% for the downstream scoliosis assessment.

RouterNet: Hierarchical Point Routing Network for Robust Vertebral Landmark Localization on AP X-ray Images

Long-horizon event forecasting is a crucial task across various domains, including retail, finance, healthcare, and social networks. Traditional models for event sequences are often extended to horizon forecasting with an autoregressive (recursive) multi-step strategy, which has limited effectiveness because it typically converges to constant or repetitive outputs. To address this limitation, we introduce DEF, a novel approach for simultaneous forecasting of multiple future events over a horizon with high accuracy and diversity. Our method optimally aligns predictions with ground truth events during training by using a novel matching-based loss function. We establish a new state-of-the-art in long-horizon event prediction, achieving up to a 50% relative improvement over existing temporal point processes and event prediction models. Furthermore, we achieve state-of-the-art performance in next-event prediction tasks while demonstrating high computational efficiency during inference.
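To make the idea of a matching-based loss concrete, here is a minimal sketch that finds the cheapest one-to-one assignment between predicted and ground truth events and returns its total cost. The event format `(timestamp, type)`, the `pair_cost` function, and the brute-force search are assumptions made for this illustration; the abstract does not describe the actual cost or matching procedure used by DEF.

```python
from itertools import permutations
from typing import List, Tuple

# Simplified, assumed event format: (timestamp, event type).
Event = Tuple[float, int]


def pair_cost(pred: Event, true: Event, type_penalty: float = 1.0) -> float:
    """Assumed cost: absolute time error plus a fixed penalty for a wrong type."""
    time_error = abs(pred[0] - true[0])
    return time_error + (type_penalty if pred[1] != true[1] else 0.0)


def matching_loss(preds: List[Event], targets: List[Event]) -> float:
    """Minimum total cost over all one-to-one assignments (brute force)."""
    if len(preds) != len(targets):
        raise ValueError("this sketch assumes equally many predictions and targets")
    return min(
        sum(pair_cost(preds[i], targets[j]) for j, i in enumerate(perm))
        for perm in permutations(range(len(preds)))
    )


# Predictions arrive in a different order than the ground truth events.
preds = [(5.0, 1), (1.0, 0), (3.0, 2)]
targets = [(1.0, 0), (3.5, 2), (5.0, 1)]
loss = matching_loss(preds, targets)
```

Because the loss is computed after matching, the order in which events are predicted does not matter. The brute-force search is factorial in the horizon length, so a polynomial-time assignment solver such as the Hungarian algorithm would be the usual choice beyond a handful of events.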

Detecting the Future: All-at-Once Event Sequence Forecasting with Horizon Matching

Citation recommendation aims to provide researchers with the most relevant references for their manuscripts, helping them swiftly discover pertinent studies and bolster the reliability of their arguments. However, some individuals manipulate these recommendation systems by injecting false information, such as deliberately inflating the citation count of their own papers, to obtain favorable recommendations and ratings. This form of attack, commonly termed “shilling attack”, is not only hard to detect but can also have a far-reaching impact on scientific research. To address this problem, we theoretically reveal the impact of shilling attacks on citation recommendation and propose three feasible resistance strategies: historical collaborations, significant citations, and content constraints. Based on these insights, we introduce RSA-CR, a robust and hybrid citation recommendation algorithm resistant to shilling attacks. The algorithm constructs a two-layer academic graph and uses random and content generation strategies to initialize author and paper embeddings. Confidence-guided inductive aggregations based on collaboration and citation relationships are then performed on the author and paper sides, where the author aggregation results directly influence the paper aggregation strength. Finally, recommendations are made by measuring the distances between the fused paper embeddings. The entire learning process resembles a dumbbell and is hence termed “dumbbell inductive learning”. Experiments on four academic datasets demonstrate that our method outperforms baselines in both effectiveness and robustness. Code will be released upon acceptance.

RSA-CR: Resisting Shilling Attacks in Citation Recommendation via Dumbbell Inductive Learning

We present RENEW, a novel global path planning framework for Autonomous Surface Vehicles (ASVs) operating in dynamic environments with external disturbances (e.g., water currents). These disturbances significantly affect both the risk and energy cost of navigation, particularly in constrained coastal waterways, by dynamically reshaping the navigable area.
RENEW addresses this challenging scenario through a unified, risk- and energy-aware planning strategy that guarantees safety by explicitly identifying states at risk of entering non-navigable regions and enforcing adaptive safety constraints. Our planner incorporates a best-effort strategy under worst-case scenarios, inspired by contingency planning concepts from maritime domains, to ensure feasible control actions even under adverse conditions. RENEW employs a hierarchical architecture: a high-level planner explores topologically distinct paths via constrained triangulation, while a low-level planner selects an energy-efficient and kinematically feasible trajectory within a safe corridor. We validate our approach through extensive simulations using both custom realistic scenarios and real-world ocean current data. To our knowledge, this is the first global planning framework to jointly address the adaptive identification of non-navigable areas and topological diversity within a risk-aware paradigm, enabling robust navigation in maritime environments.

RENEW: Risk- and Energy-Aware Navigation in Dynamic Waterways

More and more organizations are relying on Machine Learning (ML) models to support internal decision-making processes. To better support such processes, it would be highly beneficial to contextualize the inductively acquired knowledge encoded in these models and enable formal reasoning over it. Despite significant progress in Neural-Symbolic AI, this specific challenge remains largely under-explored. We propose a framework that integrates the knowledge induced by ML classifiers with the knowledge specified by logic-based formalisms. The framework is based on the novel notion of Hybrid Knowledge Base (HKB), consisting of two components: an ontology and a set of ML binary classifiers. As usual, the ontology provides an intensional representation of the modeled domain through logic-based axioms, while the binary classifiers implicitly encode the extensional knowledge. Specifically, an HKB associates with each concept and role mentioned in the ontology a classifier based on a set of features deemed to be relevant for the application domain, thereby virtually populating the concepts and roles with the instances and pairs of instances from the feature space. Besides defining the new framework, as a more technical contribution we show how to reason in this framework by studying query answering over HKBs. In particular, we investigate the computational complexity of query answering in a rich language over HKBs in which the ontology is specified in (the Description Logic counterpart of) RDFS, while the binary classifiers are represented by Multi-Layer Perceptrons.
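The following toy sketch shows one way to picture a Hybrid Knowledge Base: binary classifiers decide which points of the feature space populate each concept, and RDFS-style subclass axioms propagate that membership upward. It is a heavily simplified illustration under assumed semantics; the concept names, the threshold classifiers standing in for MLPs, and the helper functions are all hypothetical, and roles and the paper's query language are not covered.

```python
from typing import Callable, Dict, List, Set, Tuple

Features = Tuple[float, ...]
Classifier = Callable[[Features], bool]

# Hypothetical binary classifiers standing in for trained Multi-Layer Perceptrons.
classifiers: Dict[str, Classifier] = {
    "Manager": lambda x: x[0] > 0.8,
    "Employee": lambda x: x[1] > 0.5,
    "Person": lambda x: False,  # never fires on its own in this toy example
}

# RDFS-style subclass axioms as (subconcept, superconcept) pairs.
subclass_of: List[Tuple[str, str]] = [
    ("Manager", "Employee"),
    ("Employee", "Person"),
]


def subconcepts(concept: str) -> Set[str]:
    """All concepts subsumed by `concept` under the axioms, including itself."""
    found = {concept}
    frontier = [concept]
    while frontier:
        current = frontier.pop()
        for sub, sup in subclass_of:
            if sup == current and sub not in found:
                found.add(sub)
                frontier.append(sub)
    return found


def is_instance(x: Features, concept: str) -> bool:
    """x answers the instance query if any subsumed concept's classifier accepts it."""
    return any(classifiers[c](x) for c in subconcepts(concept))


alice = (0.9, 0.1)  # accepted only by the Manager classifier
alice_is_person = is_instance(alice, "Person")  # True via Manager -> Employee -> Person
```

In this picture the ontology contributes the intensional part (the axioms) while the classifiers contribute the extensional part, which matches the division of labor described in the abstract.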

Foundations of Formal Reasoning over Knowledge Bases Combining Symbolic and Sub-Symbolic Knowledge

Over recent decades, the tourism industry has demonstrated progressive expansion, driven by advancements in aviation technologies and shifting consumer interests. In this context, online flight itinerary ranking has become a pivotal business for Online Travel Platforms (OTPs), which aim to rank flight itineraries by synthesizing real-time flight data provided by airlines with users' individual travel preferences.
Currently, most OTPs rely on rule-based methodologies or rudimentary user preference-driven models to address this task. However, these methods are inherently limited by their insufficient consideration of delayed booking behaviors and their neglect of dynamic contextual attributes associated with flight itineraries, thereby undermining their ability to effectively handle the intricacies of flight ranking.
To address these shortcomings, this paper introduces the **D**elayed **C**onversion Modeling based Personalized Flight Itinerary **R**anking **Net**work (**DCRNet**), designed to improve ranking accuracy by integrating delayed booking patterns and contextual dependencies into the modeling framework.
Specifically, DCRNet explores the dynamic associations between users' current contextual information and their historical travel records, and models users' delayed booking behaviors via a masked attention mechanism. Moreover, an enhanced multi-task learning framework is employed to effectively integrate traditional behavioral modeling with delay-aware modeling, thereby improving the overall prediction accuracy and enhancing the system's personalized recommendation capabilities.
Extensive offline experiments conducted on real-world datasets from Amadeus and Fliggy demonstrate the superior performance of DCRNet. Furthermore, its successful deployment on Fliggy's online itinerary search system has yielded significant improvements, underscoring its practical effectiveness and scalability.

DCRNet: Delayed Conversion Modeling Based Personalized Flight Itinerary Ranking Network

In-context learning (ICL) with large language models (LLMs) has emerged as a promising paradigm for named entity recognition (NER) in low-resource scenarios. However, existing ICL-based NER methods suffer from three key limitations: (1) reliance on dynamic retrieval of annotated examples, which is problematic when annotated data is scarce; (2) limited generalization to unseen domains due to the LLM's insufficient internal domain knowledge; and (3) failure to incorporate external knowledge or resolve entity ambiguities. To address these challenges, we propose **KDR-Agent**, a novel multi-agent framework for multi-domain low-resource in-context NER that integrates **Knowledge retrieval**, **Disambiguation**, and **Reflective analysis**. KDR-Agent leverages natural-language type definitions and a static set of entity-level contrastive demonstrations to reduce dependency on large annotated corpora. A central planner coordinates specialized agents to (i) retrieve factual knowledge from Wikipedia for domain-specific mentions, (ii) resolve ambiguous entities via contextualized reasoning, and (iii) reflect on and correct model predictions through structured self-assessment. Experiments across ten datasets from five domains demonstrate that KDR-Agent significantly outperforms existing zero-shot and few-shot ICL baselines across multiple LLM backbones.

A Multi-Agent LLM Framework for Multi-Domain Low-Resource In-Context NER via Knowledge Retrieval, Disambiguation and Reflective Analysis

Recent advancements in Large Language Models (LLMs) have led to their widespread adoption in daily applications. Despite their impressive capabilities, they remain vulnerable to adversarial attacks, as even minor meaning-preserving changes such as synonym substitutions can lead to incorrect predictions. As a result, certifying the robustness of LLMs against such adversarial prompts is of vital importance. Existing approaches have focused on word deletion or simple denoising strategies to achieve robustness certification. However, these methods face two critical limitations: (1) they yield loose robustness bounds due to the lack of semantic validation for perturbed outputs, and (2) they suffer from high computational costs due to repeated sampling. To address these limitations, we propose CluCERT, a novel framework for certifying LLM robustness via clustering-guided denoising smoothing. Specifically, to achieve tighter certified bounds, we introduce a semantic clustering filter that reduces noisy samples and retains meaningful perturbations, supported by theoretical analysis. Furthermore, we enhance computational efficiency through two mechanisms: a refine module that extracts core semantics, and a fast synonym substitution strategy that accelerates the denoising process. Finally, we conduct extensive experiments on various downstream tasks and jailbreak defense scenarios. Experimental results demonstrate that our method outperforms existing certified approaches in both robustness bounds and computational efficiency.
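For readers unfamiliar with smoothing-based certification, the sketch below shows the basic sampling loop such methods build on: perturb the prompt with random synonym substitutions, classify every perturbed copy, and take a majority vote. It is not CluCERT itself, since it omits the semantic clustering filter, the refine module, and the certified bound; the synonym table, `toy_classifier`, and all function names are hypothetical.

```python
import random
from collections import Counter
from typing import Callable, Dict, List, Tuple

# Hypothetical synonym table, for illustration only.
SYNONYMS: Dict[str, List[str]] = {
    "good": ["great", "fine"],
    "movie": ["film"],
}


def perturb(prompt: str, rng: random.Random, rate: float = 0.5) -> str:
    """Replace each word that has synonyms with probability `rate`."""
    words = []
    for word in prompt.split():
        if word in SYNONYMS and rng.random() < rate:
            word = rng.choice(SYNONYMS[word])
        words.append(word)
    return " ".join(words)


def smoothed_predict(
    prompt: str,
    classify: Callable[[str], str],
    n_samples: int = 50,
    seed: int = 0,
) -> Tuple[str, float]:
    """Majority vote over perturbed prompts, plus the share of votes it received."""
    rng = random.Random(seed)
    votes = Counter(classify(perturb(prompt, rng)) for _ in range(n_samples))
    label, count = votes.most_common(1)[0]
    return label, count / n_samples


def toy_classifier(text: str) -> str:
    """Stand-in for an LLM: positive if any known positive word appears."""
    positive = {"good", "great", "fine"}
    return "positive" if positive & set(text.split()) else "negative"


label, agreement = smoothed_predict("a good movie", toy_classifier)
```

The vote share is the quantity a certification procedure would turn into a robustness bound; the repeated sampling in this loop is also where the computational cost mentioned in the abstract comes from.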

CluCERT: Certifying LLM Robustness via Clustering-Guided Denoising Smoothing

Understanding human actions in videos requires robust integration of multimodal cues beyond raw pixels. This work introduces a deep self-supervised action recognition framework that jointly predicts action concepts and auxiliary features from RGB video, then hallucinates missing modalities at test time to improve recognition without added runtime cost. Two new domain-specific descriptors, Object Detection Features (ODF) and Saliency Detection Features (SDF), are proposed to capture spatial context and motion saliency, integrating them with other modalities such as optical flow, skeleton, audio, and improved dense trajectories. The framework incorporates aleatoric uncertainty modeling to handle noisy or unreliable features, along with a robust loss for stable multimodal fusion. Compatible with popular architectures including I3D, AssembleNet, Video Transformer Network, VideoMAE V2, and InternVideo2, the approach achieves state-of-the-art results on Kinetics-400, Kinetics-600, and Something-Something V2.
