AAAI 2026

January 23, 2026

Singapore, Singapore


Reward functions, learned or manually specified, are rarely perfect. Instead of accurately expressing human goals, these reward functions are often distorted by human beliefs about how best to achieve those goals. Specifically, these reward functions often express a combination of the human's terminal goals — those which are ends in themselves — and the human's instrumental goals — those which are means to an end. We formulate a simple example in which even slight conflation of instrumental and terminal goals results in severe misalignment: optimizing the misspecified reward function $\hat{r}$ results in poor performance when measured by the true reward function $r$. This example distills the essential properties of environments that make reinforcement learning highly sensitive to conflation of instrumental and terminal goals. We discuss how this issue can arise with a common approach to reward learning and how it can manifest in real environments.
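The abstract's claim can be illustrated with a toy example (not taken from the paper; the environment, state names, and numbers below are hypothetical). An agent must pick up a key (instrumental goal) and then open a door (terminal goal). The misspecified reward adds a small bonus of 0.05 for the instrumental act of picking up the key; because that bonus can be farmed by repeatedly picking up and dropping the key, the policy optimal under the conflated reward never opens the door at all:

```python
import numpy as np

# Tiny 3-state MDP: 0 = start, 1 = holding key (instrumental subgoal),
# 2 = door opened (terminal goal, absorbing). Deterministic transitions.
n_states, n_actions, gamma = 3, 2, 0.99
P = np.zeros((n_states, n_actions), dtype=int)  # next state for (s, a)
P[0] = [1, 0]   # a0: pick up key -> s1;   a1: wait      -> s0
P[1] = [2, 0]   # a0: open door   -> s2;   a1: drop key  -> s0
P[2] = [2, 2]   # goal is absorbing

def reward(eps):
    """True reward, plus a bonus eps for the instrumental act (the conflation)."""
    R = np.zeros((n_states, n_actions))
    R[1, 0] = 1.0   # terminal goal: opening the door
    R[0, 0] = eps   # instrumental bonus: picking up the key
    return R

def optimal_policy(R, iters=2000):
    """Value iteration, then the greedy policy for reward table R."""
    V = np.zeros(n_states)
    for _ in range(iters):
        V = (R + gamma * V[P]).max(axis=1)
    return (R + gamma * V[P]).argmax(axis=1)

def true_return(pi, iters=2000):
    """Discounted return of policy pi under the TRUE reward (eps = 0), from s0."""
    R, s = reward(0.0), np.arange(n_states)
    V = np.zeros(n_states)
    for _ in range(iters):
        V = R[s, pi] + gamma * V[P[s, pi]]
    return V[0]

pi_true = optimal_policy(reward(0.0))    # optimize the true reward r
pi_hat  = optimal_policy(reward(0.05))   # optimize the conflated reward r-hat

print(true_return(pi_true))  # ~0.99: pick up the key, open the door
print(true_return(pi_hat))   # 0.0: cycles pick/drop forever, never opens the door
```

With discount 0.99, cycling earns the 0.05 bonus every other step, worth 0.05 / (1 - 0.99^2) ≈ 2.51 from the start state, which exceeds the one-shot value of reaching the door; so even this slight instrumental bonus makes the true return collapse to zero, matching the severe-misalignment phenomenon the abstract describes.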

