The diversity across populations and the variability between individuals have long posed a significant challenge in cognitive science. Although large language models (LLMs) have made notable progress in aligning with human values, faithfully capturing the high degree of diversity and uncertainty in human judgment remains an unresolved challenge. This study investigates whether computational models, or "proxy agents," can not only emulate human decision patterns but also systematically modulate them. We propose a framework in which we first fine-tune BERT-based proxy agents to replicate both aggregate and individual-level human judgments on a large-scale moral dilemma dataset. We then hypothesize that stimuli identified as maximally divisive for these individualized agents will likewise elicit high disagreement among human participants. Through a human-in-the-loop experiment, we validate this hypothesis, demonstrating that agent-selected stimuli can predictably induce targeted divergence in human moral choices. Our findings provide empirical evidence that AI agents can bias human perceptual variability by strategically filtering information. We further analyze the induced moral divergence with a Bayesian framework and concept decomposition, identifying the distinct conceptual dimensions that drive individual differences. This work quantifies the potential for AI-driven cognitive modulation and underscores the urgent need for ethical guidelines to prevent the misuse of such capabilities.
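
As a rough illustration of the stimulus-selection step described above, the sketch below loads one fine-tuned BERT classifier per participant and ranks candidate dilemmas by how much the individualized agents disagree. The checkpoint paths, the binary-choice setup, and the variance-based disagreement score are assumptions made for illustration; they are not details taken from the paper.

```python
# Hypothetical sketch: rank moral-dilemma stimuli by disagreement across
# individualized BERT proxy agents. Checkpoint paths, the binary-label
# setup, and the variance-based score are illustrative assumptions.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")


def load_proxy_agents(checkpoint_paths):
    """Load one fine-tuned BERT classifier per participant (binary moral choice)."""
    return [
        AutoModelForSequenceClassification.from_pretrained(path, num_labels=2).eval()
        for path in checkpoint_paths
    ]


@torch.no_grad()
def divisiveness(dilemma_text, agents):
    """Score a stimulus by the variance of the agents' P(choice = 1).

    Variance peaks when the agents split roughly 50/50, so higher scores
    mark stimuli predicted to be maximally divisive.
    """
    inputs = tokenizer(dilemma_text, return_tensors="pt", truncation=True)
    probs = torch.stack(
        [agent(**inputs).logits.softmax(-1)[0, 1] for agent in agents]
    )
    return probs.var(unbiased=False).item()


# Rank candidate dilemmas and keep the most divisive ones for the
# human-in-the-loop experiment.
agents = load_proxy_agents([f"proxy_agent_{i}" for i in range(20)])  # assumed paths
candidates = ["Dilemma text A ...", "Dilemma text B ..."]            # assumed stimuli
ranked = sorted(candidates, key=lambda d: divisiveness(d, agents), reverse=True)
```

Under these assumptions, the top-ranked stimuli would be the ones shown to human participants to test whether agent-predicted divisiveness transfers to real human disagreement.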
