Bias in Large Language Models (LLMs) is increasingly addressed through fairness-oriented techniques. However, in some cases these approaches may inadvertently erase genuine cultural differences between groups, leading to "over-normalization," in which models lose important socio-cultural distinctions. In this work, we introduce OverNormEval, a benchmark designed to detect when an LLM exhibits such over-normalization. We further explore the use of Direct Preference Optimization (DPO) to mitigate it.
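The abstract does not describe the training setup, but for context, the standard DPO objective fine-tunes a policy model to prefer a "chosen" response over a "rejected" one relative to a frozen reference model. The sketch below shows that loss in PyTorch; pairing culturally faithful responses as "chosen" and over-normalized ones as "rejected" is an assumption about how such preference data might be constructed, not the paper's stated method.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps: torch.Tensor,
             policy_rejected_logps: torch.Tensor,
             ref_chosen_logps: torch.Tensor,
             ref_rejected_logps: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    """Standard DPO loss on per-example sequence log-probabilities.

    Each argument is a 1-D tensor holding the summed log-probability that a
    model assigns to a full response. "Chosen" could be a culturally faithful
    answer and "rejected" an over-normalized one (an illustrative assumption).
    """
    # Log-ratio of the trainable policy against the frozen reference model
    chosen_logratios = policy_chosen_logps - ref_chosen_logps
    rejected_logratios = policy_rejected_logps - ref_rejected_logps

    # Push the policy to favor the chosen response by a margin scaled by beta
    logits = beta * (chosen_logratios - rejected_logratios)
    return -F.logsigmoid(logits).mean()
```

In this formulation, beta controls how strongly the policy is pushed away from the reference model; smaller values keep the fine-tuned model closer to its original behavior, which matters if the goal is to correct over-normalization without introducing new distortions.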