Singapore

The opaque nature of deep learning models presents significant challenges for the ethical deployment of hate speech detection systems. To address this limitation, we introduce Supervised Rational Attention (SRA), a framework that explicitly aligns model attention with human rationales, improving both interpretability and fairness in hate speech classification. SRA integrates a supervised attention mechanism into transformer-based classifiers, optimizing a joint objective that combines standard classification loss with an alignment loss term that minimizes the discrepancy between attention weights and human-annotated rationales.
We evaluated SRA on hate speech benchmarks in English (HateXplain) and Portuguese (HateBRXplain) with rationale annotations. Empirically, SRA achieves 2.4× better explainability compared to current baselines, and produces token-level explanations that are more faithful and human-aligned. In terms of fairness, SRA achieves competitive fairness across all measures, with second-best performance in detecting toxic posts targeting identity groups, while maintaining comparable results on other metrics. These findings demonstrate that incorporating human rationales into attention mechanisms can enhance interpretability and faithfulness without compromising fairness.
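
The joint objective described above (classification loss plus an attention-alignment term) can be illustrated with a minimal sketch. The abstract does not specify the form of the alignment loss, so the cross-entropy between attention weights and a normalized rationale mask used here, the weight `lam`, and all function names are assumptions for illustration, not the authors' implementation.

```python
import math

def softmax(scores):
    """Turn raw attention scores into a probability distribution over tokens."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def classification_loss(class_probs, label):
    """Standard cross-entropy for a single example."""
    return -math.log(class_probs[label])

def alignment_loss(attention, rationale_mask, eps=1e-12):
    """Assumed form: cross-entropy between the normalized human rationale
    mask (1 = token marked by annotators) and the attention weights."""
    total = sum(rationale_mask)
    if total == 0:
        return 0.0  # assumption: no rationale annotated, so no penalty
    target = [r / total for r in rationale_mask]
    return -sum(t * math.log(a + eps) for t, a in zip(target, attention))

def joint_loss(class_probs, label, attention, rationale_mask, lam=1.0):
    """Classification loss plus lam times the alignment term (lam is hypothetical)."""
    return (classification_loss(class_probs, label)
            + lam * alignment_loss(attention, rationale_mask))

# Toy example: 4 tokens, annotators marked tokens 1 and 2 as the rationale.
rationale = [0, 1, 1, 0]
class_probs = [0.2, 0.8]
attention_aligned = softmax([0.1, 3.0, 3.0, 0.1])  # mass on rationale tokens
attention_diffuse = softmax([1.0, 1.0, 1.0, 1.0])  # uniform attention

loss_aligned = joint_loss(class_probs, 1, attention_aligned, rationale)
loss_diffuse = joint_loss(class_probs, 1, attention_diffuse, rationale)
print(loss_aligned < loss_diffuse)
```

With identical class predictions, the attention pattern that concentrates on the annotated tokens receives the lower joint loss, which is the behavior an alignment term of this kind is meant to encourage.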

AAAI 2026

Aligning Attention with Human Rationales for Self-Explaining Hate Speech Detection

explainable ai

natural language understanding

hate speech detection

poster

We are pleased to announce the Fortieth AAAI Conference on Artificial Intelligence (AAAI-26), which will be held at Singapore EXPO from January 20 to January 27, 2026.

The purpose of the AAAI conference series is to promote research in Artificial Intelligence (AI) and foster scientific exchange between researchers, practitioners, scientists, students, and engineers across the entirety of AI and its affiliated disciplines. AAAI-26 will feature technical paper presentations, special tracks, invited speakers, workshops, tutorials, poster sessions, senior member presentations, competitions, exhibit programs, and a range of other activities to be announced.


Large Language Models (LLMs) demonstrate strong reasoning capabilities but still struggle with hallucinations and limited transparency. Recently, KG-enhanced LLMs that integrate knowledge graphs (KGs) have been shown to improve reasoning performance, particularly for complex, knowledge-intensive tasks. However, these methods still face significant challenges, including inaccurate retrieval and reasoning failures, often exacerbated by long input contexts that obscure relevant information. Furthermore, many of these approaches rely on LLMs to directly retrieve evidence from KGs, and to self-assess the sufficiency of this evidence, which often results in premature or incorrect reasoning. To address the retrieval and reasoning failures, we propose ProgRAG, a multi-hop knowledge graph question answering (KGQA) framework that decomposes complex questions into sub-questions, and progressively extends partial reasoning paths by answering each sub-question. At each step, external retrievers gather candidate evidence, which is then refined through uncertainty-aware pruning by the LLM. Finally, the context for LLM reasoning is optimized by organizing and rearranging the partial reasoning paths obtained from the sub-question answers. Experiments on two well-known datasets, WebQSP and CWQ, demonstrate that ProgRAG outperforms existing baselines in multi-hop KGQA, offering improved reliability and reasoning quality.

ProgRAG: Hallucination-Resistant Progressive Retrieval and Reasoning over Knowledge Graphs

Recently, the strong generalization ability of CLIP has facilitated open-vocabulary semantic segmentation, which labels pixels using arbitrary text. However, existing methods that fine-tune CLIP for segmentation on limited seen categories often lead to overfitting and degrade the pretrained vision-language alignment. To stabilize modality alignment during fine-tuning, we propose InfoCLIP, which leverages an information-theoretic perspective to transfer alignment knowledge from pretrained CLIP to the segmentation task. Specifically, this transfer is guided by two novel objectives grounded in mutual information. First, we compress the pixel-text modality alignment from pretrained CLIP to reduce noise arising from its coarse-grained local semantic representations learned under image-text supervision. Second, we maximize the mutual information between the alignment knowledge of pretrained CLIP and the fine-tuned model to transfer compact local semantic relations suited for the segmentation task. Extensive evaluations across various benchmarks validate the effectiveness of InfoCLIP in enhancing CLIP fine-tuning for open-vocabulary semantic segmentation, demonstrating its adaptability and superiority in asymmetric transfer.

InfoCLIP: Bridging Vision-Language Pretraining and Open-Vocabulary Semantic Segmentation via Information-Theoretic Alignment Transfer

Cross-lingual, cross-task transfer is challenged by task-specific data scarcity, which becomes more severe as language support grows.
Both challenges are amplified within vision-language models (VLMs).
We investigate multilingual generalization in encoder-decoder transformer VLMs to enable zero-shot image captioning in a language that was only paired with machine translations during training.
In this setting, the encoder must learn to generate generalizable, latent task-aware vision representations to instruct the decoder via inserted cross-attention layers.
We study scaling laws by training models based on Florence-2 and Gemma-2 that range from 0.4B to 11.2B parameters.
The training is performed on a synthetic dataset using varying compute budgets.
While all languages in the dataset have image-aligned translations, only a subset of them include image captions.
Notably, we show that captioning can emerge in a language after training on only translation data. 
We find that this indirect learning of unseen task-language pairs adheres to scaling laws that are governed by the multilinguality of the model, its size, and the number of training samples seen.
Finally, we demonstrate that our observed scaling laws extend to a variety of downstream tasks, achieving competitive performance through finetuning in multimodal machine translation (Multi30K, CoMMuTE), lexical disambiguation (CoMMuTE), and image captioning (Multi30K, XM3600, COCO Karpathy).

Scaling Laws for Conditional Emergence of Multilingual Image Captioning via Generalization from Translation

Composed Image Retrieval (CIR) is a flexible image retrieval paradigm that enables users to accurately locate the target image through a multimodal query composed of a reference image and modification text. Although this task has demonstrated promising applications in personalized search and recommendation systems, it encounters a severe challenge in practical scenarios known as the Noise Triplet Correspondence (NTC) problem. This issue primarily arises from the high cost and subjectivity involved in annotating triplet data. To address this problem, we identify two central challenges: the difficulty of precisely estimating composed semantic discrepancy and insufficient progressive adaptation to modification discrepancy. To tackle these challenges, we propose a cHrono-synergiA roBust progressIve learning framework for composed image reTrieval (HABIT), which consists of two core modules. First, the Mutual Knowledge Estimation Module quantifies sample cleanliness by calculating the Transition Rate of mutual information between the composed feature and the target image, thereby effectively identifying clean samples that align with the intended modification semantics. Second, the Dual-consistency Progressive Learning Module introduces a collaborative mechanism between the historical and current models, simulating human habit formation to retain good habits and calibrate bad habits, ultimately enabling robust learning in the presence of NTC. Extensive experiments conducted on two standard CIR datasets demonstrate that HABIT significantly outperforms state-of-the-art methods under various noise ratios, exhibiting superior robustness and retrieval performance.

HABIT: Chrono-Synergia Robust Progressive Learning Framework for Composed Image Retrieval

Traffic flow prediction is a typical spatial-temporal prediction problem and has a wide range of applications. The core challenge lies in modeling the underlying complex spatial-temporal dependencies. Various methods have been proposed, and recent studies show that the modeling of dynamics is useful to meet the core challenge. While handling spatial dependencies and temporal dependencies using separate base model structures may hinder the modeling of spatial-temporal correlations, modeling of dynamics can bridge this gap. Incorporating spatial-temporal heterogeneity also advances the main goal, since it can extend the parameter space and incorporate more flexibility. Despite these advances, two limitations persist: 1) the modeling of dynamics is often limited to the dynamics of spatial topology (e.g., adjacency matrix changes), although it can be extended to a broader scope; 2) the modeling of heterogeneity is often separated for spatial and temporal dimensions, but this gap can also be bridged by the modeling of dynamics. To address the above limitations, we propose a novel framework for traffic prediction, called Meta Dynamic Graph (MetaDG). MetaDG leverages dynamic graph structures of node representations to explicitly model spatial-temporal dynamics. This generates both a dynamic adjacency matrix and meta-parameters, extending dynamic modeling beyond topology while unifying the capture of spatial-temporal heterogeneity into a single dimension. Extensive experiments on four real-world datasets validate the effectiveness of MetaDG.

Meta Dynamic Graph for Traffic Flow Prediction

Recent self-supervised pre-training methods for object detection often rely on generic object proposals for localization and semantic feature learning for classification, but they yield limited improvements when applied to Detection Transformers (DETR) due to a lack of architectural alignment. Hence, we propose an elegant and versatile self-supervised framework tailored for DETR-like models called **Dis**tance-aware Multi-view **Co**ntrastive Learning (**DisCo DETR**). **DisCo DETR** enhances localization and semantic features through two core components. (i) **Distance-aware Multi-view Object Query Fusion** explicitly guides object queries to focus on spatially close objects across views, stabilizing training and improving localization accuracy. (ii) **Contrastive Learning for DETR** uses native bipartite matching to identify positive output pairs across views and pull them closer, enhancing semantic feature discrimination with no extra matching. DisCo DETR can be seamlessly integrated into DETR-like models and achieves SOTA transfer performance on PASCAL VOC and COCO benchmarks across multiple variants.

DisCo DETR: Distance-aware Multi-view Contrastive Learning for DETR Pre-training

Large Language Models (LLMs) have demonstrated remarkable capabilities in various tasks through the **example-driven learning paradigm**. However, in high-stakes domains such as emergency response or industrial safety, real incidents are scarce, confidential, or both, while concise *rule books* are plentiful.
We formalize this underexplored setting as **rule knowledge–driven reasoning** and ask: *Can an LLM reason reliably when rules are abundant but examples are almost nil?*
To answer this question, we introduce **RULER**, a fully automatic benchmark that derives 32K rigorously verified questions from 1K items of expert-curated emergency response rule knowledge to probe three core abilities: *rule memorization*, *single-rule application*, and *multi-rule complex reasoning*. The benchmark is supported by a hallucination-aware evaluation suite and novel relational metrics.
A comprehensive empirical study of five open-source LLMs and five enhancement strategies shows that, while models perform reliably on rule memorization and single-rule application, multi-rule complex reasoning plateaus at 5.4 on a 10-point scale.
We bridge this gap with **RAMPS**, a **R**ule-knowledge-**A**ware **M**onte-Carlo-tree-search **P**rocess-reward **S**upervision framework.
RAMPS injects rule knowledge priors into MCTS, distills 12K step-level traces without human annotation, and trains an advantage-based reward model that scores candidate reasoning paths during beam search inference.
Experimental results demonstrate a notable improvement in complex reasoning, increasing to 7.7 (+2.3).
Together, RULER and RAMPS provide an automatic benchmark and a strong baseline suite for rule knowledge-driven reasoning in LLMs.

Benchmarking and Enhancing Rule Knowledge-Driven Reasoning of Large Language Models

Though promising in healthcare consultation applications, large language models (LLMs) face critical limitations in retaining and utilizing long-term memory across multi-turn interactions. In particular, existing memory-enhancing paradigms are constrained by limited context windows and embedding-based retrieval, often failing to maintain task relevance and still suffering from memory prototype collapse in multi-turn healthcare consultation. To address these challenges, we propose a cognitively inspired memory framework named MemoryART, which is grounded in Adaptive Resonance Theory (ART), a cognitive and learning theory of how humans and animals adapt to dynamic environments. MemoryART employs three memory modules (working memory, episodic memory, and semantic memory) to support task-aware memory organization and dynamic retrieval. Specifically, episodic memory stores specific experiences along with contextual clues, which is crucial for managing patient-specific information and well suited to multi-turn healthcare consultation interactions. Building upon this concept, MemoryART leverages multi-channel competitive learning and resonance matching to enable efficient and interpretable episodic memory encoding, alleviating issues of prototype collapse and noisy memory associations. For evaluation, we construct a long-term medical dialogue benchmark called MediLongChat using an LLM-based generation pipeline. The resulting dataset features realistic, multi-disease chat histories, each exceeding 100K tokens across 20–30 dialogues, simulating real-world healthcare interaction patterns. Our experimental results show that MemoryART outperforms mainstream approaches in memory-intensive tasks, achieving SOTA results and significantly reducing token consumption across five popular LLMs, confirming its effectiveness and efficiency in providing scalable, reliable memory for LLMs in healthcare.
Code and datasets are available at https://github.com/dairkkriad/MemoryART

MemoryART: Enhancing LLMs via Multi-Memory Models with Adaptive Resonance Theory for Healthcare Agents

Post-training quantization (PTQ) offers an efficient approach to compressing large language models (LLMs), significantly reducing memory access and computational costs. Existing compensation-based weight calibration methods often rely on a second-order Taylor expansion to model quantization error, under the assumption that the first-order term is negligible in well-trained full-precision models. However, we reveal that the progressive compensation process introduces accumulated first-order deviations between latent weights and their full-precision counterparts, making this assumption fundamentally flawed. To address this, we propose FOEM, a novel PTQ method that explicitly incorporates first-order gradient terms to improve quantization error compensation. 
FOEM approximates gradients by performing a first-order Taylor expansion around the pre-quantization weights. This yields an approximation based on the difference between latent and full-precision weights as well as the Hessian matrix. When substituted into the theoretical solution, the formulation eliminates the need to explicitly compute the Hessian, thereby avoiding the high computational cost and limited generalization of backpropagation-based gradient methods. This design introduces only minimal additional computational overhead.
Extensive experiments across a wide range of models and benchmarks demonstrate that FOEM consistently outperforms the classical GPTQ method. In 3-bit weight-only quantization, FOEM reduces the perplexity of Llama3-8B by 17.3% and increases the 5-shot MMLU accuracy from 53.8% achieved by GPTAQ to 56.1%. Moreover, FOEM can be seamlessly combined with advanced techniques such as SpinQuant, delivering additional gains under the challenging W4A4KV4 setting and further narrowing the performance gap with full-precision baselines, surpassing existing state-of-the-art methods.
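
The first-order argument in this abstract can be written out as a short derivation. The notation is an assumption introduced here for illustration ($w$ for the full-precision weights, $\hat{w}$ for the latent weights after progressive compensation, $\delta$ for a further weight perturbation, $g$ for the gradient, $H$ for the Hessian); the abstract itself gives no formulas, and this is only a sketch of the reasoning it describes, not the paper's exact formulation.

```latex
% Second-order Taylor expansion of the loss around the latent weights \hat{w},
% keeping the first-order term instead of dropping it:
\Delta \mathcal{L}
  \;\approx\; g(\hat{w})^{\top} \delta
  \;+\; \tfrac{1}{2}\, \delta^{\top} H \, \delta .

% First-order Taylor expansion of the gradient around the full-precision
% weights w, using g(w) \approx 0 for a well-trained model:
g(\hat{w})
  \;\approx\; g(w) + H\,(\hat{w} - w)
  \;\approx\; H\,(\hat{w} - w) .

% Substituting, the first-order term depends only on the weight deviation
% and the Hessian, so no backpropagated gradient is required:
\Delta \mathcal{L}
  \;\approx\; (\hat{w} - w)^{\top} H \, \delta
  \;+\; \tfrac{1}{2}\, \delta^{\top} H \, \delta .
```

When $\hat{w} = w$ the first-order term vanishes and the expression reduces to the purely second-order model that the abstract attributes to existing compensation-based methods.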

First-Order Error Matters: Accurate Compensation for Quantized Large Language Models

Generalized Category Discovery (GCD) aims to classify labeled instances from known categories while discovering novel categories from unlabeled data. Despite recent progress in GCD for computer vision, existing GCD approaches largely rely on static final-step representations (in the visual domain), overlooking the temporally evolving nature of time-series data. In this paper, we introduce TGCD, the first framework specifically designed for GCD in time-series data. TGCD leverages both the dynamics of latent representations and the heterogeneity of predictions across multiple temporal segments to discover unknown (i.e., novel) categories, based on a pre-trained time-series foundation model. We propose a unified learning objective for TGCD that integrates the following three components: (i) a Stochastic Temporal Segment Dropout (STeSD) objective that regularizes the model by selectively penalizing high-entropy segments to encourage confident predictions on uncertain regions of the time series, (ii) a Known–Unknown Temporal Discriminability (KUTD) objective that promotes representational separation between known and unknown categories within unlabeled data, and (iii) a margin-aware classification objective to improve generalization. Empirical evaluation on six multivariate time-series data sets demonstrates that TGCD substantially outperforms existing GCD methods, particularly in discovering unknown categories. We further conduct ablation studies to highlight the individual contributions of each component. Additionally, we provide the first comprehensive benchmarking of recent GCD approaches on time-series data, revealing the limitations of naive transfer and underscoring the benefits of temporal modeling.
