Singapore

Legal AI systems powered by retrieval-augmented generation (RAG) face a critical accountability challenge: when an AI assistant cites case law, statutes, or contractual clauses, practitioners need verifiable guarantees that generated text faithfully represents source documents. Existing hallucination detectors rely on semantic similarity metrics that tolerate entity substitutions, a dangerous failure mode when confusing parties, dates, or legal provisions can have material consequences. We introduce HalluGraph, a graph-theoretic framework that quantifies hallucinations through structural alignment between knowledge graphs extracted from context, query, and response. Our approach produces bounded, interpretable metrics decomposed into \textit{Entity Grounding} (EG), measuring whether entities in the response appear in source documents, and \textit{Relation Preservation} (RP), verifying that asserted relationships are supported by context. On structured control documents ($>$400 words, $>$20 entities), HalluGraph achieves near-perfect discrimination ($AUC = 0.979$), while maintaining robust performance ($AUC \approx 0.89$) on challenging generative legal tasks, consistently outperforming semantic similarity baselines. The framework provides the transparency and traceability required for high-stakes legal applications, enabling full audit trails from generated assertions back to source passages. To facilitate reproducibility, our code, dataset, and an interactive demo are publicly available at: \url{https://vcnoel.github.io/hallugraph-demo/}.

AAAI 2026

HalluGraph: Auditable Hallucination Detection for Legal RAG Systems via Knowledge Graph Alignment
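The abstract names two bounded metrics, Entity Grounding (EG) and Relation Preservation (RP), but does not give their formulas. The following is a minimal sketch under stated assumptions, not the authors' implementation: knowledge graphs are represented as sets of `(subject, relation, object)` triples, matching is exact after whitespace and case normalization, and each score is a simple set-overlap ratio in [0, 1]. The function names and the convention of returning 1.0 for an empty response graph are hypothetical choices made for illustration.

```python
from typing import Iterable, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, relation, object); assumed representation


def _norm(text: str) -> str:
    """Lowercase and collapse whitespace so trivially different spellings match."""
    return " ".join(text.lower().split())


def _norm_triples(triples: Iterable[Triple]) -> Set[Triple]:
    return {(_norm(s), _norm(r), _norm(o)) for s, r, o in triples}


def _entities(triples: Iterable[Triple]) -> Set[str]:
    """Collect every subject and object that appears in the graph."""
    ents: Set[str] = set()
    for s, _, o in _norm_triples(triples):
        ents.update((s, o))
    return ents


def entity_grounding(context: Iterable[Triple], response: Iterable[Triple]) -> float:
    """Fraction of response entities that also appear in the context graph."""
    resp = _entities(response)
    if not resp:
        return 1.0  # assumption: nothing asserted means nothing ungrounded
    return len(resp & _entities(context)) / len(resp)


def relation_preservation(context: Iterable[Triple], response: Iterable[Triple]) -> float:
    """Fraction of response triples that appear verbatim in the context graph."""
    resp = _norm_triples(response)
    if not resp:
        return 1.0  # same convention as above
    return len(resp & _norm_triples(context)) / len(resp)


# Toy example with invented parties: one faithful answer, one entity substitution.
context = [("Acme Corp", "sued", "Beta LLC"), ("Beta LLC", "breached", "Clause 4.2")]
faithful = [("Acme Corp", "sued", "Beta LLC")]
substituted = [("Acme Corp", "sued", "Gamma Inc")]
```

In this toy setup the substituted response keeps one of its two entities grounded but loses its only relation, which is the kind of entity swap the abstract says semantic similarity metrics tend to tolerate.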

technical paper

We are pleased to announce the Fortieth AAAI Conference on Artificial Intelligence (AAAI-26), which will be held at Singapore EXPO from January 20 to January 27, 2026.

The purpose of the AAAI conference series is to promote research in Artificial Intelligence (AI) and foster scientific exchange between researchers, practitioners, scientists, students, and engineers across the entirety of AI and its affiliated disciplines. AAAI-26 will feature technical paper presentations, special tracks, invited speakers, workshops, tutorials, poster sessions, senior member presentations, competitions, exhibit programs, and a range of other activities to be announced.

In the Multi-Agent Path Finding (MAPF) problem, the aim is to find collision-free paths for multiple agents. MAPF has many practical applications and has attracted substantial research interest over the past two decades. Most MAPF research has assumed that every agent is assigned a target it must reach. This assumption often does not hold in key applications such as automated warehouses and parking lots, where some agents are assigned targets to reach, and others, denoted as unassigned agents, can either stay idle or move to clear the way for the assigned agents. In this paper we introduce this important problem, explain its uniqueness, and encourage the entire community to work on it.

Multi-Agent Path Finding with Unassigned Agents (MAPFUA)
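To make "collision-free" concrete, here is a minimal sketch of a conflict checker using the two conflict types commonly used in MAPF work: vertex conflicts (two agents in the same location at the same timestep) and edge or swap conflicts (two agents exchanging locations in one step). The abstract does not define a conflict model, so this choice, the path representation, and the names `position` and `first_conflict` are assumptions made for illustration. The checker does not validate that moves follow graph edges.

```python
from itertools import combinations
from typing import Dict, Hashable, List, Optional, Tuple

Path = List[Hashable]  # path[t] is the vertex occupied at timestep t
Conflict = Tuple[str, str, str, int]  # (kind, agent_a, agent_b, timestep)


def position(path: Path, t: int) -> Hashable:
    """Agents are assumed to wait at their last vertex once their path ends."""
    return path[min(t, len(path) - 1)]


def first_conflict(paths: Dict[str, Path]) -> Optional[Conflict]:
    """Return the earliest vertex or swap conflict, or None if there is none."""
    horizon = max((len(p) for p in paths.values()), default=0)
    for t in range(horizon):
        for a, b in combinations(paths, 2):
            pa, pb = paths[a], paths[b]
            if position(pa, t) == position(pb, t):
                return ("vertex", a, b, t)
            swapped = (
                position(pa, t) == position(pb, t + 1)
                and position(pa, t + 1) == position(pb, t)
            )
            if t + 1 < horizon and swapped:
                return ("edge", a, b, t)
    return None


# Corridor 0 - 1 - 2 with a side cell "s" next to vertex 1 (invented example).
# The assigned agent must travel from 0 to 2; the unassigned agent starts at 1.
idle = {"assigned": [0, 1, 2], "unassigned": [1, 1, 1]}
yielding = {"assigned": [0, 1, 2], "unassigned": [1, "s", "s"]}
```

With the unassigned agent idle, the assigned agent's path is blocked at vertex 1; when the unassigned agent steps aside into `"s"`, the same path becomes conflict-free, which is the situation the abstract describes.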

This report explores the evolution and current state of neuro-symbolic artificial intelligence, an approach that integrates neural network capabilities with symbolic reasoning. We trace the historical context from early AI aspirations to modern implementations and successes, highlighting key paradigms, and other logical and semantical considerations. We argue against the “scaling is all you need” hypothesis, and point to persistent challenges in reliable symbolic reasoning with deep and large models. We conclude by suggesting that despite numerous implementation choices and the “broad church” nature of neuro-symbolic AI, these approaches offer the most promising path towards AI systems that combine pattern recognition with robust reasoning, particularly for applications requiring structured knowledge, explainability, and trustworthiness.

The Future Is Neuro-Symbolic: Where Has It Been, and Where Is It Going?

Effectively handling long contexts is challenging for Large Language Models (LLMs) due to the rarity of long texts, high computational demands, and substantial forgetting of short-context abilities. Recent approaches have attempted to construct long contexts for instruction tuning, but these methods often require LLMs or human interventions, which are both costly and limited in length and diversity. Also, the drop in short-context performances of present long-context LLMs remains significant. In this paper, we introduce Flora, an effortless (human/LLM-free) long-context construction strategy. Flora can markedly enhance the long-context performance of LLMs by arbitrarily assembling short instructions based on categories and instructing LLMs to generate responses based on long-context meta-instructions. This enables Flora to produce contexts of arbitrary length and scale with rich diversity, while only slightly compromising short-context performance. Experiments on Llama3-8B-Instruct and QwQ-32B show that LLMs enhanced by Flora excel in three long-context benchmarks while maintaining strong performances in short-context tasks.

Flora: Effortless Context Construction to Arbitrary Length and Scale

Vision Language Models (VLMs) have demonstrated strong performance in multimodal understanding, offering promise for the circuit-to-netlist translation task. However, the diverse component symbols and complex connections in circuit images challenge VLMs in understanding physical layouts and reasoning about electrical connection logic. To address these challenges, we propose Circuit-Think, the first multimodal reasoning framework for the automated circuit-to-netlist translation task, which employs the Trajectory-Guided Reinforcement Learning (TGRL) paradigm for structured logical reasoning on circuit images. Circuit-Think initializes reasoning capabilities through supervised fine-tuning (SFT) on image-netlist pairs, then optimizes reasoning trajectories and netlist generation decisions using TGRL. Firstly, TGRL introduces a step-by-step reasoning paradigm, which guides the model with stepwise reward functions to simulate the human cognitive trajectory of ``identifying ports, recognizing devices, and inferring connections''. Secondly, we customize a multi-level reward that maps reasoning and answers into graph structures and node sets, jointly optimizing logical consistency and netlist accuracy via graph similarity and set matching. Thirdly, TGRL contains a reflective learning mechanism for low-scoring samples, which corrects the reasoning trajectory through reference answers as hints, avoiding local optima caused by sparse reward signals or erroneous reasoning paths. Moreover, we construct a circuit image-netlist reasoning dataset with 3,100 samples, offering step-by-step annotations for converting circuit images to netlists. Extensive experiments demonstrate that Circuit-Think achieves SOTA netlist accuracy and significantly improves the accuracy of downstream tasks. Our circuit image-netlist reasoning dataset is open-source.

Circuit-Think: A Multimodal Reasoning Framework for Automated Circuit-to-Netlist Translation with Trajectory-Guided Reinforcement Learning

The widespread adoption of graph neural networks (GNNs) has brought increased attention to fairness issues related to sensitive attributes, such as gender and race, in practical scenarios. However, this concern remains largely unexplored in the context of graph clustering. Conventional fair graph clustering methods primarily depend on spectral clustering approaches. Meanwhile, we argue that existing graph learning works mainly focus on a single type of fairness, whereas graph clustering should achieve group equality-informed individual fairness. In this paper, we introduce for the first time a fairness-aware framework termed FairGC for deep graph clustering, which integrates the dual objectives of individual and group fairness while maintaining accurate clustering results. Specifically, we construct two views with distinct semantics using Siamese encoders. Then, we apply multi-step random walks on view-specific affinity graphs to capture high-order affinities of node pairs, thereby reformulating the contrastive learning with a focus on individual similarity. Besides, we utilize adversarial learning by making node representations independent of the estimated sensitive attributes to further eliminate group biases of clustering results. Extensive experiments on four benchmarks demonstrate the effectiveness and superiority of our proposed framework FairGC.

FairGC: Fostering Individual and Group Fairness for Deep Graph Clustering

The emergence of Multimodal Large Language Models (MLLMs) has propelled the development of autonomous agents that operate on Graphical User Interfaces (GUIs) using pure visual input. A fundamental challenge is robustly grounding natural language instructions. This requires a precise \textit{spatial alignment}, which accurately locates the coordinates of each element, and, more critically, a correct \textit{semantic alignment}, which matches the instructions to the functionally appropriate UI element. Although Reinforcement Learning with Verifiable Rewards (RLVR) has proven to be effective at improving \textit{spatial alignment} for these MLLMs, we find that inefficient exploration bottlenecks \textit{semantic alignment}, which prevents models from learning difficult semantic associations. To address this exploration problem, we present Adaptive Exploration Policy Optimization (AEPO), a new policy optimization framework. AEPO employs a multi-answer generation strategy to enforce broader exploration, which is then guided by a theoretically grounded Adaptive Exploration Reward (AER) function derived from first principles of efficiency $\eta=U/C$. Our AEPO-trained models, InfiGUI-G1-3B and InfiGUI-G1-7B, establish new state-of-the-art results across multiple challenging GUI grounding benchmarks, achieving significant relative improvements of up to 8.3\% against the naive RLVR baseline on benchmarks designed to test generalization and semantic understanding.

InfiGUI-G1: Advancing GUI Grounding with Adaptive Exploration Policy Optimization

Knowledge distillation (KD) is a technique that transfers the knowledge from a teacher model to a student model, where the teacher is usually larger and more powerful. In this tutorial, we will briefly introduce the basic concepts, including intermediate-layer matching and prediction-matching KD. We then dive into the challenges and opportunities of KD with sequential data, which lead to advanced techniques such as reinforcement learning KD and multi-teacher KD. We will also cover practical KD applications such as LLM sequence compression and LLM self-distillation. The goal of this tutorial is to provide participants with a comprehensive understanding of the techniques and applications of KD for language models.

Homepage: https://manga-uofa.github.io/aaai-llm-kd/ 

Knowledge Distillation for Language Models: Challenges and Opportunities with Sequential Data
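As a rough illustration of prediction-matching KD, the sketch below computes a temperature-scaled KL divergence between teacher and student output distributions for a single prediction step. This is the classic soft-target formulation and is an assumption here: the tutorial abstract does not specify which loss it covers, and the function names, the default temperature, and the clamping constant are illustrative choices.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Temperature-scaled softmax, shifted by the max for numerical stability."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def prediction_matching_kd_loss(
    student_logits: Sequence[float],
    teacher_logits: Sequence[float],
    temperature: float = 2.0,
) -> float:
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    p = softmax(teacher_logits, temperature)  # teacher soft targets
    q = softmax(student_logits, temperature)  # student predictions
    kl = sum(
        pi * (math.log(pi) - math.log(max(qi, 1e-12)))  # clamp avoids log(0)
        for pi, qi in zip(p, q)
        if pi > 0.0
    )
    # The T^2 factor keeps gradient magnitudes comparable across temperatures.
    return kl * temperature ** 2


# Invented logits over a three-token vocabulary.
teacher = [4.0, 1.0, 0.5]
matching_student = [4.0, 1.0, 0.5]
diverging_student = [0.5, 1.0, 4.0]
```

For sequence models, a loss of this kind would typically be averaged over every token position; the loss is zero when the student reproduces the teacher's distribution and grows as the two diverge.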

Graph anomaly detection (GAD), which aims to identify rare observations in graphs, has attracted rapidly increasing attention in recent years due to its significance in a wide range of high-impact application domains such as abusive review detection and malicious behavior detection in online shopping applications, web attack detection, and suspicious activity detection in online/offline financial services. A foundation model on GAD refers to a generalist model trained on specific graph data, enabling it to generalize effectively across different domains and tasks. In recent years, such models have attracted increasing attention due to their ability to provide strong zero-shot and few-shot performance without task-specific retraining. By learning domain-invariant and transferable representations across tasks, a GAD foundation model can be readily adapted to new anomaly detection scenarios, making it applicable to a wide range of use cases such as privacy-preserving anomaly detection, transferable cybersecurity and threat detection, and cross-platform anomaly detection in social networks.

In this tutorial, we aim to present a comprehensive review of deep learning methods specifically designed for GAD and foundation models for detecting abnormal activities on graphs. Specifically, we will first elaborate on the key concepts and taxonomies in GAD. We will then review popular state-of-the-art deep anomaly detection methods from various perspectives of methodology design on graph data, including GNN backbone design, proxy task design, and anomaly measures. Next, we will establish the connection between conventional methods and foundation models on GAD, highlighting how recent advancements build upon or differ from conventional approaches. Following this, we will provide a comprehensive overview of existing foundation models that have been proposed for detecting abnormal activities on graphs, from both cross-domain and cross-task perspectives. We will discuss their underlying principles, design choices, and effectiveness across various settings. Finally, we will present future directions to help researchers gain a deep understanding of this area and to promote more high-quality research and real-world applications. The website of this tutorial is https://sites.google.com/view/aaai26-tutorial-gad/home?read_current=1

Toward Foundation Models for Detecting Abnormal Activities on Graphs

How to find a natural grouping of a large real data set? Clustering requires a balance between abstraction and representation. To identify clusters, we need to abstract from superfluous details of individual objects, such as background or lighting in images. But we also need a rich representation that emphasizes the key features shared by groups of objects that distinguish them from other groups of objects. Each clustering algorithm implements a different trade-off between abstraction and representation. Classical K-means implements a high level of abstraction – details are simply averaged out – combined with a very simple representation – all clusters are Gaussians in the original data space. We will see how approaches to subspace and deep clustering support high-dimensional and complex data by allowing richer representations. However, with increasing representational expressiveness comes the need to explicitly enforce abstraction in the objective function to ensure that the resulting method performs clustering and not just representation learning. We will see how current deep clustering methods define and enforce abstraction through centroid-based and density-based clustering losses. Balancing the conflicting goals of abstraction and representation is challenging. Ideas from subspace clustering help by learning one latent space for the information that is relevant to clustering and another latent space to capture all other information in the data. The tutorial ends with an outlook on future research in clustering. Future methods will more adaptively balance abstraction and representation to improve performance, energy efficiency and interpretability.
This tutorial is for machine learning researchers and professionals interested in learning more about clustering high-dimensional data. Practitioners will receive an overview of different approaches to clustering high-dimensional data, along with insights into their benefits and limitations. This knowledge will enable them to select an appropriate method for their problem. Researchers will find starting points for contributing to this active and fascinating research topic. We will illustrate foundational and current approaches with Python code examples. We will summarize the evaluation methodology and provide pointers to benchmark data. We will also highlight open problems that require further research. To illustrate, we will use real use cases from collaborative projects in biology, neuroscience, and archeology, in addition to benchmark data. Basic knowledge of machine learning, data mining, linear algebra and Python programming is beneficial but not required.

Website: https://dm.cs.univie.ac.at/research/aaai26/

Clustering High-dimensional Data: Balancing Abstraction and Representation
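The description of classical K-means as "details are simply averaged out" can be illustrated with a minimal sketch of Lloyd's algorithm, the standard iterative procedure for K-means. This is not taken from the tutorial materials: the function names, the explicit `init_centroids` argument (used instead of random initialization to keep the example deterministic), and the toy data are assumptions made for illustration.

```python
from math import dist
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def mean(points: Sequence[Point]) -> Point:
    """Coordinate-wise average: the 'abstraction' step that discards detail."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans(
    points: Sequence[Point],
    init_centroids: Sequence[Point],
    max_iter: int = 100,
) -> Tuple[List[int], List[Point]]:
    """Lloyd's algorithm: alternate nearest-centroid assignment and averaging."""
    centroids = [tuple(c) for c in init_centroids]
    labels: List[int] = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid (Euclidean).
        labels = [
            min(range(len(centroids)), key=lambda j: dist(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == j]
            new_centroids.append(mean(members) if members else old)
        if new_centroids == centroids:  # converged: nothing moved
            break
        centroids = new_centroids
    return labels, centroids


# Two well-separated toy groups in the original two-dimensional data space.
data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
labels, centroids = kmeans(data, init_centroids=[data[0], data[3]])
```

Each cluster is summarized by a single mean vector, so anything that distinguishes individual members is lost. That is the high-abstraction, simple-representation trade-off the abstract attributes to K-means.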

Computational Pathology Foundation Models (CPathFMs) have emerged as a transformative approach for automating histopathological analysis by leveraging self-supervised learning on large-scale, unlabeled whole-slide images (WSIs). These models, categorized into uni-modal and multi-modal frameworks, facilitate tasks such as segmentation, classification, biomarker discovery, and prognosis prediction. However, the development of CPathFMs faces significant challenges, including limited dataset availability, domain-specific adaptation requirements, and the absence of standardized evaluation benchmarks. This tutorial will provide a comprehensive overview of the current state of CPathFMs, covering key datasets, adaptation strategies such as contrastive learning and multi-modal integration, and a taxonomy of evaluation tasks. We will discuss how these models are trained, fine-tuned, and assessed, addressing the critical gaps in generalization, bias mitigation, and clinical applicability. Additionally, we will explore emerging research directions in fairness, transparency, security, and standardization of evaluation protocols. This tutorial will serve as an essential resource for researchers, clinicians, and AI practitioners looking to advance the field of AI-driven computational pathology.

Website: https://sites.google.com/view/aaai26tutorial-cpath/home
