Singapore

AI systems often fail on challenging or out-of-distribution inputs—a critical limitation in domains such as healthcare, finance, and autonomous driving. Learning to Defer (L2D) addresses this by training models not only to predict but also to decide when to defer to external experts. This thesis develops a unified and robust framework for L2D that advances its theoretical foundations, reliability, and applicability. It characterizes Bayes-optimal routing policies, establishes surrogate-consistency guarantees, and introduces a unified adversarial framework for attacking and defending L2D with Bayes-optimal robustness. It further proposes the first top-k deferral methods in both two-stage and one-stage settings. Empirical studies validate these ideas in multi-task learning and extractive question answering with large language models. Ongoing work explores token-level routing in LLMs, online adaptation with dynamic experts, and partial deferral.

AAAI 2026

Towards Robust Human–AI Decision-Making via Learning-to-Defer

learning to defer

selective prediction

human-in-the-loop machine learning

learning theory

technical paper

We are pleased to announce the Fortieth AAAI Conference on Artificial Intelligence (AAAI-26), which will be held at Singapore EXPO from January 20 to January 27, 2026.

The purpose of the AAAI conference series is to promote research in Artificial Intelligence (AI) and foster scientific exchange between researchers, practitioners, scientists, students, and engineers across the entirety of AI and its affiliated disciplines. AAAI-26 will feature technical paper presentations, special tracks, invited speakers, workshops, tutorials, poster sessions, senior member presentations, competitions, exhibit programs, and a range of other activities to be announced.



While Reinforcement Learning (RL) has demonstrated remarkable success in solving complex sequential decision-making problems, its application in real-world, safety-critical systems is hindered by its reliance on carefully engineered reward functions. Designing effective rewards is notoriously challenging and can lead to unintended or unsafe behaviors — a phenomenon known as reward hacking. Specification-guided RL has emerged as a principled alternative, leveraging formal methods to directly encode high-level objectives, safety requirements, and behavioral constraints. However, the practical utility of this approach is often limited by coarse or under-specified logical formulas and the computational challenge of enforcing safety at scale. This thesis addresses these limitations by developing a unified framework for the automated refinement, scalable enforcement, and flexible adaptation of formal specifications in RL.

Specification-Guided Reinforcement Learning

Molecular conformations, the stable three-dimensional structures corresponding to local minima on the potential energy surface, govern key molecular properties and consequently underpin a wide range of downstream tasks. However, contemporary learning-based methods often lack scalability, interpretability, and robustness, thereby significantly constraining their practical effectiveness and reliability. In this context, I will introduce my ongoing explorations and the proposed research plan to address these challenges, with the ultimate objective of developing conformation‑centric universal foundation models to accelerate scientific discovery.

Improve Molecular Conformation Modeling with Geometric Deep Learning

Deep learning models offer state-of-the-art performance but
their inherent opacity is a major barrier to adoption in
high-stakes domains. In contrast, Takagi-Sugeno-Kang (TSK)
fuzzy systems provide rule-based transparency but often
lack the predictive power of deep networks. My PhD research
addresses this critical trade-off by developing the
Fuzzy-Modulated Linear Consequents (FMLC) framework, a
novel hybrid architecture that synergizes these two
paradigms. The core of FMLC is a deep neural network that
processes fuzzified input features to generate
context-dependent "modulators". These modulators
dynamically parameterize a TSK-style linear consequent
layer, creating a model that is both highly performant and
inherently interpretable. My latest work, Learnable-FMLC
(L-FMLC), advances this by introducing a regularized,
adaptive fuzzification layer that autonomously learns the
optimal fuzzy partitions from data, and a two-stage rule
distillation framework to ensure interpretability remains
scalable in high-dimensional problems. This research
delivers a validated, theoretically-grounded, and scalable
framework, contributing a significant step towards
transparent and trustworthy AI.
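
The data flow described above (fuzzify the inputs, derive modulators, let the modulators set the coefficients of a linear consequent) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the FMLC work: the Gaussian membership functions, the single tanh layer standing in for the deep network, the multiplicative form `a * (1 + g)` for applying a modulator, and all function and parameter names.

```python
import math

def gaussian_membership(x, center, width):
    # Degree to which x belongs to a fuzzy set centred at `center`.
    return math.exp(-0.5 * ((x - center) / width) ** 2)

def fuzzify(features, centers, width):
    # One membership degree per (feature, fuzzy set) pair, flattened.
    return [gaussian_membership(x, c, width) for x in features for c in centers]

def modulator_network(memberships, weights, biases):
    # Hypothetical stand-in for the deep network: one tanh layer that
    # maps membership degrees to one modulator per input feature.
    return [math.tanh(sum(w * m for w, m in zip(row, memberships)) + b)
            for row, b in zip(weights, biases)]

def modulated_linear_predict(features, centers, width, weights, biases,
                             base_coeffs, base_bias):
    memberships = fuzzify(features, centers, width)
    modulators = modulator_network(memberships, weights, biases)
    # Assumed modulation rule: each modulator rescales one base coefficient.
    coeffs = [a * (1.0 + g) for a, g in zip(base_coeffs, modulators)]
    output = base_bias + sum(c * x for c, x in zip(coeffs, features))
    # Returning the effective coefficients exposes the per-input linear rule.
    return output, coeffs

# Usage: with all-zero network weights the modulators vanish, so the
# model reduces to the plain linear consequent.
features = [1.0, 2.0]
centers = [-1.0, 0.0, 1.0]
zero_weights = [[0.0] * 6, [0.0] * 6]
output, coeffs = modulated_linear_predict(
    features, centers, 1.0, zero_weights, [0.0, 0.0], [0.5, -1.0], 0.1)
```

Because the prediction is a linear function of the inputs once the coefficients are fixed, the returned `coeffs` can be read as the rule applied to that particular input, which is the kind of per-input transparency the abstract points to.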

Fusing Deep Learning and Fuzzy Logic: A Framework for Adaptive and Scalable Interpretability

Autonomous driving must cope with motion blur, low light,
and dynamic agents, where RGB frames and event cameras
offer complementary strengths. This thesis investigates how
to fuse them across the perception–reasoning–planning
pipeline. It introduces FlexEvent, a frequency-robust
detector with adaptive fusion and label-efficient training;
Talk2Event, the first benchmark for event–language
grounding with attribute-aware modeling; and the ongoing
EventChat, an event–frame VLM for perception, spatial
relations, and ego reasoning. Future work will extend this
framework with iterative perception and reinforcement
learning for long-horizon decision making. Together, these
efforts aim to deliver robust perception, interpretable
reasoning, and planning support through event–frame fusion.

Towards Robust and Interpretable Event–Frame Fusion for Autonomous Driving

Higher autonomy is an increasingly common goal in the
design of transportation systems for the cities of the
future. Recently, part of this autonomy in both rail and
maritime transport has come from the field of artificial
intelligence and machine learning, particularly for
perception tasks (detection and recognition of rail
signals, other vessels, or other elements in the vehicle
environment) using neural networks. Although AI-based
approaches have gained significant popularity in many
application fields due to their good performance, their
unpredictability and lack of formal guarantees regarding
their desired behavior present a major issue for the
deployment of such safety-critical systems in urban areas.
The goal of my PhD thesis is to design new formal methods
to analyze and ensure the safety of such AI-based
perception modules in autonomous vehicles. More
specifically, my PhD topic aims to formally evaluate the
safety of neural ODEs, a recently introduced class of
continuous AI models.

Formal Verification of Neural ODE for Safety Evaluation in Autonomous Vehicles

Causal discovery is the task of learning a causal model from a source of information. Traditionally, the community has focused on algorithms that infer causal models from observational and/or interventional data, while alternative approaches have been only marginally explored. The proposed work aims to contribute to the theoretical foundations connecting agent-based systems with causal modeling, and to identify conditions under which newly developed causal discovery algorithms can be applied to elicit causal knowledge from agents.

Eliciting Causal Knowledge from Agents

Learning from human feedback enables AI systems and robots
to learn policies that align with human intent. While
existing work has primarily examined learning from
demonstrations, corrections, and preferences in
single-agent settings, these ideas have yet to be fully
extended to multi-agent domains—where cooperation,
decentralization, and non-stationary dynamics demand new
methods. In this thesis summary, I highlight my current
work and outline future directions for multi-robot learning
from human feedback, offering deployment strategies that
align supervisor intent with robot teams in the real world.

Multi-Robot Learning from Human Feedback

Model development in AI is shaped by developer decisions.
While there is significant research on the opportunities
and risks of multiplicity, little attention has been paid
to how developer decisions impact multiplicity. My thesis
focuses on (a) introducing broader frameworks to better
situate and analyze developer decisions in AI, (b)
identifying theoretical connections to characterize the
influence of these decisions on multiplicity, and (c)
operationalizing these insights across various
applications, thus building responsible AI models with
multiplicity.

From Decisions to Multiplicity: Frameworks, Theories, and Applications

Explainable AI (XAI) seeks to answer the question: which
features of the data led a model to make its decision?
Existing approaches are either model-agnostic (e.g., LIME,
SHAP), which are flexible but unstable, or logic-based
(e.g., sufficient reasons, knowledge compilation), which
are principled but often overly complex. This work
introduces a probabilistic relaxation of sufficient
reasons, termed probabilistic sufficient reasons, which
balances flexibility with theoretical guarantees. We
analyze its computational properties, propose tractable
subclasses, and outline future directions for scalable
algorithms and applications.
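
As a rough illustration of the local case, one common way to formalize the relaxation is to call a feature subset sufficient at level `delta` when fixing those features to the instance's values preserves the model's prediction with probability at least `delta` over the remaining features. The sketch below assumes binary features, a uniform distribution over the free features, and brute-force enumeration; these choices and all names are illustrative assumptions, not the definitions or algorithms of the work itself.

```python
import itertools

def sufficiency_probability(model, instance, fixed):
    # Probability, under a uniform distribution over the free binary
    # features, that the prediction matches model(instance) when the
    # features indexed by `fixed` keep the instance's values.
    target = model(instance)
    free = [i for i in range(len(instance)) if i not in fixed]
    agree = 0
    total = 0
    for values in itertools.product([0, 1], repeat=len(free)):
        completion = list(instance)
        for i, v in zip(free, values):
            completion[i] = v
        agree += model(completion) == target
        total += 1
    return agree / total

def is_probabilistic_sufficient(model, instance, fixed, delta):
    # delta = 1.0 recovers the classical (exact) sufficient reason.
    return sufficiency_probability(model, instance, fixed) >= delta

# Usage: a hypothetical majority-vote classifier over three bits.
def majority(z):
    return int(sum(z) >= 2)

instance = [1, 1, 0]
p_one_fixed = sufficiency_probability(majority, instance, {0})
p_two_fixed = sufficiency_probability(majority, instance, {0, 1})
```

Enumeration is exponential in the number of free features, which is one way to see why computational properties and tractable subclasses matter for this notion.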

On the Computational Tractability of Probabilistic Global and Local Sufficient Explanation

Achieving globally desirable outcomes in networked
multi-agent systems—such as high social welfare, stable
allocations, and widespread cooperation—is a fundamental
challenge in AI. This paper outlines a research agenda that
explores two complementary pathways to this goal. The first
is a top-down approach, where a central mechanism designer
proposes rules to guide strategic agents towards
theoretically optimal equilibria. The second is a bottom-up
approach, where desirable farsighted policies, like
cooperation in social dilemmas, emerge from the
decentralized interactions of agents via multi-agent
reinforcement learning. We argue that the integration of
these paths constitutes a promising frontier for creating
robust and adaptive multi-agent systems.

