Software reliability remains a fundamental challenge in modern software engineering. The rapid rise of AI-assisted coding has dramatically improved development productivity but introduces a critical problem: AI-generated code cannot be inherently trusted. While AI coding tools accelerate development, they may produce code with subtle bugs, security vulnerabilities, and legal compliance issues, amplifying an already costly maintenance burden where developers spend the majority of their time fixing bugs. This unreliability of AI-generated code poses systemic risks to software quality, security, and compliance.
Addressing this challenge requires a two-pronged approach. First, we need reliable detection methods to identify AI-generated code, enabling targeted quality reviews and ensuring compliance with licensing requirements. Second, we need effective Automated Program Repair (APR) techniques to automatically localize faults and synthesize patches, reducing the growing burden of bug fixing in rapidly produced code. However, progress in both areas has been constrained by a critical limitation: the lack of comprehensive datasets and benchmarks, particularly for C and C++, which underpin most safety-critical systems. Moreover, current repair approaches, including recent large language models (LLMs), lack the semantic reasoning abilities necessary for complex bug-fixing tasks, often relying on pattern matching rather than genuine program understanding.
This thesis addresses these interconnected challenges through four major contributions. First, we conduct the first comprehensive study of AI-generated code detection, evaluating thirteen detectors on over two million samples of code and natural language, and propose fine-tuning–based approaches that substantially improve detection accuracy. Second, we construct and release Defects4C, the first large-scale, executable benchmark for C/C++ bugs, curated from millions of real-world commits and designed to enable reproducible evaluation for bug detection and repair. Third, we propose a dual deep learning–based APR framework, integrating BiLSTM-based fault localization with a retrieval-augmented transformer for patch generation, and conduct the first large-scale evaluation of LLM-based APR on C/C++, revealing significant performance gaps compared to Java benchmarks and highlighting the limitations of current models in semantic understanding and code reasoning. Finally, we design a semantic-enhancement framework for LLMs, incorporating dynamic semantic signals such as code execution traces into training and inference, and demonstrate improvements in program repair and general code generation.
These contributions advance the foundations of trustworthy, semantically grounded automated program repair, providing new datasets, empirical insights, and methodological innovations that will guide the future development of reliable AI-driven software engineering.
