The rapid advancement of generative models (Roy et al. 2023; Pal et al. 2024; Roy et al. 2024) has opened new avenues for addressing critical challenges in computer vision (Dhar et al. 2021; Fazlyab et al. 2023), such as data scarcity, image quality enhancement, and personalization. Recent progress has concentrated on improving the adaptability, efficiency, and quality of these models to meet the growing demand for parameter-efficient fine-tuning and adaptation of large vision-language and generative models (Roy et al. 2025b; Pramanick, Roy, and Patel 2022). In this work, we begin by tackling the challenges of resource-constrained learning (Roy et al. 2022). We then leverage powerful vision-language models to address these issues in a parameter-efficient manner. Additionally, we aim to enhance state-of-the-art generative models, specifically diffusion models, by incorporating natural image priors (Roy et al. 2023). We also explore joint concept merging through the lens of low-rank adapter merging, applying it to content-style personalization. Finally, we address the challenge of zero-shot personalization of any object without requiring additional training. We conclude by devising a frequency-guided method for training-free multi-LoRA composition, which is better suited for deployment on edge devices.
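As a generic illustration of low-rank adapter (LoRA) merging, and not the specific method proposed in this work, two adapters trained for different concepts (e.g., content and style) can be combined by a weighted sum of their low-rank weight updates. All dimensions, variable names, and the blending coefficient below are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 16, 4  # hypothetical feature dimension and LoRA rank

# Two LoRA adapters, each parameterizing a rank-r weight update B @ A,
# trained for different concepts (e.g., content vs. style).
A_content, B_content = rng.normal(size=(r, d)), rng.normal(size=(d, r))
A_style, B_style = rng.normal(size=(r, d)), rng.normal(size=(d, r))

def lora_delta(B, A, scale=1.0):
    """Weight update contributed by one adapter: scale * B @ A."""
    return scale * B @ A

# Naive merge: a convex combination of the two adapters' updates.
alpha = 0.6  # hypothetical blending coefficient
delta_merged = (alpha * lora_delta(B_content, A_content)
                + (1 - alpha) * lora_delta(B_style, A_style))

# Apply the merged update to a (hypothetical) base layer weight.
W_base = rng.normal(size=(d, d))
W_merged = W_base + delta_merged
print(W_merged.shape)  # (16, 16)
```

Note that the merged update has rank at most 2r, so the combination stays low-rank; more sophisticated schemes (such as the frequency-guided composition mentioned above) replace the fixed scalar blend with structure-aware weighting.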
