Singapore

Humans easily apply learned skills to different situations,
a flexibility that AI systems still struggle to achieve.
Current AI models are often confined to their training
setup, leading to isolated developments and a narrow scope
of application. This largely restricts the creation of
flexible and general-purpose AI systems. Deep Model Reuse
presents a novel solution. Imagine tapping into a vast
library of pre-trained models, each a master in its
specialized domain. Our approach re-purposes these existing
models, extracting and transforming their knowledge for the
development of novel AI systems. In this talk, we explore
the essential techniques of this transformative process,
highlighting the shift towards versatile and efficient AI
that mirrors human cognition's adaptability.

We introduce three foundational pillars of deep model
reuse: understanding, composing, and refining. First, we
investigate the internal behavior of neural networks—using
language models as explainers and analyzing the
representation space of diffusion models—to uncover how and
what models have learned. Second, we develop methods to
transform and compose models through weight mapping,
knowledge distillation, and model dissection, enabling the
creation of new capabilities by reassembling existing
expertise. Third, we enhance reliability by editing model
behaviors and mitigating biases, ensuring robustness in
complex and dynamic environments.
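
As a rough illustration of one technique named above, knowledge distillation, the following is a minimal sketch of the commonly used temperature-softened distillation loss. It is not the speaker's implementation: the function names `softmax` and `distillation_loss` and the default temperature are assumptions chosen for this example.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax, shifted by the max for numerical stability."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    The T^2 factor is the usual convention that keeps gradient magnitudes
    comparable across temperatures; the default T=2.0 is an assumption.
    """
    p = softmax(teacher_logits, temperature)  # soft targets from the teacher
    q = softmax(student_logits, temperature)  # student predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)
    return (temperature ** 2) * kl

# A student that matches the teacher incurs (numerically) zero loss,
# while a student that disagrees incurs a positive loss.
same = distillation_loss([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
different = distillation_loss([3.0, 2.0, 1.0], [1.0, 2.0, 3.0])
```

In practice this term is usually combined with an ordinary supervised loss on ground-truth labels; the weighting between the two is a design choice not specified in the abstract.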

We demonstrate the power of this paradigm in generative AI,
where model reuse leads to efficient diffusion models free
from spectral bias, improved compositional understanding in
video generation, and the repurposing of 2D/3D models for
3D/4D content creation. By shifting from training from
scratch to intelligently reusing and recombining models, we
move closer to adaptive, scalable, and human-like AI
systems—ushering in a new era of sustainable and general
intelligence.

AAAI 2026

Deep Model Reuse: Paving the Way for Efficient and Generalizable AI Systems

deep model reuse

diffusion models

finetuning

transfer learning

technical paper

We are pleased to announce the Fortieth AAAI Conference on Artificial Intelligence (AAAI-26), which will be held in Singapore EXPO from January 20 to January 27, 2026.

The purpose of the AAAI conference series is to promote research in Artificial Intelligence (AI) and foster scientific exchange between researchers, practitioners, scientists, students, and engineers across the entirety of AI and its affiliated disciplines. AAAI-26 will feature technical paper presentations, special tracks, invited speakers, workshops, tutorials, poster sessions, senior member presentations, competitions, exhibit programs, and a range of other activities to be announced.


Indian poetry, known for its linguistic complexity and deep cultural resonance, has a rich and varied heritage spanning thousands of years. However, its layered meanings, cultural allusions, and sophisticated grammatical constructions often pose challenges for comprehension, especially for non-native speakers or readers unfamiliar with its context and language. Despite its cultural significance, existing works on poetry have largely overlooked Indian language poems. In this paper, we propose the **Translation and Image Generation (TAI)** framework, leveraging Large Language Models (LLMs) and Latent Diffusion Models through appropriate prompt tuning. Our framework supports the United Nations Sustainable Development Goals of Quality Education (SDG 4) and Reduced Inequalities (SDG 10), by enhancing the accessibility of culturally rich Indian-language poetry to a global audience. It includes (1) a translation module that uses an Odds Ratio Preference Alignment Algorithm to accurately translate morphologically rich poetry into English; (2) an image generation module that employs a semantic graph to capture tokens, dependencies, and semantic relationships between metaphors and their meanings, to create visually meaningful representations of Indian poems. Our comprehensive experimental evaluation, including both human and quantitative assessments, demonstrates the superiority of *TAI* Diffusion in poem image generation tasks, outperforming strong baselines. To further address the scarcity of resources for Indian-language poetry, we introduce the **Morphologically Rich Indian Language Poems *MorphoVerse* Dataset**, comprising 1,570 poems across 21 low-resource Indian languages. By addressing the gap in poetry translation and visual comprehension, this work aims to broaden accessibility and enrich the reader’s experience.

Crossing Borders: A Multimodal Challenge for Indian Poetry Translation and Image Generation

Multi-view representation learning, which utilizes multiple channels to improve perceptual accuracy, is recognized for its effectiveness in the analysis of multi-view data. However, deploying these methods in real-world scenarios presents two primary challenges. 1) Lack of Variegation: Multi-view representation techniques commonly observe along a singular axis, i.e., the attribute axis; 2) Insufficient Relationship: Most multi-view feature models lack mechanisms for exploring potential relationships between attribute axis and channel axis. To mitigate these obstacles, we design a Dual Impulse Network framework for multi-view learning (DIN) to train a feature representation. In this framework, a strategy observed along the channel axis and attribute axis simultaneously is introduced, and two different representations are generated by two analogous impulse networks, which are capable of extracting information corresponding to different axes. Furthermore, we incorporate an integration network that analyzes the potential relationship between attribute axis and channel axis to generate two attention matrices. The final two feature representations derived from these attention matrices are aggregated to amplify the expression of internal information. Comprehensive experimental results support the efficacy and superiority of the proposed framework, demonstrating improvements in classification performance compared to state-of-the-art methods.

DIN: Dual Impulse Network for Multi-view Representation Learning

The continuous advancements in life science technology have enabled spatial transcriptome technology to achieve an impressive level of resolution at the single-cell level. This technology has emerged as a crucial method for studying the cellular composition and differentiation states of tissues, investigating cell-cell interactions, and unraveling the molecular mechanisms underlying diseases and developmental processes. A key component in this analysis is the accurate segmentation of cells. However, existing segmentation methods often fail to fully leverage the valuable information provided by spatial transcriptomics, leading to inaccurate cell segmentation. In this study, we introduce SSL-CST, a self-supervised cell segmentation method for single-cell spatial transcriptomics. SSL-CST employs a pre-trained model for foundational contour segmentation. Following the denoising process, it uses a self-supervised neural network to refine the cell boundaries. Through this approach, SSL-CST outperforms other state-of-the-art methods in various tests conducted on multiple datasets. The improved segmentation provided by SSL-CST further enhances the analysis of single-cell spatial expression, providing effective tools for biological discovery.

SSL-CST: Cell Segmentation for Single-Cell Spatial Transcriptome Based on Self-Supervised Learning

Optimizing patent claims is a critical yet challenging task, demanding careful balance between maximizing novelty and preserving legal scope. Manual claim drafting is labor-intensive, costly, and inherently inconsistent, while conventional Large Language Models (LLMs) often lack the structured, iterative reasoning essential for precise claim refinement. To address these challenges, we introduce Tree of Claims (ToC), an innovative framework that redefines claim editing as a guided search problem. ToC synergistically integrates Monte Carlo Tree Search (MCTS) with a collaborative multi-agent system, comprising an LLM-based EditorAgent that proposes contextually grounded edits, and an ExaminerAgent that mimics patent examiner critiques through structured, chain-of-thought analyses of novelty and prior art disclosure. Driven by a carefully designed multi-objective reward function, ToC jointly optimizes novelty, scope retention, and semantic coherence. Experimental evaluation on a benchmark of 1145 claims demonstrates that ToC significantly outperforms standard LLMs in zero-shot and few-shot scenarios, achieving an average composite score improvement of 8%, and up to 9% in certain cases. Extensive experiments, including detailed ablation studies, validate ToC’s efficacy in generating superior, legally robust claim revisions. Overall, ToC establishes a transparent, controllable, and interpretable methodology that effectively bridges advanced LLM reasoning capabilities with strategic MCTS planning for structured patent claim optimization.

ToC: Tree-of-Claims Search with Multi-Agent Language Models
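
The abstract above does not state which selection rule ToC uses inside its Monte Carlo Tree Search. As a hedged illustration only, the sketch below shows UCB1, the selection rule used in the standard UCT variant of MCTS. The names `ucb1` and `select_child`, the tuple layout, and the exploration constant are assumptions made for this example.

```python
import math

def ucb1(total_reward, visits, parent_visits, c=math.sqrt(2)):
    """UCB1 score: mean reward plus an exploration bonus.

    Unvisited children get an infinite score so each is tried at least once.
    The constant c = sqrt(2) is a conventional default, assumed here.
    """
    if visits == 0:
        return float("inf")
    mean = total_reward / visits
    bonus = c * math.sqrt(math.log(parent_visits) / visits)
    return mean + bonus

def select_child(children, parent_visits):
    """Pick the child with the highest UCB1 score.

    children: list of (name, total_reward, visits) tuples (hypothetical layout).
    """
    return max(children, key=lambda ch: ucb1(ch[1], ch[2], parent_visits))[0]

# An unvisited edit is explored before a visited one.
first = select_child([("edit_a", 5.0, 10), ("edit_b", 0.0, 0)], parent_visits=10)
# With equal visit counts, the higher mean reward wins.
second = select_child([("edit_a", 9.0, 10), ("edit_b", 1.0, 10)], parent_visits=20)
```

In a claim-editing setting, the reward accumulated at each node would come from whatever composite score the system defines; here it is treated as an opaque number.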

Anti-money laundering (AML) detection is of vital importance in financial risk control. Although Graph Neural Networks (GNN) have yielded promising results, existing motif-based approaches primarily focus on node anomaly detection on simple graphs, which hinders the direct identification of anomalous edges in directed temporal transaction networks. Moreover, consecutive transaction relationships, termed dual-edge motifs, have rarely been considered in previous AML studies. To address these gaps, we propose the D-EMAML framework, which consists of: (1) Fast-Motif-Gen, a GPU-accelerated dual-edge motif graph generator with pruning; (2) D-EMGNN, an attention-enhanced heterogeneous GNN module that reduces motif-type information redundancy; (3) MELP, a label aggregation scheme projecting predictions from the motif graph to the original graph. Extensive experiments on real-world and synthetic datasets demonstrate significant improvements over representative baselines and validate the contribution of each component. To our knowledge, this is the first application of dual-edge motif graphs for GNN-based edge anomaly detection in AML.

Beyond Single Transactions: D-EMAML---Dual-Edge Motif Neural Networks for Enhanced Anti-Money Laundering Detection
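
To make the idea of consecutive transaction relationships concrete, here is a minimal sketch that enumerates pairs of directed, timestamped edges where the second transaction leaves the account that the first one entered. This is an assumed reading of "dual-edge motif" for illustration, not the paper's Fast-Motif-Gen. The function name `dual_edge_motifs`, the `(source, destination, time)` tuple layout, and the `max_gap` time window are all hypothetical.

```python
from collections import defaultdict

def dual_edge_motifs(transactions, max_gap):
    """Return index pairs (i, j) of consecutive transactions.

    Edge j must start at the node where edge i ends, and must occur
    strictly after edge i and at most max_gap time units later.
    transactions: list of (source, destination, time) tuples.
    """
    outgoing = defaultdict(list)  # source node -> [(time, edge index)]
    for idx, (src, _dst, t) in enumerate(transactions):
        outgoing[src].append((t, idx))

    pairs = []
    for i, (_src, dst, t1) in enumerate(transactions):
        for t2, j in outgoing[dst]:
            if t1 < t2 <= t1 + max_gap:
                pairs.append((i, j))
    return pairs

# Toy transaction log: A pays B, B pays C shortly after, and so on.
transactions = [
    ("A", "B", 1),
    ("B", "C", 2),
    ("B", "D", 10),
    ("C", "A", 3),
]
motifs = dual_edge_motifs(transactions, max_gap=5)
```

Each returned pair could then become a node in a motif graph, with edge-level labels projected back to the original transactions, which is the role the abstract assigns to the label aggregation step.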

This paper presents a novel method, called Deformable
Polygonal Flow Matching (DPFM), for the generation of
polygonal arrangements such as jigsaw puzzles and floor
plans. DPFM is a Flow Matching framework that enables
the generation process to deform, rotate, and translate
polygons while decoupling these transformations, so that
each can be toggled individually. By combining the spatial
reasoning capabilities of arrangement models with the
flexibility of position-based models, it covers a wide range
of applications within a unified formulation, from noiseless
puzzle solving using rigid alignments to unconstrained floor
plan generation. We represent data using a hierarchical graph
composed of a topological subgraph encoding connectivity
information and semantics (such as room types for floor
plans), and a geometrical subgraph encoding the 1D polygonal
loop of each shape. DPFM also leverages Flow Matching’s
support for arbitrary prior distributions to impose geometric
constraints, by designing priors with domain knowledge.
Rather than starting the generation process from uninformed
distributions, the generation is constrained by the informed
priors from the initialization stage onward. The qualitative
and quantitative evaluations of our method, run on the RPLAN
and jigsaw puzzle datasets, demonstrate strong performance.
DPFM outperforms task-specific methods, becoming the new
state of the art for 2D arrangement generation. As our
results show, DPFM is able to solve novel tasks such as
puzzle denoising, where pieces are reconstructed from noisy
versions and arranged into a valid puzzle in parallel.

Deformable Polygonal Flow Matching with Informed Priors and Hierarchical Graph Constraints

Tandem mass spectrometry (MS/MS) is a critical tool for identifying molecular structures. By efficiently separating molecular fragments based on their mass-to-charge (m/z) ratios, it facilitates molecular generation and subsequent scientific discoveries. However, de novo molecular generation from MS/MS spectra remains fundamentally constrained by two challenges: the vast chemical space requires effective structural constraints, and the absence of fine-grained substructural generation weakens the correspondences between spectral features and molecular structures. In this work, we propose MSAnchor, a novel two-stage framework for MS/MS-based molecular structure generation. We mitigate the search space challenge through the introduction of the Anchor-Extended Molecular Scaffold (AEMS) representation, which explicitly encodes side-chain anchoring points, thereby dramatically reducing combinatorial complexity. Leveraging the explicit attachment sites provided by AEMS, we develop anchor-specific priors that establish effective alignments between spectral features and molecular substructures. This fine-grained substructural correspondence is further enhanced by a modified Conditional Information Bottleneck (CIB) module that extracts the most informative spectral components in a structure-aware manner. These innovations enable MSAnchor to generate molecular structures that closely reflect spectral characteristics while constraining combinatorial complexity. Extensive experiments on the CANOPUS and MassSpecGym datasets demonstrate that MSAnchor achieves state-of-the-art performance in molecular structure prediction from MS/MS spectra, with improvements that are particularly pronounced for molecules with higher complexity.

MSAnchor: De Novo Molecular Generation from Mass Spectrometry Data with Anchor-Extended Molecular Scaffolds

Biological intelligence has driven significant progress in
artificial intelligence (AI), but a critical gap remains:
biological systems inherit innate abilities from genes,
with brains initialized by blueprints refined over 3.5
billion years of evolution, while machines rely heavily on
inefficient, data-driven learning from scratch. This gap
arises from the lack of a genetic mechanism in machines to
transfer and accumulate inheritable knowledge across
generations. To bridge this gap, we propose learngenes,
network fragments that act as inheritable “genes” for
machines. Unlike conventional knowledge transfer methods,
learngenes enable efficient and universal knowledge
transfer by selectively encapsulating task-agnostic
knowledge. To facilitate the transfer and accumulation of
task-agnostic knowledge across generations, we introduce
Genetic Reinforcement Learning (GRL), a framework that
simulates the learning and evolution of organisms in
intelligent agents following Lamarckian principles. Through
GRL, we identify learngenes as network fragments within
agents' policy networks, equipping newborn agents with
innate abilities for rapid adaptation to novel tasks. We
demonstrate the advantages of learngene-based knowledge
transfer over evolution-based search and traditional
pre-trained models, and show how learngenes evolve through
the accumulation of task-agnostic knowledge. Overall, this
work establishes a novel paradigm for knowledge transfer
and model initialization in AI, offering new possibilities
for more adaptive, efficient, and scalable learning systems.

Learngene: Inheritable “Genes” in Intelligent Agents

The reliable deployment of reinforcement learning (RL) for real-world algorithmic trading is critically hindered by the "simulation-to-reality gap." Standard industry backtesting on static historical data ignores market impact (the feedback loop where an agent's trades influence price dynamics), leading to strategies that are fragile and untrustworthy in live markets. To solve this significant problem, we present a novel and emerging application of AI: a framework for building an interactive, responsive market simulator. Our system first uses imitation learning (IL) to automatically train an ensemble of agents, each learning a distinct trading strategy from a different historical market regime (e.g., bull, bear). This creates a data-driven proxy for a diverse population of real-world traders. We then deploy an innovative Action Synthesis Network to synthesize the actions of this ensemble, generating a realistic, synthetic price trajectory that endogenously models the market's reaction to trades. This interactive environment is then used to train a final RL policy. We evaluate our system on NASDAQ-100 (QQQ) data, and the results demonstrate strong potential for deployment. The RL policy trained in our responsive simulator achieves significantly more robust performance, exhibiting superior downside protection during market downturns compared to various traditional baselines. This application provides a scalable and technically sound methodology for building more realistic training environments, presenting a clear path toward the development and eventual deployment of more resilient and effective algorithmic trading strategies.

An Interactive Simulation Framework by Ensemble Imitation Learning Agents for Training Robust Trading Policies

We study a path planning problem where the possible move actions are represented as a finite set of motion primitives aligned with the grid representation of the environment. That is, each primitive corresponds to a short kinodynamically-feasible motion of an agent and is represented as a sequence of the swept cells of a grid. Typically, heuristic search, e.g., A*, is conducted over the lattice induced by these primitives (lattice-based planning) to find a path. However, due to the large branching factor, such search may be inefficient in practice. To this end, we suggest a novel technique rooted in the idea of searching over the grid cells (as in vanilla A*) while simultaneously fitting the possible sequences of the motion primitives into these cells. The resultant algorithm, MeshA*, provably preserves the guarantees on completeness and optimality and is shown to notably outperform conventional lattice-based planning, with a 1.5x to 2x reduction in runtime.
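
For readers unfamiliar with the baseline the abstract refers to, the following is a minimal sketch of vanilla A* over grid cells. It is not MeshA* and does not model motion primitives. The function name `astar`, the 4-connected neighborhood, unit step costs, and the Manhattan heuristic are assumptions made for this illustration.

```python
import heapq

def astar(grid, start, goal):
    """Length of a shortest 4-connected path, or None if no path exists.

    grid: list of rows, where 0 marks a free cell and 1 a blocked cell.
    start, goal: (row, col) tuples, assumed to be free cells.
    """
    rows, cols = len(grid), len(grid[0])

    def h(cell):
        # Manhattan distance: admissible for unit-cost 4-connected moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    best = {start: 0}                 # cheapest known cost to each cell
    frontier = [(h(start), 0, start)]  # entries are (f = g + h, g, cell)
    while frontier:
        _f, g, cell = heapq.heappop(frontier)
        if cell == goal:
            return g
        if g > best[cell]:
            continue  # stale queue entry; a cheaper route was found later
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                ng = g + 1
                if ng < best.get((nr, nc), float("inf")):
                    best[(nr, nc)] = ng
                    heapq.heappush(frontier, (ng + h((nr, nc)), ng, (nr, nc)))
    return None

# A wall forces a detour around the right-hand side of the grid.
grid = [
    [0, 0, 0],
    [1, 1, 0],
    [0, 0, 0],
]
path_length = astar(grid, (0, 0), (2, 0))
```

Lattice-based planning would instead expand whole motion primitives from each state, which is where the large branching factor mentioned in the abstract comes from.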
