We present a language-based noise modulation module for diffusion models that improves image color generation under textual guidance. Unlike standard approaches that inject noise uniformly, our method leverages semantic cues from the text to selectively control the noise injection process, preserving local details and enhancing color accuracy even when descriptions are ambiguous or incomplete. Applied to language-guided image colorization, this targeted modulation yields more faithful and visually consistent results. The proposed module is lightweight, generalizable, and can be integrated into existing diffusion pipelines, offering a simple yet effective step toward more controllable text-to-image generation.
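To make the idea of selective noise injection concrete, here is a minimal NumPy sketch of one plausible form of text-conditioned noise modulation: per-pixel text-image similarity is turned into a soft mask that attenuates the Gaussian noise in text-relevant regions. All names (`modulate_noise`, the feature shapes, the sigmoid temperature) are illustrative assumptions, not the paper's actual module or API.

```python
import numpy as np

# Hypothetical sketch of text-conditioned noise modulation.
# Names and shapes are illustrative, not the paper's actual design.
def modulate_noise(image_feats, text_embeds, base_noise, strength=0.5):
    """Scale diffusion noise per pixel by text-image similarity.

    image_feats: (H, W, D) spatial feature map
    text_embeds: (T, D) token embeddings from a text encoder
    base_noise:  (H, W) standard Gaussian noise

    Regions most similar to the text receive attenuated noise,
    preserving local detail where the description applies.
    """
    # Cosine similarity between each pixel feature and each text token
    f = image_feats / (np.linalg.norm(image_feats, axis=-1, keepdims=True) + 1e-8)
    t = text_embeds / (np.linalg.norm(text_embeds, axis=-1, keepdims=True) + 1e-8)
    sim = np.einsum('hwd,td->hwt', f, t).max(axis=-1)  # (H, W), best-matching token

    # Soft relevance mask in [0, 1] (sigmoid with an assumed temperature)
    mask = 1.0 / (1.0 + np.exp(-sim / 0.1))

    # Attenuate noise where text relevance is high; leave it near-uniform elsewhere
    scale = 1.0 - strength * mask
    return base_noise * scale

# Toy usage with random features standing in for encoder outputs
rng = np.random.default_rng(0)
H, W, D, T = 8, 8, 16, 4
feats = rng.standard_normal((H, W, D))
text = rng.standard_normal((T, D))
noise = rng.standard_normal((H, W))
mod = modulate_noise(feats, text, noise)
print(mod.shape)  # (8, 8)
```

In a real diffusion pipeline the modulated noise would replace the uniform noise term at each sampling step; the sketch only shows how a semantic mask can gate the injection spatially.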