Singapore

Music-to-dance generation aims to synthesize human dance motion conditioned on music input. Despite recent progress, significant challenges remain due to the semantic gap between music and dance motion: music offers only abstract cues and lacks explicit descriptions of physical movement. The challenge is further amplified by the scarcity of paired music and dance data, which restricts the model’s ability to learn diverse dance patterns. These limitations highlight the need for additional semantic guidance beyond the musical signal. In this paper, we propose DanceChat, a novel framework that leverages a Large Language Model (LLM) as a choreographer to generate high-level textual instructions from structured music descriptions. These instructions serve as semantic guidance to bridge the gap between music and motion. DanceChat integrates music, beat, and text features into a unified representation, and employs a diffusion-based motion generator trained with a proposed multi-modal alignment loss. Extensive experiments on the AIST++ dataset show that DanceChat outperforms state-of-the-art methods both qualitatively and quantitatively.

AAAI 2026

DanceChat: Large Language Model-Guided Music-to-Dance Generation

workshop paper

We are pleased to announce the Fortieth AAAI Conference on Artificial Intelligence (AAAI-26), which will be held at Singapore EXPO from January 20 to January 27, 2026.

The purpose of the AAAI conference series is to promote research in Artificial Intelligence (AI) and foster scientific exchange between researchers, practitioners, scientists, students, and engineers across the entirety of AI and its affiliated disciplines. AAAI-26 will feature technical paper presentations, special tracks, invited speakers, workshops, tutorials, poster sessions, senior member presentations, competitions, exhibit programs, and a range of other activities to be announced.




MOVE-ME is a wearable AI system that functions as a choreographic companion, designed to interrupt and inspire dancers’ improvisational flow in real time. Equipped with an on-body camera and speech synthesis, the system observes the dancer’s environment and responds to visual input and text prompts with spoken suggestions. These responses, ranging from poetic provocations to site-specific directives, create a feedback loop in which human and machine co-compose movement, challenging distinctions between spontaneity and computation, authorship and obedience.
The project explores AI as a relational agent rather than a tool: an intelligent presence that shapes choreography through co-creation. MOVE-ME was featured in three practice-based research projects between 2024 and 2025, each testing different choreographic configurations and affective dynamics.

MOVE-ME: Dance Choreography with AI

This study presents a cross-cultural implementation of an artistic Brain–Computer Interface (BCI) and generative artificial intelligence (GenAI) system designed for live performance within the Balinese gamelan tradition. The BCI-GenAI system enables real-time translation of neural synchrony between dyads of performers (musician-musician, dancer-dancer, musician-dancer) into the control of culturally-relevant generative visual projections that interact with performers and audience alike.
Nine Balinese artists (six musicians and three dancers) participated in a two-part performance, Janaki Dewi: Sita’s Reverie and Wiwada Manik: Tales of the Brothers. Dyads of artists wore Mobile Brain-Body Imaging (MoBI) technology to capture in real time their brain (electroencephalography, EEG) and ocular (electrooculography, EOG) activities, head motion, and video during rehearsals and a public performance over a period of three weeks. All signals were synchronized by hardware. The BCI preprocessed the MoBI signals and computed the inter-brain synchrony between dyads. Synchrony indices derived from EEG bispectra modulated diffusion parameters in a StreamDiffusion-based GenAI, whereas text prompts consisting of culturally-relevant narrative and emotional descriptors reflected the story’s mythological and affective dimensions. The BCI-GenAI system thus linked real-time inter-brain synchronization to dynamic imagery projected live on stage, functioning as a creative partner in the loop that responds to both the emotional and rhythmic structure of the performance. The multi-institutional, cross-cultural project contributes a methodological framework for BCI-GenAI in artistic settings, emphasizing cross-cultural collaboration, the symbiosis between cultural traditions and emergent technologies, and ethical data governance. It advances a model of responsible human–AI co-creation, where technology supports rather than displaces tradition, thereby preserving the continuity of cultural identity through innovation, team science, and transdisciplinarity.

An Artistic BCI-GenAI System Enabling Real-Time Co-Creation in Balinese Performance

This work-in-progress introduces SightDog, a hybrid framework that equips a robotic quadruped with multimodal artificial intelligence to support blind and visually impaired users. Designed as a robotic guide dog, SightDog integrates Vision–Language Models (VLMs) with schema-driven function calling to interpret natural language instructions and visual perception for navigation and assistance. A central focus is human–robot interaction: SightDog enables natural language dialog, provides contextual feedback, and adapts its responses to user intent in real time. In doing so, the system demonstrates a form of creative communication, where improvisation and adaptive dialog play a crucial role. In a simulation study, we show that SightDog can deliver environmental information and perform reactive navigation toward user-specified goals while managing obstacles and crosswalks. As part of this creative HRI, the system can also engage in informal dialog (e.g., telling casual jokes, which we illustrate in the experiment provided). While primarily an assistive prototype, the system also raises questions for interactive and creative AI, particularly regarding how AI systems respond to improvised human input. By situating SightDog within both accessibility and interactive AI research, this work contributes to discussions on human–robot collaboration and the future of guidance technologies.

SightDog: Function Calling and Creative Dialogue for AI‑Enhanced Guide Dogs

Recent advances in image generative models have enabled rich collaborations between humans and AI systems. Among these, Energy-Based Models (EBMs) learn an energy landscape that guides noised samples toward high-probability regions. Unlike diffusion models that use fixed time schedules, EBMs possess equilibrium properties that enable user feedback during generation without destabilizing the distribution. However, current EBM research primarily optimizes for high-fidelity images, offering little control over the trade-off between semantic realism and fine-grained diversity, an essential feature for interactive creative applications. Artists and creatives thus lack a modality to balance semantic coherence (e.g., “a red apple”) with creative variation (e.g., apples of different shapes or colors). To address this, we introduce a geometry-aware annealing framework for EBMs. We propose a direction-aware annealing variable that uses local geometric information to adjust the effective noise level during sampling. This annealing feedback mechanism allows users to generate semantically realistic images before progressively exploring higher-diversity, more creative variants. Together, these techniques enable a controllable balance between fidelity and creativity, advancing the use of EBMs for interactive creative AI.

Geometry-Aware Energy-Based Image Modelling

Algorithmic fairness has grown rapidly, yet key concepts remain unsettled in criminal justice. We review group, individual, and process fairness and map the conditions under which they conflict. We then develop a simple modification to standard group fairness. Rather than exact parity across protected groups, we minimize a weighted error loss while keeping differences in false negative rates within a small tolerance. This improves feasibility, raises accuracy, and highlights the ethical choice of error costs. We situate this proposal within three classes of critique: biased and incomplete data, latent affirmative action, and the explosion of subgroup constraints. Finally, we propose a practical framework for deployment in public systems, built on three pillars: need-based decisions, transparency, and narrowly tailored solutions. Together, these elements link technical design to legitimacy and provide actionable guidance for agencies that use risk assessment and related tools.

Alternative Fairness and Accuracy Optimization in Criminal Justice
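
The relaxed constraint described in the abstract above (minimize a weighted error loss while keeping the difference in false negative rates within a small tolerance) can be illustrated with a minimal sketch. It assumes binary labels, exactly two groups, one score threshold per group, and a brute-force grid search; the function names, cost weights, and tolerance value are hypothetical and are not taken from the paper.

```python
from itertools import product


def error_counts(scores, labels, threshold):
    """Return (false negatives, false positives, positives) at a threshold."""
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    return fn, fp, sum(labels)


def fit_thresholds(groups, grid, fn_cost=1.0, fp_cost=1.0, tol=0.05):
    """Pick one threshold per group that minimizes the weighted error,
    subject to |FNR_a - FNR_b| <= tol.

    groups: dict mapping group name -> (scores, labels); two groups assumed.
    Returns (thresholds dict, loss), or None if no threshold pair is feasible.
    """
    (name_a, (sa, ya)), (name_b, (sb, yb)) = groups.items()
    best = None
    for ta, tb in product(grid, repeat=2):
        fn_a, fp_a, pos_a = error_counts(sa, ya, ta)
        fn_b, fp_b, pos_b = error_counts(sb, yb, tb)
        fnr_a = fn_a / pos_a if pos_a else 0.0
        fnr_b = fn_b / pos_b if pos_b else 0.0
        if abs(fnr_a - fnr_b) > tol:
            continue  # violates the fairness tolerance
        loss = fn_cost * (fn_a + fn_b) + fp_cost * (fp_a + fp_b)
        if best is None or loss < best[1]:
            best = ({name_a: ta, name_b: tb}, loss)
    return best
```

Setting `tol=0` recovers exact parity of false negative rates, while the ratio `fn_cost / fp_cost` makes the choice of error costs explicit, which is the ethical decision the abstract highlights.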

Catastrophic forgetting remains a critical challenge in continual learning for large language models (LLMs), where models struggle to retain performance on historical tasks when fine-tuning on new sequential data without access to past datasets. In this paper, we first reveal that the drift of functional directions during the fine-tuning process is a key reason why existing regularization-based methods fail in long-term LLM continual learning. To address this, we propose Dynamic Orthogonal Continual (DOC) fine-tuning, a novel approach that tracks the drift of these functional directions and dynamically updates them during the fine-tuning process. Furthermore, by adjusting the gradients of new task parameters to be orthogonal to the tracked historical function directions, our method mitigates interference between new and old tasks. Extensive experiments on various LLM continual learning benchmarks demonstrate that this approach outperforms prior methods, effectively reducing catastrophic forgetting and providing a robust tool for continuous LLM fine-tuning.

Dynamic Orthogonal Continual Fine-tuning for Mitigating Catastrophic Forgetting of LLMs
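
The gradient adjustment mentioned in the abstract above, making new-task gradients orthogonal to tracked historical directions, can be sketched as a plain projection. This covers only the projection step; how DOC tracks and updates the directions is not described in the abstract and is not modeled here. Vectors are plain Python lists, and all function names are hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def orthonormalize(directions, eps=1e-12):
    """Gram-Schmidt: turn the tracked directions into an orthonormal basis."""
    basis = []
    for d in directions:
        r = list(d)
        for b in basis:
            c = dot(r, b)
            r = [ri - c * bi for ri, bi in zip(r, b)]
        norm = dot(r, r) ** 0.5
        if norm > eps:  # skip directions already spanned by the basis
            basis.append([ri / norm for ri in r])
    return basis


def project_out(grad, directions):
    """Remove from grad its components inside the span of the directions."""
    g = list(grad)
    for b in orthonormalize(directions):
        c = dot(g, b)
        g = [gi - c * bi for gi, bi in zip(g, b)]
    return g
```

For example, `project_out([1.0, 2.0, 3.0], [[1.0, 0.0, 0.0], [1.0, 1.0, 0.0]])` leaves only the third coordinate, since the two directions span the first two axes.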

Intrinsic self-correction refers to the phenomenon where a language model refines its own outputs purely through prompting, without external feedback or parameter updates. While this approach improves performance across diverse tasks, its internal mechanism remains poorly understood. We analyze intrinsic self-correction through the representation shift induced by prompting. We formalize the notion of a prompt-induced shift, which is the change in hidden representations caused by a self-correction prompt. Across 5 open-source LLMs, prompt-induced shifts in text detoxification and text toxification align with latent directions constructed from contrastive pairs. In detoxification, the shifts align with the non-toxic direction; in toxification, they align with the toxic direction. These results suggest that intrinsic self-correction functions as representation steering along interpretable latent directions. Our analysis highlights that an understanding of model internals can be a direct route to analyzing the mechanisms of prompt-driven LLM behaviors.

Intrinsic Self-Correction in LLMs: Towards Explainable Prompting via Mechanistic Interpretability

We present AI Screener, an end-to-end automated document review system that integrates a 12-billion-parameter pretrained large language model with a Tree-of-Thought reasoning framework to emulate and scale expert-level decision-making. Designed for high-stakes, domain-specific analysis, AI Screener empowers subject matter experts to encode their domain knowledge and reasoning processes in a no-code, efficient manner—enabling rapid customization without technical barriers. The system has been deployed across three different and unrelated mission-critical business functions: (1) accelerating scientific literature reviews to support the development of occupational exposure limits for worker health protection, (2) streamlining patent screening to optimize intellectual property portfolio management, and (3) automating procurement contract analysis to identify value leakage and drive better commercial terms. Across these diverse deployments, subject matter experts encoded their knowledge with AI Screener to transform traditional workflows—significantly reducing manual review time while maintaining expert-grade accuracy and consistency. This work highlights how Tree-of-Thought-augmented LLMs can be pragmatically applied to reshape enterprise document intelligence at scale.

Tree-of-Thought-Augmented LLMs for Automated Document Review Across Industrial Domains

Bias and fairness remain persistent challenges in the responsible deployment of machine learning systems. While most existing metrics are designed for binary classification, fairness evaluation for regression models, widely used in domains such as risk scoring, pricing, and demand forecasting, remains comparatively underexplored. We introduce a quantile-conditioned fairness framework for regression that extends conditional fairness assessment from binary to continuous outcomes. The proposed method partitions target values into quantiles, computes group-to-complement prediction ratios within each segment, and then aggregates these ratios to produce interpretable fairness scores. Through a series of controlled ablation studies on synthetic data, we analyze the effects of bias strength, protected group imbalance, and model performance. We also benchmark our solution against the open-source Dalex fairness toolkit. We further show that the same conditioning principle naturally extends to multiclass classification, treating each class as a conditioning bucket. Real-world case studies on regression and classification datasets demonstrate the practical utility of our approach. Our implementation is lightweight and easily integrable into existing model development workflows, providing a deployable framework for fairness evaluation for all domains.

Quantile-Conditioned Fairness: Extending Binary Fairness Evaluation to Continuous Outcomes
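
The three steps named in the abstract above (partition targets into quantiles, compute group-to-complement prediction ratios per segment, aggregate) can be sketched as follows. The sketch assumes equal-count bins, mean predictions as the per-bin statistic, and a simple average as the aggregate; these choices and the function name are assumptions for illustration, not the authors' exact definitions.

```python
def quantile_fairness(y_true, y_pred, in_group, n_bins=4):
    """Return (per-bin ratios, mean ratio) of group vs. complement predictions.

    y_true:   target values used to form the quantile bins
    y_pred:   model predictions
    in_group: booleans marking membership in the protected group
    """
    order = sorted(range(len(y_true)), key=lambda i: y_true[i])
    n = len(order)
    ratios = []
    for b in range(n_bins):
        # Equal-count slice of the indices sorted by target value.
        idx = order[b * n // n_bins:(b + 1) * n // n_bins]
        grp = [y_pred[i] for i in idx if in_group[i]]
        rest = [y_pred[i] for i in idx if not in_group[i]]
        if not grp or not rest:
            continue  # ratio undefined when one side is empty
        mean_rest = sum(rest) / len(rest)
        if mean_rest == 0:
            continue  # avoid division by zero
        ratios.append((sum(grp) / len(grp)) / mean_rest)
    score = sum(ratios) / len(ratios) if ratios else None
    return ratios, score
```

A ratio near 1 in a bin means that, among cases with similar true outcomes, the protected group and its complement receive similar predictions; conditioning on the target is what separates this from a raw comparison of group means.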

AI inference scaling is often tuned through 1D heuristics (a fixed number of reasoning passes) or 2D bivariate trade-offs (e.g., performance vs. compute), which fail to consider cost and latency constraints. We introduce a 3D optimization framework that jointly calibrates accuracy, cost, and latency within a unified decision space, enabling constraints-aware inference scaling. Using Monte Carlo simulations across three representative scenarios and nine simulated large language models, we evaluate four optimization methods to address the 3D multi-objective optimization (MOO) problem. Framing inference scaling as a MOO problem shapes a feasible space that 1D and 2D optimizations fail to capture, enabling environment-adaptive selection of the inference scaling $k$. Results show that knee-point optimization achieves the best balance, while accuracy-maximization remains favorable when precision is prioritized. The framework establishes a theoretical foundation for deployment-aware inference scaling across diverse operational contexts.
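
As a rough illustration of selecting $k$ in a three-objective space, the sketch below filters candidates by cost and latency budgets, keeps the Pareto-optimal ones, and picks the point closest to the ideal corner after min-max normalization. The closest-to-ideal rule is a simple stand-in for a knee-point criterion; the abstract does not specify the paper's definition, and all names and numbers here are hypothetical.

```python
def dominates(p, q):
    """True if p is at least as good as q on all objectives and better on one.
    Points are (accuracy, cost, latency): higher accuracy, lower cost/latency.
    """
    no_worse = p[0] >= q[0] and p[1] <= q[1] and p[2] <= q[2]
    better = p[0] > q[0] or p[1] < q[1] or p[2] < q[2]
    return no_worse and better


def pareto_front(candidates):
    """candidates: dict k -> (accuracy, cost, latency). Keep non-dominated k."""
    return {k: p for k, p in candidates.items()
            if not any(dominates(q, p) for q in candidates.values())}


def pick_k(candidates, max_cost, max_latency):
    """Pick k among feasible Pareto points, closest to the ideal corner."""
    feasible = {k: p for k, p in candidates.items()
                if p[1] <= max_cost and p[2] <= max_latency}
    front = pareto_front(feasible)
    if not front:
        return None
    lo = [min(p[i] for p in front.values()) for i in range(3)]
    hi = [max(p[i] for p in front.values()) for i in range(3)]

    def norm(v, i):
        return 0.0 if hi[i] == lo[i] else (v - lo[i]) / (hi[i] - lo[i])

    def distance(p):
        # Ideal corner: normalized accuracy 1, normalized cost and latency 0.
        return ((1 - norm(p[0], 0)) ** 2 + norm(p[1], 1) ** 2
                + norm(p[2], 2) ** 2) ** 0.5

    return min(front, key=lambda k: distance(front[k]))
```

Changing `max_cost` or `max_latency` changes the feasible set and therefore the selected `k`, which is the environment-adaptive behavior that a fixed 1D heuristic cannot express.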
