AAAI 2026

January 24, 2026

Singapore, Singapore

Single-channel audio separation, the task of recovering individual audio sources from a monophonic mixture, remains a challenging problem in audio processing. Supervised approaches, typically trained on synthetically generated mixtures, are the prevailing solution. However, these methods depend on high-quality paired training data, which is often scarce or difficult to acquire in real-world scenarios. This scarcity can hinder performance on unseen mixing conditions and compromise generalization. In this work, we instead approach the problem from an unsupervised perspective, framing it as a probabilistic inverse problem. Our method requires only diffusion priors trained on individual sources. Separation is then achieved by iteratively guiding an initial state towards the solution through reconstruction guidance. Crucially, we introduce an inverse solver designed for separation that mitigates the gradient conflicts arising from interference between the diffusion prior and the reconstruction guidance during the denoising process, which yields high-quality and balanced separation across individual sources. In addition, we find that initializing the denoising process with an augmented mixture instead of pure Gaussian noise is effective, and that this informative prior significantly improves final performance. Furthermore, to enhance audio prior modeling, we design a novel Time-Frequency (TF) attention-based network architecture with strong audio modeling capabilities. Together, these improvements significantly enhance separation performance, as demonstrated by our experimental results on speech-sound event, sound event, and speech separation tasks. Audio demonstrations are available in the Supplementary Material.
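
The general idea of separating a mixture with per-source priors and reconstruction guidance can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not the paper's solver: the trained diffusion priors are replaced by analytic Gaussian priors (so the score is known in closed form), the guidance is a plain gradient step on the denoised estimates towards mixture consistency, and the starting point is simply the mixture split evenly across sources plus noise, which is only a loose stand-in for the augmented-mixture initialization. The names `gaussian_score` and `separate` and all parameters are hypothetical.

```python
import numpy as np


def gaussian_score(x, mu, sigma, s):
    """Score of N(mu, sigma^2 + s^2): a Gaussian prior blurred by noise level s.

    Stand-in (assumption) for a learned diffusion prior's score network.
    """
    return -(x - mu) / (sigma**2 + s**2)


def separate(y, mus, sigmas, n_steps=200, s_max=5.0, s_min=0.01, step=1.0, seed=0):
    """Toy guided denoising: estimate sources x_k such that sum_k x_k = y."""
    rng = np.random.default_rng(seed)
    n_src = len(mus)
    # Start from the mixture split evenly across sources, plus noise at level s_max.
    x = np.stack([y / n_src + s_max * rng.standard_normal(y.shape)
                  for _ in range(n_src)])
    levels = np.geomspace(s_max, s_min, n_steps)  # decreasing noise schedule
    for i, s in enumerate(levels):
        s_next = levels[i + 1] if i + 1 < n_steps else 0.0
        # Denoised estimate of each source via Tweedie's formula.
        x0_hat = np.stack([
            x[k] + s**2 * gaussian_score(x[k], mus[k], sigmas[k], s)
            for k in range(n_src)
        ])
        # Guidance: gradient step on 0.5 * ||y - sum_k x0_hat_k||^2,
        # whose gradient w.r.t. each x0_hat_k is -(residual).
        residual = y - x0_hat.sum(axis=0)
        x0_hat = x0_hat + step * residual / n_src
        # Deterministic move to the next noise level, keeping a scaled-down
        # share of the current noise component.
        x = x0_hat + (s_next / s) * (x - x0_hat)
    return x


# Usage: two synthetic "sources" with different means, mixed by addition.
rng = np.random.default_rng(1)
src_a = 2.0 + 0.5 * rng.standard_normal(1000)
src_b = -1.0 + 0.5 * rng.standard_normal(1000)
est = separate(src_a + src_b, mus=[2.0, -1.0], sigmas=[0.5, 0.5])
print("max mixture mismatch:", np.abs(est.sum(axis=0) - (src_a + src_b)).max())
```

In this sketch the prior term and the guidance term are applied one after the other with a fixed step size. The abstract describes a solver that specifically handles conflicts between those two gradients, which this toy version does not attempt to reproduce.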
