We present APEX-Q, a flexible product-quantization framework for compressing large language models. Unlike prior multi-codebook quantization methods that rely on fixed partitions, APEX-Q supports quantization over subvectors of arbitrary dimension, which better captures redundancy in the weight tensors. It matches 4-bit and 8-bit baselines in accuracy, operates entirely post-training with no retraining required, and exposes the key trade-offs among subvector dimension, codebook size, and hardware efficiency. APEX-Q thus provides a unified, hardware-friendly approach to scalable LLM deployment.
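The abstract does not spell out APEX-Q's procedure, but the product-quantization idea it generalizes can be sketched in a few lines: split each weight row into d-dimensional subvectors, cluster each subvector group into a k-entry codebook with k-means, and store only the per-subvector codes. The sketch below is illustrative only; the function names (pq_quantize, pq_dequantize) and default parameters are assumptions, not the authors' API.

```python
import numpy as np

def pq_quantize(W, d=4, k=256, iters=10):
    """Product-quantize a 2D weight matrix (hypothetical helper, not APEX-Q's API).

    Splits each row of W into d-dimensional subvectors, learns a k-entry
    codebook per subvector group via plain k-means, and returns the
    codebooks plus uint8 codes (valid while k <= 256).
    """
    out_f, in_f = W.shape
    assert in_f % d == 0, "row length must be divisible by the subvector dim"
    groups = in_f // d
    sub = W.reshape(out_f, groups, d)              # (rows, groups, d) subvectors
    codebooks = np.empty((groups, k, d), dtype=W.dtype)
    codes = np.empty((out_f, groups), dtype=np.uint8)
    for g in range(groups):
        X = sub[:, g, :]                           # all subvectors in group g
        C = X[np.random.choice(out_f, k, replace=False)]  # init: random samples
        for _ in range(iters):                     # k-means refinement
            assign = np.argmin(((X[:, None] - C[None]) ** 2).sum(-1), axis=1)
            for j in range(k):
                pts = X[assign == j]
                if len(pts):                       # keep old centroid if empty
                    C[j] = pts.mean(0)
        codebooks[g], codes[:, g] = C, assign
    return codebooks, codes

def pq_dequantize(codebooks, codes):
    """Reconstruct the (approximate) weight matrix from codebooks and codes."""
    out_f, groups = codes.shape
    d = codebooks.shape[-1]
    W_hat = np.empty((out_f, groups * d), dtype=codebooks.dtype)
    for g in range(groups):
        W_hat[:, g * d:(g + 1) * d] = codebooks[g][codes[:, g]]
    return W_hat

# Example: quantize a toy 512x64 layer and measure reconstruction error.
W = np.random.randn(512, 64).astype(np.float32)
cb, codes = pq_quantize(W, d=4, k=256)
print("MSE:", np.mean((W - pq_dequantize(cb, codes)) ** 2))
```

This also makes the subvector-dimension/codebook-size trade-off in the abstract concrete: with k = 256 each code takes log2(256) = 8 bits and covers d = 4 weights, i.e. 2 bits per weight plus codebook overhead, which is how product quantization can undercut uniform 4-bit schemes at comparable accuracy, at the cost of codebook lookups at inference time.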
