Neural-enhanced video streaming (NeVS) is an emerging technique that integrates neural models into video codecs for higher streaming efficiency. State-of-the-art methods, e.g., DeNC and Gemino, typically compress videos in RGB space and restore video quality via a neural enhancement model hosted on an external media server. However, these methods are not always viable in resource-constrained edge environments because of their heavy reliance on the media server's computation, which undermines end-to-end performance and restricts NeVS's usage boundary. This limitation raises an interesting question: can NeVS be made lightweight enough that all neural codec operations run directly on clients' edge devices? In this paper, we answer yes and develop a new plug-and-play module called DeNC++, which significantly improves the compression-restoration-overhead trade-off over existing methods. Our core design philosophy is to wrap all codec operations within a latent semantic space, in which the original high-dimensional visual signals are efficiently embedded into low-dimensional semantic representations. Building on this transformation, DeNC++'s neural encoder introduces triple semantic-bitwidth-resolution compression to effectively lower the streaming traffic. Meanwhile, we make DeNC++'s neural decoder aware of the perceptual loss caused by its encoder and design tiny generative models to guarantee high restoration quality. We also strictly bound the runtime computational overhead and accelerate the neural enhancement process, making DeNC++ compatible with commodity edge devices. Real-world evaluations reveal that DeNC++ consistently provides higher restoration quality while achieving a 24-55 times higher compression ratio and a 5-7 times end-to-end speedup over the latest NeVS solutions.
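To make the triple semantic-bitwidth-resolution compression idea concrete, the following is a minimal toy sketch of the three stages on a single frame. All shapes, the random linear "encoder", the 2x downsampling factor, and the 4-bit quantization are illustrative assumptions for exposition only, not DeNC++'s actual design (which uses learned models).

```python
import numpy as np

rng = np.random.default_rng(0)

# A toy RGB frame: the high-dimensional visual signal (float32).
frame = rng.random((256, 256, 3), dtype=np.float32)

# 1) Resolution compression: downsample 2x along each spatial axis.
low_res = frame[::2, ::2, :]                                   # (128, 128, 3)

# 2) Semantic compression: embed each 8x8x3 patch into a 16-dim latent
#    vector via a (hypothetical) linear encoder, here a random matrix
#    standing in for a learned one.
patches = low_res.reshape(16, 8, 16, 8, 3).transpose(0, 2, 1, 3, 4)
patches = patches.reshape(16 * 16, 8 * 8 * 3)                  # (256, 192)
encoder = rng.random((8 * 8 * 3, 16), dtype=np.float32)
latent = patches @ encoder                                     # (256, 16)

# 3) Bitwidth compression: quantize float32 latents to 4-bit integers.
lo, hi = latent.min(), latent.max()
q = np.round((latent - lo) / (hi - lo) * 15).astype(np.uint8)  # in [0, 15]

raw_bits = frame.size * 32          # original float32 frame
compressed_bits = q.size * 4        # 4-bit quantized latents
print(f"compression ratio: {raw_bits / compressed_bits:.0f}x")
```

The decoder side would invert these steps with a generative restoration model; the point of the sketch is only that the three reductions compound multiplicatively, which is why latent-space streaming can reach compression ratios far beyond what any single stage provides.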
