Continual learning for action recognition is a critical capability for next-generation Extended Reality (XR) systems, yet it faces a severe real-world constraint: strict user privacy requirements that prohibit data rehearsal. While recent prompt-based continual learning methods show promise, we argue that their flat, single-granularity design is structurally mismatched to the complexity of human actions. This monolithic architecture fails to model the inherent hierarchical structure of individual actions and overlooks standard action primitives shared across tasks, resulting in suboptimal performance and hindered knowledge transfer. To overcome this limitation, we propose DPCA, a novel spatio-temporal continual learning framework with multi-granularity adaptive prompting. DPCA learns three synergistic components to resolve this mismatch. First, a task-specific prompter employs a multi-granularity query system to capture the unique, compositional semantics of each action. Second, a task-agnostic prompter learns a globally shared vocabulary of action primitives, providing a stable and generalizable knowledge base that mitigates catastrophic forgetting. Third, we introduce Dissimilarity Attention Rectification at each granularity level, which leverages a reverse attention mechanism to model class-agnostic background information, effectively alleviating overfitting. The synergy of these components enables robust model adaptation without access to past data. Rigorous experiments on the NTU RGB+D benchmark, under a strict rehearsal-free, few-shot protocol, confirm that DPCA establishes a new state of the art, advancing intelligent, privacy-preserving XR systems.
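To illustrate the reverse attention idea behind Dissimilarity Attention Rectification, here is a minimal, hypothetical sketch: the paper does not publish this code, so the function name, shapes, and renormalization scheme are assumptions. The core intuition is that standard attention weights highlight class-salient features, while the reversed weights (one minus the attention weights, renormalized) pool the complementary, background-like information instead.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def reverse_attention(query, keys, values):
    """Pool the features a standard attention head de-emphasizes.

    query:  (d,)   a single query vector
    keys:   (n, d) key vectors
    values: (n, d) value vectors

    Standard weights w = softmax(q . K^T / sqrt(d)) focus on salient
    tokens; the reversed weights (1 - w), renormalized to sum to 1,
    attend to the remaining (class-agnostic, background) tokens.
    """
    d = query.shape[-1]
    scores = keys @ query / np.sqrt(d)        # (n,) scaled dot products
    w = softmax(scores)                       # foreground attention weights
    rev = (1.0 - w) / (1.0 - w).sum()         # reversed, renormalized weights
    return rev @ values                       # background-pooled feature (d,)
```

In a full model, this background feature would be computed at each granularity level and used to rectify the prompted representation; the exact fusion rule is not specified here.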
