Generalized Category Discovery (GCD) aims to classify unlabeled data, which may belong to known or novel categories, by leveraging knowledge from labeled categories. While existing methods have achieved remarkable progress, they often treat images as flat feature sets and neglect their intrinsic hierarchy, in which key objects carry the meaning and backgrounds serve as context. For instance, in images of a dog either standing on grass or lying on a bed, the dog remains the central semantic element, whereas the background varies. Motivated by this, we propose LEArning Intrinsic Hierarchy (LEAH), a lightweight plug-and-play module designed to model the hierarchical structure within images. LEAH consists of two components: a pruner that filters out task-irrelevant tokens to extract key objects, and a constructor that embeds key objects and full images into hyperbolic space using adaptive entailment cones to capture compositional semantics. LEAH can be integrated into existing GCD frameworks with minimal modification. When applied to SimGCD, it improves accuracy by up to 13.2% on fine-grained benchmarks, demonstrating the effectiveness of hierarchical modeling in discovering subtle inter-class differences.
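The two components described above can be illustrated with a minimal sketch. It is not the authors' implementation, and every function name in it is hypothetical. It assumes that the pruner keeps the tokens with the highest relevance scores (the abstract does not say how tokens are scored), that embeddings live in the unit Poincaré ball, and that the key object acts as the parent whose cone should contain the full image. In place of LEAH's adaptive cones, it uses the standard fixed-aperture entailment cone of Ganea et al. (2018).

```python
import math

def prune_tokens(tokens, scores, keep_ratio=0.5):
    """Keep the top-scoring tokens, preserving their original order.

    `scores` is a hypothetical per-token relevance score (assumption).
    """
    k = max(1, int(len(tokens) * keep_ratio))
    top = sorted(range(len(tokens)), key=lambda i: scores[i], reverse=True)[:k]
    return [tokens[i] for i in sorted(top)]

def mean_pool(tokens):
    """Average a list of equal-length token vectors into one vector."""
    n = len(tokens)
    return [sum(col) / n for col in zip(*tokens)]

def expmap0(v):
    """Exponential map at the origin of the unit Poincare ball (curvature -1)."""
    norm = math.sqrt(sum(a * a for a in v))
    if norm == 0.0:
        return list(v)
    scale = math.tanh(norm) / norm
    return [scale * a for a in v]

def cone_energy(x, y, K=0.1, eps=1e-12):
    """Penalty that is zero when y lies inside the entailment cone at x.

    Fixed aperture constant K (assumption); the adaptive variant is not modeled.
    """
    dot = sum(a * b for a, b in zip(x, y))
    nx2 = sum(a * a for a in x)
    ny2 = sum(b * b for b in y)
    nx = math.sqrt(nx2)
    dxy = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))
    # Half-aperture of the cone at x (argument clamped so asin is defined).
    psi = math.asin(min(1.0, K * (1.0 - nx2) / (nx + eps)))
    # Angle at x between the ray from the origin through x and the geodesic to y.
    num = dot * (1.0 + nx2) - nx2 * (1.0 + ny2)
    den = nx * dxy * math.sqrt(max(0.0, 1.0 + nx2 * ny2 - 2.0 * dot)) + eps
    xi = math.acos(max(-1.0, min(1.0, num / den)))
    return max(0.0, xi - psi)

# Toy usage: four 2-D "tokens" with made-up relevance scores.
tokens = [[0.2, 0.1], [0.9, 0.4], [0.1, 0.0], [0.8, 0.5]]
scores = [0.1, 0.9, 0.05, 0.8]
obj = expmap0(mean_pool(prune_tokens(tokens, scores)))  # key-object embedding
img = expmap0(mean_pool(tokens))                        # full-image embedding
loss = cone_energy(obj, img)                            # hierarchy penalty
```

In a real GCD pipeline, a penalty of this kind would presumably be added to the host framework's existing objective; the weighting and the choice of parent and child are design decisions the abstract leaves open.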
