Weakly supervised 3D instance segmentation is essential for 3D scene understanding, especially given the growing scale of data and the high annotation cost of fully supervised approaches. Existing methods rely primarily on two forms of weak supervision, one-thing-one-click annotation and bounding box annotation, both of which reduce the annotation burden. However, these approaches still face challenges, including time-consuming annotation procedures, high complexity, and reliance on skilled annotators. To overcome these limitations, we propose DBGroup, a two-stage weakly supervised 3D instance segmentation framework that leverages scene-level annotations as a more efficient and scalable alternative. In the first stage, we introduce a Dual-Branch Point Grouping module that generates pseudo labels guided by semantic and mask cues extracted from multi-view images. To further improve label quality, we design two refinement strategies: Granularity-Aware Instance Merging and Semantic Selection and Propagation. In the second stage, we use the refined pseudo labels to perform multi-round self-training on an end-to-end instance segmentation network. Additionally, we propose an Instance Mask Filter strategy to address inconsistencies within the pseudo labels. Extensive experiments on the ScanNetV2 and S3DIS datasets demonstrate that DBGroup achieves superior performance compared to state-of-the-art 3D instance segmentation methods, as well as existing 3D semantic segmentation methods that use scene-level supervision.
