Cascade-based multi-scale multi-view stereo (MVS) architectures are currently the mainstream approach to MVS reconstruction because they balance computational efficiency against reconstruction accuracy. However, existing cascade MVS methods make limited use of cross-scale information: depth estimation runs independently at each scale and does not fully exploit the relevance between adjacent scales. To address this limitation, we propose the Enhanced Cascade Multi-View Stereo framework (EC-MVSNet), which introduces a cross-scale relevance integration strategy. The framework has three key components: a Cross-Scale Feature-based Joint Construction (CFC) module that combines features from adjacent scales to build more reliable cost volumes; a Cross-Scale Probability-guided Enhancement (CPE) module that propagates depth probability distributions across scales to guide cost volume enhancement; and a Monocular Feature-based Refinement (MFR) module that uses monocular priors to further improve depth prediction accuracy. Extensive experiments show that EC-MVSNet achieves state-of-the-art performance on multiple benchmarks, validating the effectiveness of cross-scale integration in improving MVS reconstruction quality.
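For readers unfamiliar with the cascade setting the abstract refers to, the following is a minimal sketch of one generic coarse-to-fine step for a single pixel, not the EC-MVSNet method itself. It assumes the common cascade recipe: matching scores over depth hypotheses are turned into a probability distribution, the expected depth is regressed from it, and the next scale searches a narrower set of hypotheses centred on that estimate. All function names, the `shrink` factor, and the input values are hypothetical and chosen only for illustration.

```python
import math


def softmax(scores):
    """Turn per-hypothesis matching scores into a probability distribution."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def expected_depth(probs, depths):
    """Regress depth as the probability-weighted mean of the hypotheses."""
    return sum(p * d for p, d in zip(probs, depths))


def next_scale_hypotheses(center, interval, num):
    """Build `num` evenly spaced depth hypotheses centred on `center`."""
    half = (num - 1) / 2.0
    return [center + (i - half) * interval for i in range(num)]


def coarse_to_fine_step(scores, depths, fine_num, shrink=0.5):
    """One cascade step for one pixel (hypothetical helper).

    scores   : matching scores, higher means a better match
    depths   : evenly spaced coarse depth hypotheses
    fine_num : number of hypotheses at the next, finer scale
    shrink   : assumed ratio between fine and coarse hypothesis spacing
    """
    probs = softmax(scores)
    depth = expected_depth(probs, depths)
    coarse_interval = depths[1] - depths[0]
    fine_depths = next_scale_hypotheses(depth, coarse_interval * shrink, fine_num)
    return probs, depth, fine_depths


# Uniform scores give a uniform distribution over the four coarse hypotheses.
probs, depth, fine_depths = coarse_to_fine_step(
    scores=[0.0, 0.0, 0.0, 0.0],
    depths=[1.0, 2.0, 3.0, 4.0],
    fine_num=3,
)
```

In this sketch only the scalar depth estimate is handed to the next scale, while the probability distribution `probs` is discarded. That discarded cross-scale information is what the abstract describes the CFC and CPE modules as reusing.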
