Motion estimation in degraded scenes has long been a significant challenge, primarily because of substantial scene variation and insufficient training data. Existing approaches typically address this limitation by adding training strategies or modifying network architectures within conventional frameworks. However, these solutions require cumbersome training procedures or additional input modalities, and they generalize poorly. To address this problem, we propose a unified optical flow estimation framework specifically designed for degraded scenes. We employ large-scale pre-trained optical flow foundation models as both the teacher and the student networks, with the objective of using the pre-trained models to compensate for the feature incompleteness caused by image degradation. We then fine-tune with supervised signals and introduce an intra-inter frame distillation method that enables the student network to adapt to diverse cross-domain scenarios. The methodology also provides deeper insight into how style-invariant features are learned by the learnable fine-tuning layers. Extensive experiments demonstrate that our approach achieves superior generalization and state-of-the-art results in degraded scenes (including low-light, rain, fog, and other conditions) while requiring minimal training resources.
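
The combination of supervised fine-tuning and teacher-student distillation described above can be illustrated with a generic loss sketch. This is not the paper's intra-inter frame distillation, whose details the abstract does not give. It assumes a plain endpoint-error (EPE) term against ground-truth flow plus a weighted EPE term pulling the student's flow toward the teacher's. The names `mean_epe`, `fine_tune_loss`, and `distill_weight` are hypothetical, and flow fields are simplified to lists of `(u, v)` vectors.

```python
import math


def mean_epe(flow_a, flow_b):
    """Mean endpoint error between two flow fields.

    Each flow field is a list of (u, v) displacement vectors, one per pixel.
    """
    if len(flow_a) != len(flow_b) or not flow_a:
        raise ValueError("flow fields must be non-empty and of equal length")
    total = sum(
        math.hypot(ua - ub, va - vb)
        for (ua, va), (ub, vb) in zip(flow_a, flow_b)
    )
    return total / len(flow_a)


def fine_tune_loss(student_flow, teacher_flow, gt_flow, distill_weight=0.5):
    """Supervised EPE plus a weighted teacher-student distillation EPE.

    Assumption (not from the source): both terms are plain EPE and are
    combined linearly with a hypothetical weight `distill_weight`.
    """
    supervised = mean_epe(student_flow, gt_flow)
    distillation = mean_epe(student_flow, teacher_flow)
    return supervised + distill_weight * distillation


# Toy two-pixel example: the student is wrong on the first pixel only.
gt = [(1.0, 0.0), (0.0, 2.0)]
teacher = [(1.0, 0.0), (0.0, 2.0)]
student = [(4.0, 4.0), (0.0, 2.0)]
print(fine_tune_loss(student, teacher, gt))  # 3.75
```

In the toy example the student's error on the first pixel is 5 (a 3-4-5 triangle) and 0 on the second, so both EPE terms are 2.5 and the total is 2.5 + 0.5 × 2.5 = 3.75. In a real setup the flow fields would be tensors produced by the networks, and the teacher would typically be kept frozen.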
