Identification of fine-grained embryo developmental stages during In Vitro Fertilization (IVF) is crucial for assessing embryo viability. Although recent deep learning methods have achieved promising accuracy, existing approaches based on discriminative models fail to utilize the distributional prior of embryonic development. Moreover, they suffer from incomplete embryonic representation because they rely on single-focal information, which makes them susceptible to feature ambiguity caused by cell occlusions. To address these limitations, we propose EmbryoDiff, a two-stage diffusion-based framework that uses sequence features as condition signals for accurate stage recognition. Specifically, in the first stage, a frame-level encoder is trained and then frozen to extract robust multi-focal visual features for training the diffusion model. In the second stage, we introduce a Multi-Focal Feature Fusion strategy that integrates information across focal planes to build a morphological representation with 3D contextual awareness, mitigating ambiguity caused by cell occlusions. Based on the fused features, we further extract complementary semantic and boundary condition features and design a Hybrid Semantic-Boundary Condition Block to inject them into the denoising process for accurate stage classification. Extensive experiments on two benchmark datasets demonstrate that our method achieves state-of-the-art performance. Notably, our model attains its best average test performance with only one denoising step, achieving 82.8% and 81.3% accuracy on the two datasets, respectively.
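As a rough illustration of what integrating information across focal planes could look like, the sketch below fuses per-plane feature vectors for each frame with a softmax-weighted sum. This is only an assumption made for illustration: the abstract does not specify the fusion mechanism, and the names `fuse_focal_planes`, `fuse_sequence`, and the `query` vector are hypothetical, not part of EmbryoDiff.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_focal_planes(plane_features, query):
    """Fuse F focal-plane feature vectors (each of length D) for one frame.

    Hypothetical scheme: score each plane by a scaled dot product with a
    query vector, then take the softmax-weighted sum of the plane features.
    """
    dim = len(query)
    scale = math.sqrt(dim)
    scores = [
        sum(f * q for f, q in zip(feat, query)) / scale
        for feat in plane_features
    ]
    weights = softmax(scores)
    fused = [
        sum(w * feat[d] for w, feat in zip(weights, plane_features))
        for d in range(dim)
    ]
    return fused, weights


def fuse_sequence(sequence, query):
    """Apply the per-frame fusion to every frame of a time-lapse sequence."""
    return [fuse_focal_planes(frame, query)[0] for frame in sequence]
```

With an all-zero `query`, every plane receives the same weight, so the fused vector is the plain average of the focal-plane features. In a trained model the query (or whatever plays its role) would be a learned parameter.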
