Accurate prediction of breast cancer recurrence after treatment is essential for improving long-term outcomes. However, existing models are limited by three key challenges: (1) they typically rely on single-modal data, missing cross-modal interactions; (2) they analyze static snapshots, failing to capture disease progression over time; and (3) they often perform coarse feature fusion, lacking semantic disentanglement and interpretability. To address these issues, we propose LUMIN (Longitudinal Multi-modal Knowledge Decomposition Network), a novel framework that integrates longitudinal mammograms and electronic health records (EHRs) for recurrence prediction. LUMIN leverages a vision-language contrastive pretraining backbone to align multi-modal representations and introduces two knowledge extraction modules: (1) a Cross-Modal Disentangled Knowledge Extractor (CM-DKE) that separates shared, complementary, and modality-specific information across imaging and text; and (2) a Temporal Evolution Disentangled Knowledge Extractor (TE-DKE) that captures time-invariant, time-varying, and time-specific features to model disease dynamics. Experiments on a large-scale dataset of 3,924 patients and 19,684 exams show that LUMIN significantly outperforms state-of-the-art baselines, demonstrating its effectiveness in capturing both multi-modal semantics and temporal heterogeneity for recurrence prediction.
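The abstract does not spell out the pretraining objective, so the following is only a minimal sketch of what vision-language contrastive alignment commonly looks like: a CLIP-style symmetric InfoNCE loss over paired image and text embeddings. The function names, the temperature value of 0.07, and the toy two-dimensional embeddings are all assumptions made for illustration; none of them come from LUMIN itself.

```python
import math


def l2_normalize(vec):
    """Scale a non-zero vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def log_softmax(row):
    """Numerically stable log-softmax over a list of logits."""
    peak = max(row)
    log_total = peak + math.log(sum(math.exp(x - peak) for x in row))
    return [x - log_total for x in row]


def symmetric_contrastive_loss(image_emb, text_emb, temperature=0.07):
    """Hypothetical CLIP-style loss: pair i of image_emb matches pair i of text_emb.

    Assumes both inputs hold the same number of non-zero vectors of equal dimension.
    """
    images = [l2_normalize(v) for v in image_emb]
    texts = [l2_normalize(v) for v in text_emb]
    n = len(images)

    # logits[i][j]: temperature-scaled cosine similarity of image i and text j.
    logits = [
        [sum(a * b for a, b in zip(images[i], texts[j])) / temperature for j in range(n)]
        for i in range(n)
    ]

    # Image-to-text direction: row i should put its probability mass on column i.
    image_to_text = -sum(log_softmax(logits[i])[i] for i in range(n)) / n

    # Text-to-image direction: the same objective applied to the transposed matrix.
    columns = [[logits[i][j] for i in range(n)] for j in range(n)]
    text_to_image = -sum(log_softmax(columns[j])[j] for j in range(n)) / n

    return 0.5 * (image_to_text + text_to_image)


# Toy check: correctly paired embeddings should score far lower than mismatched ones.
aligned_loss = symmetric_contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
shuffled_loss = symmetric_contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

In this toy setup the loss is close to zero when each mammogram embedding lines up with its own EHR text embedding and becomes large when the pairs are swapped, which is the behavior an alignment objective of this kind is meant to reward. A real backbone would compute the same quantity on learned encoder outputs with a tensor library rather than on hand-written lists.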