Clinical notes contain rich patient information, such as diagnoses or medications, making them valuable for patient representation learning. Recent advances in large language models have further improved the ability to extract meaningful representations from clinical text. However, clinical notes are often missing; for example, 35\% of patients in real-world datasets lack them. In such cases, representations can be learned from other modalities such as structured data, chest X-rays, or radiology reports. Yet the availability of these modalities is itself influenced by clinical decision-making and varies across patients, resulting in modality missing-not-at-random (MMNAR) patterns. We propose a causal representation learning framework that leverages both the observed data and the informative missingness in multimodal clinical records. It consists of: (1) an MMNAR-aware modality fusion module that uses large language models and other encoders to capture both patient health and the reasons for missing data in representation learning; (2) a representation balancing module, inspired by causal machine learning, that improves generalization across missingness patterns; and (3) a multitask prediction model, fine-tuned for each modality pattern, with a rectifier that corrects residual bias. On the MIMIC-IV dataset, our approach significantly outperforms recent baselines: AUC/APR increases by 16.83\%/27.21\% for hospital readmission and by 6.86\%/10.15\% for ICU admission. Subgroup analyses confirm the value of modeling MMNAR for robust and generalizable clinical NLP.
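To make the MMNAR-aware fusion idea concrete, the sketch below shows one minimal way such a module could combine per-modality embeddings with an explicit missingness indicator, so that downstream layers can condition on *which* modalities are absent rather than only on the observed content. All names, dimensions, and the zero-imputation scheme are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

# Illustrative modality set and embedding size (hypothetical, not from the paper).
MODALITIES = ["notes", "structured", "cxr", "radiology_report"]
EMB_DIM = 8

def mmnar_fuse(embeddings: dict) -> np.ndarray:
    """Fuse per-modality embeddings with a missingness-pattern indicator.

    `embeddings` maps a modality name to its encoder output; missing
    modalities are simply absent from the dict.
    """
    # Indicator of which modalities were observed (1 = present).
    missing_mask = np.array(
        [1.0 if m in embeddings else 0.0 for m in MODALITIES]
    )
    # Zero-impute absent modalities so the fused vector has fixed shape.
    parts = [embeddings.get(m, np.zeros(EMB_DIM)) for m in MODALITIES]
    # Append the mask itself as a feature: under MMNAR, the pattern of
    # missingness carries clinical signal (e.g. no X-ray was ordered).
    return np.concatenate(parts + [missing_mask])

# Example patient with notes and structured data, but no imaging.
patient = {"notes": np.full(EMB_DIM, 0.5), "structured": np.ones(EMB_DIM)}
fused = mmnar_fuse(patient)
print(fused.shape)  # (36,): 4 modalities * 8 dims + 4-dim mask
```

In this toy version the missingness pattern is appended as raw features; a learned model on top can then specialize its behavior per pattern, in the spirit of the pattern-specific fine-tuning described above.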