VIDEO DOI: https://doi.org/10.48448/xcdn-pm34

poster

AMA Research Challenge 2024

November 07, 2024

Virtual only, United States

CoCluster: A Framework for Comorbidity, Complication, and Covariate Clustering via Machine Learning

Background Clustering is a machine learning technique that identifies distinct groups based on similarities in a given set of variables. This project leverages binary comorbidity, complication, or confounding data to identify distinct patient groups with varying risk profiles. Patients are grouped by similar combinations of values across a set of variables, with the aim of identifying key factors associated with an outcome as well as unique, nuanced interactions that are not otherwise apparent. We term this process of clustering based on comorbidities, complications, or covariates “CoClustering”. This study aims to guide researchers in applying CoClustering, including selecting appropriate clustering algorithms and understanding their statistical foundations.

Methods For the illustrative example provided in this framework, the 2015-2019 National Inpatient Sample (NIS) was queried for patients diagnosed with Cerebral Infarction. Using the K-Modes algorithm, patients were clustered based on 18 clinically relevant comorbidities and age group (<65, 65-79, 80+). Cluster quality was assessed using the Davies-Bouldin Index (DBI) and Calinski-Harabasz Index (CHI) to determine the optimal number of clusters. Post-clustering analysis included odds ratio analysis of mortality among clusters and multivariate logistic regression adjusting for sex, race, income quartile, and primary payer.
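The clustering step described above can be sketched in a few lines of Python. This is a minimal, illustrative K-Modes loop, not the authors' implementation: the function names (`hamming`, `column_modes`, `k_modes`), the random initialisation, and the six toy records are all assumptions made for the example. K-Modes replaces the means of K-Means with per-column modes and uses a simple matching (Hamming) dissimilarity, which is why it suits binary comorbidity flags.

```python
import random
from collections import Counter


def hamming(a, b):
    """Count the positions at which two equal-length records differ."""
    return sum(x != y for x, y in zip(a, b))


def column_modes(rows):
    """Return the most frequent value in each column of a non-empty list of records."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*rows))


def k_modes(records, k, max_iter=20, seed=0):
    """Cluster categorical records (tuples) into k groups; returns (labels, modes)."""
    rng = random.Random(seed)
    # Assumption: initialise the modes from k distinct records chosen at random.
    distinct = list(dict.fromkeys(records))
    modes = rng.sample(distinct, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each record goes to its nearest mode.
        new_labels = [
            min(range(k), key=lambda j: hamming(r, modes[j])) for r in records
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: recompute each mode from its current members.
        for j in range(k):
            members = [r for r, lab in zip(records, labels) if lab == j]
            if members:
                modes[j] = column_modes(members)
    return labels, modes


# Hypothetical toy data: each tuple is one patient, each position one binary flag.
records = [
    (1, 1, 0, 0),
    (1, 1, 0, 0),
    (1, 1, 1, 0),
    (0, 0, 1, 1),
    (0, 0, 1, 1),
    (0, 1, 1, 1),
]
labels, modes = k_modes(records, k=2)
print(labels, modes)
```

In practice an established K-Modes implementation would normally be preferred, and the number of clusters would be chosen by comparing quality indices such as DBI and CHI across several values of k, as the abstract describes.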

Results In the illustrative example, nine unique clusters were formed. Post-clustering analysis showed statistically significant differences in mortality: the highest-mortality group (Group 9) had an odds ratio (OR) of 6.05 (95% CI: 4.99-7.33) and an adjusted odds ratio (AOR) of 6.27 (95% CI: 5.11-7.69) when compared to Group 1.
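For readers unfamiliar with how an unadjusted odds ratio and its confidence interval are obtained from cluster membership and mortality, the following is a minimal sketch. It assumes a Wald interval on the log-odds scale; the abstract does not state which interval method was used, and the counts below are hypothetical values chosen for illustration, not figures from the study. The function name `odds_ratio_ci` is likewise an assumption.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval from a 2x2 table.

    a: deaths in the cluster of interest    b: survivors in that cluster
    c: deaths in the reference cluster      d: survivors in the reference cluster
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Wald approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, for illustration only (not taken from the study).
or_value, ci_low, ci_high = odds_ratio_ci(a=30, b=70, c=10, d=90)
print(f"OR = {or_value:.2f} (95% CI: {ci_low:.2f}-{ci_high:.2f})")
```

The adjusted odds ratio reported alongside it comes from a logistic regression that includes the additional covariates, so it cannot be reproduced from a 2x2 table alone.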

Conclusion This framework provides clinical researchers with a practical approach to applying clustering for the identification of subgroups in diverse clinical datasets, and a demonstration of the utility of clustering in analyzing high-dimensional comorbidity data. By effectively managing high dimensionality and sparsity in large datasets, machine learning clustering algorithms like K-Modes can reveal clinically relevant patterns and key variables associated with an outcome. The CoClustering technique can be applied broadly, ranging from a preliminary analysis of a clinical dataset that identifies salient variables for further investigation, to grouping patients with a common diagnosis to better understand individual prognosis, further enabling highly personalized care.
