
poster
Artificial intelligence for diagnosing exudative age-related macular degeneration: findings from a Cochrane Diagnostic Test Accuracy Review
Background: Exudative age-related macular degeneration (eAMD) is a retinal disorder that may result in rapid deterioration of central vision. Tests leveraging artificial intelligence (AI) hold the promise of automatically identifying and categorizing pathological features, enabling the timely diagnosis and treatment of eAMD. This Cochrane systematic review and meta-analysis aims to determine the diagnostic accuracy of AI as a triaging tool for eAMD.
Methods: We searched CENTRAL, MEDLINE, Embase, three clinical trial registries, and DANS (for grey literature) up to April 2024. We included studies that compared the test performance of algorithms with that of human readers in detecting eAMD on retinal images. Review authors worked in pairs to independently extract data and assess study quality using the Quality Assessment of Diagnostic Accuracy Studies-2 (QUADAS-2) tool with revised signaling questions. Assuming that the included studies applied a common positivity threshold, we used random-effects bivariate logistic models to estimate summary sensitivity and specificity as the primary performance metrics.
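As a rough illustration of the quantities that feed such a meta-analysis, the sketch below computes one study's sensitivity and specificity from its 2x2 table, with Wald intervals formed on the logit scale. Bivariate models of this kind typically place correlated random effects on logit sensitivity and logit specificity; the sketch covers only the per-study inputs, not the model fit itself. The function name `study_accuracy`, the 0.5 continuity correction, and the counts are assumptions made for illustration and are not taken from the review.

```python
import math

def logit(p):
    return math.log(p / (1.0 - p))

def inv_logit(x):
    return 1.0 / (1.0 + math.exp(-x))

def study_accuracy(tp, fp, fn, tn, z=1.96):
    """Per-study sensitivity and specificity with logit-scale Wald intervals.

    Returns {"sensitivity": (estimate, lower, upper), "specificity": (...)}.
    """
    # Assumption: add 0.5 to every cell only when a zero cell is present,
    # so that the logit and its standard error stay finite.
    if 0 in (tp, fp, fn, tn):
        tp, fp, fn, tn = (c + 0.5 for c in (tp, fp, fn, tn))
    results = {}
    for name, hit, miss in (("sensitivity", tp, fn), ("specificity", tn, fp)):
        est = hit / (hit + miss)
        se = math.sqrt(1.0 / hit + 1.0 / miss)  # standard error of logit(est)
        lower = inv_logit(logit(est) - z * se)
        upper = inv_logit(logit(est) + z * se)
        results[name] = (est, lower, upper)
    return results

# Hypothetical counts for a single study (not data from the review)
acc = study_accuracy(tp=90, fp=5, fn=10, tn=95)
```

Working on the logit scale keeps the interval bounds inside (0, 1), which matters when estimates sit close to 1, as they often do in diagnostic accuracy studies.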
Results: We identified 36 eligible studies that reported 40 sets of algorithm performance data, encompassing over 16,000 participants and 62,000 images. We included 28 studies (78%) reporting 31 algorithms with usable performance data in the meta-analysis. The included algorithms took different types of retinal images as input: optical coherence tomography images (N=15), fundus images (N=6), and multi-modal imaging (N=7). The predominant core method was deep neural networks. Only three of the included 40 algorithms were externally validated; the summary sensitivity and specificity were 0.94 (95% CI 0.90 to 0.97) and 0.99 (95% CI 0.76 to 1.00), respectively, when compared to human graders. All three studies were at high risk of bias, mainly due to potential selection bias from either a two-gate design or the inappropriate exclusion of potentially eligible retinal images. Twenty-eight algorithms were reportedly either internally validated or tested on a development set. The pooled sensitivity and specificity were 0.93 (95% CI 0.89 to 0.96) and 0.96 (95% CI 0.94 to 0.98), respectively, when compared to human graders. We did not identify significant sources of heterogeneity among these 28 algorithms.
Conclusion: Low to very low certainty evidence suggests that an algorithm-based test may correctly identify most individuals with eAMD without increasing unnecessary referrals (false positives) in either primary or specialty care settings. The limited quality and quantity of externally validated algorithms highlight the need for high-certainty evidence, which will require a standardized definition of eAMD across imaging modalities and external validation of algorithms to assess generalizability.
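To make the link between summary accuracy and referrals concrete, the sketch below converts a sensitivity, a specificity, and a prevalence into expected test outcomes per 1000 people. The sensitivity (0.93) and specificity (0.96) are the pooled estimates reported in the Results; the 10% prevalence and the function name `expected_per_1000` are purely hypothetical choices for illustration and do not come from the review.

```python
def expected_per_1000(sensitivity, specificity, prevalence, n=1000):
    """Expected test outcomes in a cohort of n people at a given prevalence."""
    diseased = n * prevalence
    healthy = n - diseased
    true_pos = sensitivity * diseased    # eAMD correctly flagged
    false_neg = diseased - true_pos      # eAMD missed
    true_neg = specificity * healthy     # correctly not referred
    false_pos = healthy - true_neg       # unnecessary referrals
    return {
        "true_positives": true_pos,
        "false_negatives": false_neg,
        "true_negatives": true_neg,
        "false_positives": false_pos,
    }

# Pooled estimates from the Results; the prevalence is an assumed value
outcomes = expected_per_1000(sensitivity=0.93, specificity=0.96, prevalence=0.10)
```

Under these assumptions, roughly 7 people with eAMD would be missed and roughly 36 people without eAMD would be referred per 1000 tested. The false-positive count scales with the share of people without the disease, so the same specificity produces different referral burdens in primary and specialty care.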