Background
Bankart lesions (anteroinferior glenoid labrum tears) can cause significant shoulder pain and dysfunction but are often difficult to detect on standard non-contrast MRI. To improve diagnostic sensitivity, MRI arthrography (MRA) is frequently used, although it is more invasive, costly, and uncomfortable for patients. Deep learning (DL), a subset of machine learning, learns complex patterns directly from large datasets without manually engineered features. DL has improved diagnostic accuracy across several areas of medical imaging and is particularly well suited to detecting subtle abnormalities. This study developed a DL model to detect Bankart lesions on both MRI and MRA and evaluated its clinical utility through a clinician user study.
Methods
A dataset of 586 shoulder MRIs/MRAs from 546 patients who underwent shoulder arthroscopy within one year of imaging was retrospectively analyzed. Arthroscopic findings served as the reference standard. A custom neural network ensemble was trained on 410 scans (224 MRI, 186 MRA), tuned on 59 (40 MRI, 19 MRA), and tested on 117 (71 MRI, 46 MRA). Gradient-weighted class activation maps (Grad-CAM) were used to assess the anatomical focus of model predictions. To evaluate clinical impact, a user study was conducted with two shoulder/elbow fellowship-trained orthopaedic surgeons and two orthopaedic residents. Each clinician reviewed all 117 test MRIs/MRAs in two phases: once without model predictions (unaided), and again after a 60-day washout period with DL predictions shown (aided).
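The abstract does not specify the network architecture or which layer the Grad-CAM maps were taken from, so the following is only a minimal sketch of the standard Grad-CAM combination step, not the study's implementation. It assumes the activations of one convolutional layer and the gradients of the class score with respect to them have already been extracted; the function name `grad_cam` and the toy values are hypothetical.

```python
def grad_cam(feature_maps, gradients):
    """Combine K feature maps (each H x W) into one Grad-CAM heatmap.

    feature_maps[k][i][j]: activation of channel k at position (i, j).
    gradients[k][i][j]: gradient of the class score w.r.t. that activation.
    """
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    cam = [[0.0] * width for _ in range(height)]
    for fmap, grad in zip(feature_maps, gradients):
        # Channel weight: global average of that channel's gradients.
        weight = sum(map(sum, grad)) / (height * width)
        for i in range(height):
            for j in range(width):
                cam[i][j] += weight * fmap[i][j]
    # ReLU keeps only regions that push the class score up.
    return [[max(0.0, v) for v in row] for row in cam]


# Toy 2-channel, 2x2 example with made-up values (not study data).
maps = [[[1.0, 0.0], [0.0, 2.0]], [[0.0, 3.0], [1.0, 0.0]]]
grads = [[[0.5, 0.5], [0.5, 0.5]], [[-1.0, -1.0], [-1.0, -1.0]]]
heatmap = grad_cam(maps, grads)
```

In practice the resulting low-resolution heatmap is upsampled to the image size and overlaid on the scan, which is how attention on a region such as the anterior labrum can be inspected.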
Results
For standard MRI, the model achieved 90.1% accuracy, 83.3% sensitivity, and 90.8% specificity—substantially outperforming radiology reports from the same scans (80.3% accuracy, 16.7% sensitivity, 86.2% specificity). On MRA, the model achieved 89.1% accuracy, 94.1% sensitivity, and 86.2% specificity, compared to radiology reports at 84.8% accuracy, 82.4% sensitivity, and 86.2% specificity. Grad-CAM showed consistent model attention on the anterior labrum. In the user study, mean clinician sensitivity improved from 38.0% to 78.3% with model assistance. Specificity slightly decreased (86.7% to 85.4%), while accuracy increased (77.1% to 84.0%). Clinician confidence increased by 0.65 points on a 10-point scale (p < 0.001).
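For readers less familiar with these metrics, the sketch below shows how accuracy, sensitivity, and specificity follow from confusion-matrix counts when arthroscopy is the reference standard. The function name `diagnostic_metrics` and the counts are hypothetical and chosen for illustration only; the abstract reports percentages, not per-class counts.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts.

    tp: lesion present at arthroscopy, predicted present
    fn: lesion present at arthroscopy, predicted absent
    tn: lesion absent at arthroscopy, predicted absent
    fp: lesion absent at arthroscopy, predicted present
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    sensitivity = tp / (tp + fn)  # share of true lesions detected
    specificity = tn / (tn + fp)  # share of lesion-free scans cleared
    return accuracy, sensitivity, specificity


# Hypothetical counts for illustration only; not taken from the study.
acc, sens, spec = diagnostic_metrics(tp=8, fn=2, tn=45, fp=5)
print(f"accuracy={acc:.1%} sensitivity={sens:.1%} specificity={spec:.1%}")
```

Because lesion-positive scans are typically a minority, accuracy can stay high even when sensitivity is very low, which is why the three metrics are reported together.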
Conclusion
On standard MRI, the model performed comparably to radiology reports on MRA, and Grad-CAM confirmed that its predictions were based on appropriate anatomic regions. The user study demonstrated that clinicians achieved greater sensitivity and confidence when aided by the model. These results highlight the potential for DL tools to close the gap between non-contrast and contrast-enhanced imaging and thereby avoid the added cost, discomfort, and invasiveness of MRA.
