Enhanced Macular Telangiectasia Type 2 Detection: Leveraging Self-Supervised Learning and Ensemble Models
- Shahrzad Gholami
- Lea Scheppke
- Meghana Kshirsagar
- Yue Wu
- Rahul Dodhia
- Roberto Bonelli
- Irene Leung
- Ferenc B Sallo
- Alyson Muldrew
- Catherine Jamison
- Tunde Peto
- Juan M. Lavista Ferres
- Bill Weeks
- Martin Friedlander
- Aaron Lee
- Lowy Medical Research Institute
Ophthalmology Science
Objective or purpose: To investigate an ensemble-based approach utilizing deep learning models for accurate and interpretable detection of Macular Telangiectasia Type 2 (MacTel) on optical coherence tomography (OCT) imaging.
Design: Retrospective analysis of OCT scans, model development and assessment.
Subjects, Participants, and/or Controls: A total of 5200 OCT images from participants in the MacTel Registry conducted by the Lowy Medical Research Institute and from the University of Washington (780 MacTel patients and 1900 non-MacTel patients).
Methods, Intervention, or Testing: We trained multiple individual MacTel vs. non-MacTel classification models using traditional supervised and self-supervised learning (SSL), and ensembled them using average weighting methods. We investigated diverse methodologies for constructing the ensemble, including varied architectural configurations and learning paradigms of individual models, and manipulating the amount of labeled data accessible for training. Model performance was compared against human expert graders on held-out test set data. Model interpretability was investigated using Grad-CAM visualization and by evaluating interrater agreement.
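The averaging step described above can be illustrated with a minimal sketch. The abstract does not give the weights, the decision threshold, or the number of models, so the equal default weights, the 0.5 threshold, and the function names `average_ensemble` and `classify` below are assumptions made for illustration, not the authors' implementation.

```python
def average_ensemble(model_probs, weights=None):
    """Combine per-model MacTel probabilities into one score per scan.

    model_probs: one list of probabilities per model, aligned by scan.
    weights: optional per-model weights (assumption: equal if omitted).
    """
    n_models = len(model_probs)
    if n_models == 0:
        raise ValueError("need at least one model")
    n_scans = len(model_probs[0])
    if any(len(p) != n_scans for p in model_probs):
        raise ValueError("all models must score the same scans")
    if weights is None:
        weights = [1.0] * n_models
    if len(weights) != n_models:
        raise ValueError("need exactly one weight per model")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    # Weighted mean of the model outputs for each scan.
    return [
        sum(w * p[i] for w, p in zip(weights, model_probs)) / total
        for i in range(n_scans)
    ]


def classify(scores, threshold=0.5):
    """Label a scan as MacTel (1) when its score reaches the threshold."""
    return [int(s >= threshold) for s in scores]


# Hypothetical outputs from two models (for example, one supervised
# and one SSL-pretrained) on the same two scans.
supervised = [0.9, 0.2]
ssl_pretrained = [0.7, 0.4]
ensemble_scores = average_ensemble([supervised, ssl_pretrained])
labels = classify(ensemble_scores)
```

Averaging probabilities rather than hard labels keeps the ensemble output continuous, so threshold-free metrics such as AUROC can still be computed on it.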
Main Outcome Measures: For model performance, area under the receiver operating characteristic curve (AUROC), area under the precision-recall curve (AUPRC), accuracy, sensitivity, and specificity were reported. For interpretability, interrater agreements and Grad-CAM visualization results were evaluated.
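As a reminder of how some of these measures are defined, the sketch below computes sensitivity, specificity, accuracy, and AUROC from binary labels and model scores using only the Python standard library. The data are made up, the 0.5 threshold is an assumption, and the pairwise AUROC formulation is one standard definition rather than necessarily the procedure used in the study. AUPRC and the confidence intervals are omitted.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels where 1 = MacTel."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def threshold_metrics(y_true, y_pred):
    """Sensitivity, specificity, and accuracy from hard predictions."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    sensitivity = tp / (tp + fn)  # fraction of MacTel cases detected
    specificity = tn / (tn + fp)  # fraction of non-MacTel cases cleared
    accuracy = (tp + tn) / len(y_true)
    return sensitivity, specificity, accuracy


def auroc(y_true, scores):
    """Probability that a random positive outscores a random negative.

    Ties count as half. This pairwise form is O(n_pos * n_neg), which
    is fine for a small illustration.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Made-up example: four scans, two MacTel (1) and two non-MacTel (0).
y_true = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.1]
y_pred = [int(s >= 0.5) for s in scores]  # assumed threshold of 0.5
sens, spec, acc = threshold_metrics(y_true, y_pred)
area = auroc(y_true, scores)
```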
Results: Despite access to only 419 OCT volumes (including 185 MacTel patients) in the 10% labeled training dataset, the ensemble model achieved performance (AUROC 0.972 [95% confidence interval (CI), 0.971-0.973], AUPRC 0.967 [95% CI, 0.965-0.969], accuracy 91.7%, sensitivity 0.905, and specificity 0.925) comparable to that of the human expert ensemble (AUROC 0.977 [95% CI, 0.975-0.978], AUPRC 0.987 [95% CI, 0.986-0.987], accuracy 96.8%, sensitivity 0.929, and specificity 1) on a test set of 500 patients. No individual model reached the same performance level when evaluated separately.
Conclusion: Even with limited data, combining SSL with ensemble approaches improved MacTel classification accuracy and interpretability compared to the individual models. SSL captures meaningful representations from unlabeled data, a key benefit when labeled data are limited, as with rare diseases.