V Hansen, J Jensen, M W Kusk, O Gerke, H B Tromborg, S Lysdahlgaard
Eur J Radiol. 2024 May:174:111399. doi: 10.1016/j.ejrad.2024.111399. Epub 2024 Feb 27.
Objective: To perform a systematic review and meta-analysis of the diagnostic accuracy of deep learning (DL) algorithms in the diagnosis of wrist fractures (WF) on plain wrist radiographs, taking healthcare expert consensus as the reference standard.
Methods: Embase, Medline, PubMed, Scopus and Web of Science were searched for the period from 1 January 2012 to 9 March 2023. Eligible studies included patients with wrist radiographs, had radial and ulnar fractures as the target condition, used DL algorithms based on convolutional neural networks (CNN), and took healthcare expert consensus as the minimum reference standard. Studies were assessed with a modified QUADAS-2 tool, and we applied a bivariate random-effects model for meta-analysis of diagnostic test accuracy data.
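As an illustration of the per-study quantities that a bivariate random-effects model typically takes as input, the following minimal Python sketch derives sensitivity, specificity, and their logit transforms from a 2x2 table. The function name `study_accuracy`, the 0.5 continuity correction, and the example counts are assumptions made for illustration; they are not taken from the included studies or from the authors' analysis code.

```python
import math


def logit(p):
    """Log-odds transform used on the proportion scale."""
    return math.log(p / (1.0 - p))


def study_accuracy(tp, fp, fn, tn):
    """Per-study accuracy measures from a 2x2 table (hypothetical helper).

    tp/fn: fractures detected / missed by the algorithm
    tn/fp: non-fractures correctly cleared / wrongly flagged
    """
    # Assumption: add 0.5 to every cell if any cell is zero, a common
    # continuity correction so the logits stay finite.
    if min(tp, fp, fn, tn) == 0:
        tp, fp, fn, tn = (x + 0.5 for x in (tp, fp, fn, tn))

    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    return {
        "sensitivity": sens,
        "specificity": spec,
        "logit_sens": logit(sens),
        "logit_spec": logit(spec),
        # Approximate within-study variances of the logits (delta method).
        "var_logit_sens": 1.0 / tp + 1.0 / fn,
        "var_logit_spec": 1.0 / tn + 1.0 / fp,
    }


# Hypothetical counts, for illustration only.
example = study_accuracy(tp=90, fp=5, fn=10, tn=95)
print(example)
```

The bivariate model then pools the paired logits across studies while allowing for between-study variation and correlation between sensitivity and specificity; that fitting step is left out here because it requires a mixed-model library.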
Results: Our study was registered at PROSPERO with ID: CRD42023431398. We included 6 unique studies for meta-analysis, with a total of 33,026 radiographs. Compared with the reference standards, the CNNs in the included studies achieved a summary sensitivity of 92% (95% CI: 80%-97%) and a summary specificity of 93% (95% CI: 76%-98%). The generalized bivariate I-squared statistic indicated considerable heterogeneity between the studies (81.90%). Four studies had one or more domains at high risk of bias and two studies had concerns regarding applicability.
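The reported confidence intervals are asymmetric around the point estimates, which is what one would expect if they were computed on the logit scale and back-transformed, as is usual for bivariate models. The sketch below checks this using only the rounded values reported above. That the intervals were built this way is an assumption, and the implied standard errors are rough back-calculations, not figures from the study.

```python
import math


def logit(p):
    return math.log(p / (1.0 - p))


def inv_logit(x):
    return 1.0 / (1.0 + math.exp(-x))


# (point estimate, lower 95% limit, upper 95% limit) as reported, rounded.
reported = {
    "sensitivity": (0.92, 0.80, 0.97),
    "specificity": (0.93, 0.76, 0.98),
}

implied = {}
for name, (est, lo, hi) in reported.items():
    # On the logit scale a Wald-type interval is symmetric, so its midpoint
    # should land close to the logit of the point estimate.
    mid = (logit(lo) + logit(hi)) / 2.0
    # Approximate standard error implied by the interval width.
    se = (logit(hi) - logit(lo)) / (2.0 * 1.96)
    implied[name] = {"midpoint_as_proportion": inv_logit(mid), "approx_se_logit": se}
    print(name, implied[name])
```

Under this assumption the back-transformed midpoints fall within about one percentage point of the reported summary estimates, so the asymmetry of the intervals is consistent with logit-scale estimation.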
Conclusion: The diagnostic accuracy of CNNs was comparable to that of healthcare experts on wrist radiographs for the investigation of WF. There is a need for studies with a robust reference standard, external dataset validation and investigation of the diagnostic performance of healthcare experts aided by CNNs.