Artificial Intelligence in Fracture Detection: A Systematic Review and Meta-Analysis
Rachel Y L Kuo, Conrad Harrison, Terry-Ann Curran, Benjamin Jones, Alexander Freethy, David Cussons, Max Stewart, Gary S Collins, Dominic Furniss
Radiology . 2022 Mar 29;211785. doi: 10.1148/radiol.211785. Online ahead of print.
Background: Patients with fractures are a common emergency presentation and may be misdiagnosed at radiologic imaging. An increasing number of studies apply artificial intelligence (AI) techniques to fracture detection as an adjunct to clinician diagnosis.
Purpose: To perform a systematic review and meta-analysis comparing the diagnostic performance in fracture detection between AI and clinicians in peer-reviewed publications and the gray literature (ie, articles published on preprint repositories).
Materials and Methods: A search of multiple electronic databases between January 2018 and July 2020 (updated June 2021) was performed that included any primary research studies that developed and/or validated AI for the purposes of fracture detection at any imaging modality and excluded studies that evaluated image segmentation algorithms. Meta-analysis with a hierarchical model to calculate pooled sensitivity and specificity was used. Risk of bias was assessed by using a modified Prediction Model Study Risk of Bias Assessment Tool, or PROBAST, checklist.
Results: Included for analysis were 42 studies, with 115 contingency tables extracted from 32 studies (55 061 images). Thirty-seven studies identified fractures on radiographs and five studies identified fractures on CT images. For internal validation test sets, the pooled sensitivity was 92% (95% CI: 88, 93) for AI and 91% (95% CI: 85, 95) for clinicians, and the pooled specificity was 91% (95% CI: 88, 93) for AI and 92% (95% CI: 89, 92) for clinicians. For external validation test sets, the pooled sensitivity was 91% (95% CI: 84, 95) for AI and 94% (95% CI: 90, 96) for clinicians, and the pooled specificity was 91% (95% CI: 81, 95) for AI and 94% (95% CI: 91, 95) for clinicians. There were no statistically significant differences between clinician and AI performance. There were 22 of 42 (52%) studies that were judged to have high risk of bias. Meta-regression identified multiple sources of heterogeneity in the data, including risk of bias and fracture type.
Conclusion: Artificial intelligence (AI) and clinicians had comparable reported diagnostic performance in fracture detection, suggesting that AI technology holds promise as a diagnostic adjunct in future clinical practice.
Read Full Article Here: https://doi.org/10.1148/radiol.211785