Deep learning to automate the labelling of head MRI datasets for computer vision applications
Eur Radiol. 2021 Jul 20. doi: 10.1007/s00330-021-08132-0. Online ahead of print.
David A Wood, Sina Kafiabadi, Aisha Al Busaidi, Emily L Guilhem, Jeremy Lynch, Matthew K Townend, Antanas Montvila, Martin Kiik, Juveria Siddiqui, Naveen Gadapa, Matthew D Benger, Asif Mazumder, Gareth Barker, Sebastian Ourselin, James H Cole, Thomas C Booth
Objectives: The purpose of this study was to build a deep learning model to derive labels from neuroradiology reports and assign these to the corresponding examinations, overcoming a bottleneck to computer vision model development.
Methods: Reference-standard labels were generated by a team of neuroradiologists for model training and evaluation. Three thousand examinations were labelled for the presence or absence of any abnormality by manually scrutinising the corresponding radiology reports ('reference-standard report labels'); a subset of these examinations (n = 250) was assigned 'reference-standard image labels' by interrogating the actual images. Separately, 2000 reports were labelled for the presence or absence of 7 specialised categories of abnormality (acute stroke, mass, atrophy, vascular abnormality, small vessel disease, white matter inflammation, encephalomalacia), with a subset of these examinations (n = 700) also assigned reference-standard image labels. A deep learning model was trained using labelled reports and validated in two ways: comparing predicted labels to (i) reference-standard report labels and (ii) reference-standard image labels. The area under the receiver operating characteristic curve (AUC-ROC) was used to quantify model performance. Accuracy, sensitivity, specificity, and F1 score were also calculated.
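The evaluation metrics named above can be illustrated with a minimal sketch. This is not the authors' code: the function names `binary_metrics` and `auc_roc`, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration. AUC-ROC is computed here as the probability that a randomly chosen abnormal examination receives a higher score than a randomly chosen normal one, with ties counted as one half.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity, specificity and F1 for binary labels (1 = abnormal)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
        "f1": 2 * tp / (2 * tp + fp + fn) if tp + fp + fn else 0.0,
    }


def auc_roc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC-ROC via pairwise comparison of positive and negative scores."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC-ROC needs at least one positive and one negative label")
    # A positive outranking a negative counts 1, a tie counts 0.5.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical, not taken from the study).
reference_labels = [1, 1, 1, 0, 0, 0, 1, 0]
model_scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]
predicted_labels = [1 if s >= 0.5 else 0 for s in model_scores]  # assumed threshold

print(binary_metrics(reference_labels, predicted_labels))
print(auc_roc(reference_labels, model_scores))  # 0.9375
```

Running the same functions once against report-derived labels and once against image-derived labels would give the two validation views described in the Methods. The pairwise AUC computation is quadratic in the number of examinations, so a rank-based implementation would be preferable for large test sets.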
Results: Accurate classification (AUC-ROC > 0.95) was achieved for all categories when tested against reference-standard report labels. A drop in performance (ΔAUC-ROC > 0.02) was seen for three categories (atrophy, encephalomalacia, vascular) when tested against reference-standard image labels, highlighting discrepancies in the original reports. Once trained, the model assigned labels to 121,556 examinations in under 30 min.
Conclusions: Our model accurately classifies head MRI examinations, enabling automated dataset labelling for downstream computer vision applications.
Read Full Article Here: https://doi.org/10.1007/s00330-021-08132-0