Chang Ho Ahn, Taewoo Kim, Kyungmin Jo, Seung Shin Park, Min Joo Kim, Ji Won Yoon, Taek Min Kim, Sang Youn Kim, Jung Hee Kim, Jaegul Choo
Radiology. 2025 Mar;314(3):e231650. doi: 10.1148/radiol.231650.
Background The detection and classification of adrenal nodules are crucial for their management.
Purpose To develop and test a deep learning model to automatically depict adrenal nodules on abdominal CT images and to simulate triaging performance in combination with human interpretation.
Materials and Methods This retrospective study (January 2000-December 2020) used an internal dataset enriched with adrenal nodules for model training and testing and an external dataset reflecting real-world practice for further simulated testing in combination with human interpretation. The deep learning model had a two-stage architecture, consisting of sequential detection and segmentation models, trained separately for the right and left adrenal glands. Model performance was evaluated using the area under the receiver operating characteristic curve (AUC) for nodule detection and intersection over union for nodule segmentation.
Results The internal dataset included a total of 995 patients. The AUCs for detecting right and left adrenal nodules in internal test set 1 (n = 153) were 0.98 (95% CI: 0.96, 1.00; P < .001) and 0.93 (95% CI: 0.87, 0.98; P < .001), respectively. The corresponding AUCs were 0.98 (95% CI: 0.97, 0.99; P < .001) and 0.97 (95% CI: 0.96, 0.97; P < .001) in the external test set (n = 12 080) and 0.90 (95% CI: 0.84, 0.95; P < .001) and 0.89 (95% CI: 0.85, 0.94; P < .001) in internal test set 2 (n = 1214). The median intersection over union was 0.64 (IQR, 0.43-0.71) and 0.53 (IQR, 0.40-0.64) for right and left adrenal nodules, respectively. Combining the model with human interpretation achieved high sensitivity (up to 100%) and specificity (up to 99%), with triaging performance ranging from 0.77 to 0.98.
Conclusion The deep learning model demonstrated high performance and has the potential to improve detection of incidental adrenal nodules.
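The two evaluation metrics named in the Materials and Methods, AUC for detection and intersection over union for segmentation, can be illustrated with a minimal sketch. This is not the authors' evaluation code. The function names `intersection_over_union` and `auc_from_scores`, the nested-list mask representation, and the toy inputs are all assumptions made for illustration. The AUC is computed here as the fraction of positive-negative pairs in which the positive case receives the higher score, with ties counted as half.

```python
def intersection_over_union(pred_mask, true_mask):
    """IoU between two binary masks given as equally shaped nested lists of 0/1.

    Hypothetical helper: the mask format is an assumption, not the study's.
    """
    intersection = 0
    union = 0
    for pred_row, true_row in zip(pred_mask, true_mask):
        for p, t in zip(pred_row, true_row):
            if p and t:
                intersection += 1
            if p or t:
                union += 1
    # Assumed convention: two empty masks are treated as perfect agreement.
    return 1.0 if union == 0 else intersection / union


def auc_from_scores(scores, labels):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy data, invented for illustration only.
predicted = [[1, 1, 0],
             [0, 1, 0]]
reference = [[1, 0, 0],
             [0, 1, 1]]
iou = intersection_over_union(predicted, reference)   # 2 shared / 4 total pixels

model_scores = [0.9, 0.8, 0.3, 0.1]
has_nodule = [1, 0, 1, 0]
auc = auc_from_scores(model_scores, has_nodule)       # 3 of 4 pairs ranked correctly
```

On the toy inputs, `iou` is 0.5 and `auc` is 0.75. The pairwise AUC loop is quadratic in the number of cases, which is acceptable for a sketch; a rank-based computation would be preferable for a test set of the size reported here.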