Materials and methods: A total of 380 patients with 385 pathologically confirmed adrenal lesions (101 malignant, 284 benign) were retrospectively included. Adrenal lesions were manually segmented on CT images and analyzed in a deep learning pipeline designed to differentiate benign from malignant lesions. Four predictive models were developed that incorporated combinations of radiological data (i.e., tumor size and spontaneous attenuation) and non-radiological data (i.e., medical history and laboratory results). Data from 267 patients were used as the training set and data from 113 patients as the test set. The diagnostic capabilities of the four models were estimated using sensitivity, specificity, accuracy, and area under the receiver operating characteristic curve (AUC), with histopathological findings as the gold standard. The reproducibility of manual segmentation was estimated using the Dice similarity coefficient after blinded resegmentation of 40 adrenal lesions by an independent radiologist.
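As an illustration of the reproducibility measure named above, the Dice similarity coefficient between two binary segmentation masks A and B is 2|A ∩ B| / (|A| + |B|). The following is a minimal sketch in Python with NumPy, not the study's actual code; the function name `dice_coefficient` and the toy masks are hypothetical, and returning 1.0 for two empty masks is an assumed convention.

```python
import numpy as np


def dice_coefficient(mask_a, mask_b):
    """Dice similarity coefficient between two binary masks of equal shape."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    if a.shape != b.shape:
        raise ValueError("masks must have the same shape")
    total = int(a.sum()) + int(b.sum())
    if total == 0:
        # Assumed convention: two empty masks are treated as perfect agreement.
        return 1.0
    intersection = int(np.logical_and(a, b).sum())
    return 2.0 * intersection / total


# Toy example (hypothetical masks, not study data):
# reader 1 marks 3 pixels, reader 2 marks 2 pixels, 2 pixels overlap.
reader_1 = [[1, 1, 0], [0, 1, 0]]
reader_2 = [[1, 1, 0], [0, 0, 0]]
print(dice_coefficient(reader_1, reader_2))  # 2 * 2 / (3 + 2) = 0.8
```

A value of 1 indicates identical masks and 0 indicates no overlap. In practice the coefficient would be computed per lesion on the 3D segmentation volumes and then summarized across the resegmented lesions.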
Results: Segmentation reproducibility achieved a mean Dice similarity coefficient of 0.92 ± 0.03 (range: 0.72-0.97). The most accurate model, which combined clinical, biological, and radiological data, achieved 84.2% accuracy (95% confidence interval: 79.9%, 88.6%) and an AUC of 0.93 (95% confidence interval: 0.899, 0.979) in the test set for the diagnosis of malignant adrenal lesions.
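For readers less familiar with these metrics, the sketch below shows how sensitivity, specificity, accuracy, and AUC can be computed from reference labels (1 = malignant, 0 = benign) and model outputs. It is an illustrative example only, not the study's pipeline: the function names `binary_metrics` and `auc`, the toy labels, and the toy scores are all hypothetical, and both functions assume each class is present at least once. The AUC is computed here through its pairwise (Mann-Whitney) interpretation, with ties counted as one half.

```python
import numpy as np


def binary_metrics(y_true, y_pred):
    """Sensitivity, specificity, and accuracy from binary labels and predictions."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    tp = int(np.sum(y_true & y_pred))    # malignant, predicted malignant
    tn = int(np.sum(~y_true & ~y_pred))  # benign, predicted benign
    fp = int(np.sum(~y_true & y_pred))   # benign, predicted malignant
    fn = int(np.sum(y_true & ~y_pred))   # malignant, predicted benign
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
    }


def auc(y_true, scores):
    """AUC as the probability that a positive case outscores a negative case."""
    y_true = np.asarray(y_true, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos = scores[y_true]
    neg = scores[~y_true]
    # Compare every positive score with every negative score; ties count 0.5.
    diff = pos[:, None] - neg[None, :]
    wins = np.sum(diff > 0) + 0.5 * np.sum(diff == 0)
    return float(wins / (pos.size * neg.size))


# Toy data (hypothetical, not study results).
labels = [1, 1, 0, 0, 0]
predictions = [1, 0, 0, 0, 1]
probabilities = [0.9, 0.4, 0.5, 0.2, 0.1]

print(binary_metrics(labels, predictions))
print(auc(labels, probabilities))
```

Sensitivity, specificity, and accuracy depend on the decision threshold applied to the model output, whereas the AUC summarizes ranking performance across all thresholds.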
Conclusion: A deep learning model integrating preoperative clinical, biological, and radiological features shows high diagnostic performance in differentiating benign from malignant adrenal lesions on initial CT examination.