Matteo Cavicchioli, Andrea Moglia, Ludovica Pierelli, Giacomo Pugliese, Pietro Cerveri
Comput Med Imaging Graph. 2024 Oct:117:102434. doi: 10.1016/j.compmedimag.2024.102434. Epub 2024 Sep 13.
Accurate segmentation of the pancreas in computed tomography (CT) is critical for diagnostics, surgical planning, and interventions. Recent studies have proposed supervised deep-learning models for segmentation, but their efficacy depends on the quality and quantity of the training data. Most such works used small-scale public datasets and did not demonstrate generalization to external datasets. This study explored how to optimize pancreas segmentation accuracy by identifying the ideal dataset size, understanding the resource implications, examining the impact of manual refinement, and assessing the influence of anatomical subregions. We present the AIMS-1300 dataset, which comprises 1,300 CT scans. Its manual annotation by medical experts required 938 h. A 2.5D UNet was implemented to assess the impact of training sample size on segmentation accuracy by partitioning the original AIMS-1300 dataset into 11 smaller subsets of progressively increasing size. For this model, training sets larger than 440 CTs did not lead to better segmentation performance. In contrast, nnU-Net and UNet with Attention Gate reached a plateau at 585 CTs. Generalization tests on the publicly available AMOS-CT dataset confirmed this outcome. As the size of the AIMS-1300 training partition increased, the number of error slices decreased, reaching a minimum at 730 and 440 CTs for the AIMS-1300 and AMOS-CT datasets, respectively. Segmentation metrics on the AIMS-1300 and AMOS-CT datasets improved more on the head than on the body and tail of the pancreas as the dataset size increased. By carefully considering the task and the characteristics of the available data, researchers can develop deep learning models without sacrificing performance even with limited data. This could accelerate the development and deployment of artificial intelligence tools for pancreas surgery and other surgical data science applications.
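
The sample-size experiment described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that the training subsets are nested (each larger subset contains the smaller ones) and that a plateau is declared when no larger subset improves the mean score by more than a fixed tolerance. The function names `nested_subsets` and `plateau_size`, the `seed`, and the tolerance `tol` are all hypothetical choices made for illustration; the abstract does not specify them.

```python
import random


def nested_subsets(case_ids, sizes, seed=0):
    """Return one training subset per requested size, smallest first.

    Assumption: subsets are nested, i.e. each larger subset contains every
    smaller one, so score differences reflect the added cases only.
    """
    rng = random.Random(seed)
    shuffled = list(case_ids)
    rng.shuffle(shuffled)
    return [shuffled[:n] for n in sorted(sizes)]


def plateau_size(sizes, scores, tol=0.005):
    """Return the smallest training size beyond which no larger size
    improves the score by more than `tol` (a hypothetical threshold)."""
    pairs = sorted(zip(sizes, scores))
    if not pairs:
        raise ValueError("sizes and scores must not be empty")
    for i, (n, score) in enumerate(pairs):
        # Check every larger training size against the current one.
        if all(later - score <= tol for _, later in pairs[i + 1:]):
            return n
    return pairs[-1][0]
```

In use, each subset returned by `nested_subsets` would train one model, the mean test-set score (for example, the Dice coefficient) would be recorded per subset, and `plateau_size` would then report the smallest subset whose score is not meaningfully exceeded by any larger one.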