M3T: three-dimensional Medical image classifier using Multi-plane and Multi-slice Transformer
Jinseong Jang, Dosik Hwang
In this study, we propose a three-dimensional Medical image classifier using Multi-plane and Multi-slice Transformer (M3T) network to classify Alzheimer’s disease (AD) in 3D MRI images. The proposed network synergistically combines a 3D CNN, a 2D CNN, and a Transformer for accurate AD classification. The 3D CNN performs natively 3D representation learning, while the 2D CNN performs 2D representation learning and makes use of weights pre-trained on large 2D databases. Because CNNs carry an inductive bias, they can efficiently extract locality information for AD-related abnormalities in local brain regions. The Transformer is applied after the CNNs to obtain attention relationships among multi-plane (axial, coronal, and sagittal) and multi-slice images. Without such inductive bias, the Transformer can also learn abnormalities distributed over wider regions of the brain. For training, we used a dataset from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) containing a total of 4,786 3D T1-weighted MRI images. For validation, we used datasets from three different institutions: the Australian Imaging, Biomarker and Lifestyle Flagship Study of Ageing (AIBL), the Open Access Series of Imaging Studies (OASIS), and a subset of ADNI data independent of the training dataset. The proposed M3T is compared with conventional 3D classification networks in terms of area under the curve (AUC) and classification accuracy for AD classification. The results show that M3T achieves the highest performance on the multi-institutional validation datasets, demonstrating the feasibility of efficiently combining CNNs and Transformers for 3D medical images.
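The multi-plane, multi-slice idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: it only shows how a 3D volume might be split into 2D slices along three orthogonal axes and flattened into one ordered sequence, which a per-slice 2D encoder and a Transformer could then consume. The function names `multi_plane_slices` and `slice_sequence` are hypothetical, and the mapping of array axes to the axial, coronal, and sagittal planes is an assumption that depends on the orientation convention of the data.

```python
from typing import Dict, List, Tuple

Slice2D = List[List[float]]
Volume = List[List[List[float]]]


def multi_plane_slices(volume: Volume) -> Dict[str, List[Slice2D]]:
    """Split a volume indexed as volume[i][j][k] into 2D slices along each axis.

    Assumption (hypothetical convention): fixing i gives sagittal slices,
    fixing j gives coronal slices, and fixing k gives axial slices.
    """
    ni, nj, nk = len(volume), len(volume[0]), len(volume[0][0])
    # First index fixed: each slice has shape (nj, nk).
    sagittal = [[[volume[i][j][k] for k in range(nk)] for j in range(nj)]
                for i in range(ni)]
    # Second index fixed: each slice has shape (ni, nk).
    coronal = [[[volume[i][j][k] for k in range(nk)] for i in range(ni)]
               for j in range(nj)]
    # Third index fixed: each slice has shape (ni, nj).
    axial = [[[volume[i][j][k] for j in range(nj)] for i in range(ni)]
             for k in range(nk)]
    return {"axial": axial, "coronal": coronal, "sagittal": sagittal}


def slice_sequence(volume: Volume) -> List[Tuple[str, int, Slice2D]]:
    """Flatten all planes into one ordered list of (plane, slice index, slice).

    The plane name and slice index are kept so that a downstream model could
    attach plane and position information to each slice's feature vector.
    """
    planes = multi_plane_slices(volume)
    return [(name, idx, sl)
            for name in ("axial", "coronal", "sagittal")
            for idx, sl in enumerate(planes[name])]
```

For a volume with side lengths `ni`, `nj`, and `nk`, the resulting sequence has `ni + nj + nk` entries. In a full pipeline each slice would presumably be encoded by the 2D CNN before attention is computed across the sequence, but that part is omitted here.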