Chaoqi Han, Bingcai Chen, Chanjuan Liu, Qian Ning, Victor C.M. Leung, Shouzhen Jiao, Qing Liu
Convolutional neural networks (CNNs) and attention-based Transformers have emerged as the preferred models for medical image processing. Recently, network architectures that rely solely on multilayer perceptrons (MLPs) have gained popularity and demonstrated excellent results in various computer vision tasks. In particular, CycleMLP performs well in dense prediction tasks owing to its adaptability to image size and its linear computational complexity. However, the basic operator of CycleMLP uses fixed sampling locations for every feature map, so it samples few target-organ locations in medical images, which are characterized by an extreme imbalance between foreground and background. As a result, effectively extracting the features of target organs is challenging. In this paper, we propose a new MLP-like module, SlideMLP, that accounts for the sparsity of target organs in medical images. The module extracts a set of offsets from the input feature maps and uses these offsets to re-select the sampling points. This approach increases the sampling rate of target organs while retaining the advantages of CycleMLP. Additionally, we construct a U-shaped, pure-MLP network using this module and assess its robustness on two datasets with different modalities. Comparisons with state-of-the-art methods show that the proposed method achieves a high Dice Similarity Coefficient (DSC) while using fewer parameters.
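For readers unfamiliar with the evaluation metric, the following is a minimal sketch of the standard DSC definition, DSC = 2|A ∩ B| / (|A| + |B|), for two binary masks. It is not the evaluation code used in this work; the function name `dice_coefficient`, the flat-list mask representation, and the convention for two empty masks are assumptions made purely for illustration.

```python
def dice_coefficient(pred, target):
    """Dice Similarity Coefficient between two binary masks.

    Both masks are flat sequences of 0/1 values of equal length
    (a hypothetical representation chosen for this sketch).
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of elements")
    # |A ∩ B|: positions where both masks mark foreground.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # |A| + |B|: total foreground count over both masks.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    # Assumed convention: two empty masks count as a perfect match.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


# Toy example with a sparse foreground, as is typical for organ masks.
prediction = [0, 0, 1, 1, 0, 0, 0, 0]
ground_truth = [0, 1, 1, 1, 0, 0, 0, 0]
score = dice_coefficient(prediction, ground_truth)
```

Because the score depends only on foreground counts, it is not inflated by the large background regions, which is why it suits the foreground-background imbalance described above.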