Abstract
Medical image segmentation is crucial for lesion localization and surgical navigation. Recent advances in the field have been driven by Convolutional Neural Networks (CNNs) and Transformers. However, CNNs struggle to capture long-range dependencies because of their weight sharing and localized receptive fields, which makes organs of varying shape hard to handle. Transformers offer global receptive fields instead, but their spatial and computational complexity is prohibitively high for 3D medical images. To address this issue, we propose a novel series-parallel network that combines convolution and self-attention for 3D medical image segmentation. A serial 3D CNN encoder extracts multi-level feature maps, which are fused via a feature pyramid network; four parallel Transformer branches then capture global features. To model long-range information efficiently, we introduce patch self-attention, which divides the input into non-overlapping patches and computes attention between corresponding pixels across patches. Experiments on 3D MRI prostate and left atrial segmentation confirm that our network outperforms other CNN- and Transformer-based networks, achieving both higher segmentation accuracy and faster inference.
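The patch self-attention idea described above can be sketched as follows. This is a minimal 2D NumPy illustration under stated assumptions: identity Q/K/V projections, a single head, and a 2D feature map rather than the 3D volumes and learned projections the network actually uses. It is not the paper's implementation, only an illustration of restricting attention to corresponding pixels across non-overlapping patches:

```python
import numpy as np

def patch_self_attention(x, p=2):
    """Illustrative patch self-attention on a 2D feature map.

    x: (H, W, C) feature map; H and W must be divisible by p.
    The map is split into non-overlapping p x p patches, and attention
    is computed only among pixels occupying the same position within
    different patches, so each pixel attends to (H*W)/(p*p) tokens
    instead of all H*W, reducing the attention cost by a factor of p*p.
    """
    H, W, C = x.shape
    nh, nw = H // p, W // p
    # (nh, p, nw, p, C) -> (p*p, nh*nw, C): group tokens by within-patch position
    t = x.reshape(nh, p, nw, p, C).transpose(1, 3, 0, 2, 4).reshape(p * p, nh * nw, C)
    # scaled dot-product attention per within-patch position (identity Q/K/V)
    scores = t @ t.transpose(0, 2, 1) / np.sqrt(C)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    out = weights @ t
    # fold the attended sequences back onto the spatial grid
    return out.reshape(p, p, nh, nw, C).transpose(2, 0, 3, 1, 4).reshape(H, W, C)
```

With a patch size of p, each attention map covers only every p-th pixel in each direction, which is what keeps the cost manageable for volumetric inputs.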
Acknowledgment
This work is sponsored by the National Natural Science Foundation of China (Grant No. 61871440), China Postdoctoral Science Foundation (Grant No. 2023M731204), and CAAI-Huawei MindSpore Open Fund.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Cite this paper
Yu, B., Zhou, Q., Zhang, X. (2024). SPCTNet: A Series-Parallel CNN and Transformer Network for 3D Medical Image Segmentation. In: Fang, L., Pei, J., Zhai, G., Wang, R. (eds) Artificial Intelligence. CICAI 2023. Lecture Notes in Computer Science(), vol 14473. Springer, Singapore. https://doi.org/10.1007/978-981-99-8850-1_31
DOI: https://doi.org/10.1007/978-981-99-8850-1_31
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-8849-5
Online ISBN: 978-981-99-8850-1
eBook Packages: Computer Science (R0)