
msFormer: Adaptive Multi-Modality 3D Transformer for Medical Image Segmentation

  • Conference paper
  • Conference: Pattern Recognition and Computer Vision (PRCV 2022)
  • Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13535)

Abstract

In recent years, Convolutional Neural Networks (CNNs) have dominated medical image segmentation, but they struggle to represent long-range dependencies. Recently, the Transformer has been applied to medical image segmentation: architectures built on its self-attention mechanism, the core of the Transformer, can encode long-range dependencies in images with highly expressive learning capacity. In this paper, we introduce msFormer, an adaptive multi-modality 3D medical image segmentation network based on the Transformer, which also serves as a powerful 3D fusion network and extends the Transformer to multi-modality medical image segmentation. The fusion network is modeled as a U-shaped structure that exploits complementary features of different modalities at multiple scales, enriching the volumetric representations. We conducted a comprehensive experimental analysis on the Prostate and BraTS2021 datasets. Our method achieves average DSCs of 0.905 and 0.851 on these two datasets, respectively, outperforming existing state-of-the-art methods.
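Only the abstract is reproduced on this page, so the architecture details are not shown here. For intuition only, the sketch below illustrates the general idea the abstract describes: fusing token streams from two imaging modalities with cross-attention at one scale of a U-shaped 3D Transformer. Everything in it (the CrossModalFusion class, the tensor shapes, and the residual-plus-projection merge) is a hypothetical illustration, not the authors' msFormer implementation.

```python
# Hypothetical sketch (not the authors' code): cross-modality attention
# fusion at one scale of a U-shaped 3D Transformer. Tokens are flattened
# 3D patches; each modality attends to the other so complementary
# features are mixed before the next encoder stage.
import torch
import torch.nn as nn

class CrossModalFusion(nn.Module):
    def __init__(self, dim: int, heads: int = 8):
        super().__init__()
        self.norm_a = nn.LayerNorm(dim)
        self.norm_b = nn.LayerNorm(dim)
        # Modality A queries modality B, and vice versa.
        self.attn_ab = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.attn_ba = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.proj = nn.Linear(2 * dim, dim)  # merge the two fused streams

    def forward(self, tok_a: torch.Tensor, tok_b: torch.Tensor) -> torch.Tensor:
        # tok_a, tok_b: (batch, num_patches, dim) token sequences,
        # one per imaging modality (e.g. two MRI sequences).
        a, b = self.norm_a(tok_a), self.norm_b(tok_b)
        fused_a, _ = self.attn_ab(a, b, b)   # A attends to B
        fused_b, _ = self.attn_ba(b, a, a)   # B attends to A
        return self.proj(torch.cat([tok_a + fused_a, tok_b + fused_b], dim=-1))

# Example: 2 volumes, 512 patch tokens of width 96 per modality.
fusion = CrossModalFusion(dim=96)
out = fusion(torch.randn(2, 512, 96), torch.randn(2, 512, 96))
print(out.shape)  # torch.Size([2, 512, 96])
```

In a full U-shaped network, a block like this would sit at each encoder scale, with the fused tokens passed both to the next stage and across skip connections to the decoder.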


Notes

  1. http://medicaldecathlon.com/.
  2. http://braintumorsegmentation.org/.
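The abstract reports results on these two datasets as average DSC (Dice similarity coefficient). For reference, here is a minimal sketch of the metric itself; the smoothing constant and binary-mask setting are assumptions, since the paper's exact evaluation protocol is not reproduced on this page.

```python
# Minimal Dice similarity coefficient (DSC) for binary 3D masks.
# The smoothing term and binary setting are illustrative assumptions;
# the paper's exact evaluation protocol is not shown on this page.
import numpy as np

def dice(pred: np.ndarray, gt: np.ndarray, eps: float = 1e-8) -> float:
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    return float(2.0 * inter / (pred.sum() + gt.sum() + eps))

# Example on a toy 3D volume: identical masks give DSC close to 1.0.
mask = np.zeros((8, 8, 8), dtype=bool)
mask[2:6, 2:6, 2:6] = True
print(dice(mask, mask))  # ~1.0
```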


Acknowledgment

This work was supported in part by the National Natural Science Foundation of China (Grant Nos. 61901074 and 61902046), the Science and Technology Research Program of Chongqing Municipal Education Commission (Grant Nos. KJQN201900636 and KJQN201900631), the China Postdoctoral Science Foundation (Grant No. 2021M693771), and the Chongqing Postgraduates' Innovation Project (CYS21310).

Author information

Corresponding author

Correspondence to Shenhai Zheng.


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Tan, J., Jiang, C., Li, L., Li, H., Li, W., Zheng, S. (2022). msFormer: Adaptive Multi-Modality 3D Transformer for Medical Image Segmentation. In: Yu, S., et al. Pattern Recognition and Computer Vision. PRCV 2022. Lecture Notes in Computer Science, vol 13535. Springer, Cham. https://doi.org/10.1007/978-3-031-18910-4_26


  • DOI: https://doi.org/10.1007/978-3-031-18910-4_26

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-18909-8

  • Online ISBN: 978-3-031-18910-4

  • eBook Packages: Computer Science, Computer Science (R0)
