NestedFormer: Nested Modality-Aware Transformer for Brain Tumor Segmentation

Xing, Zhaohu; Yu, Lequan; Wan, Liang; Han, Tong; Zhu, Lei

doi:10.1007/978-3-031-16443-9_14


Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13435)

Included in the following conference series:

International Conference on Medical Image Computing and Computer-Assisted Intervention

Abstract

Multi-modal MR imaging is routinely used in clinical practice to diagnose and investigate brain tumors because it provides rich complementary information. Previous multi-modal MRI segmentation methods usually perform modal fusion by concatenating multi-modal MRIs at an early or middle stage of the network, which hardly explores the non-linear dependencies between modalities. In this work, we propose a novel Nested Modality-Aware Transformer (NestedFormer) to explicitly explore the intra-modality and inter-modality relationships of multi-modal MRIs for brain tumor segmentation. Built on a transformer-based multi-encoder and single-decoder structure, we perform nested multi-modal fusion for high-level representations of different modalities and apply modality-sensitive gating (MSG) at lower scales for more effective skip connections. Specifically, the multi-modal fusion is conducted in our proposed Nested Modality-aware Feature Aggregation (NMaFA) module, which enhances long-term dependencies within individual modalities via a tri-orientated spatial-attention transformer, and further complements key contextual information among modalities via a cross-modality attention transformer. Extensive experiments on the BraTS2020 benchmark and a private meningioma segmentation (MeniSeg) dataset show that NestedFormer clearly outperforms state-of-the-art methods. The code is available at https://github.com/920232796/NestedFormer.
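The nested idea described in the abstract (attention within each modality first, then attention across modalities) can be illustrated with a minimal NumPy sketch. This is not the authors' NMaFA module: it omits the learned query/key/value projections, the tri-orientated (per-axis) attention, and the gating, and the names `attention` and `nested_fusion`, the token layout `(M, N, d)`, and the final averaging step are all assumptions made for illustration only.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def attention(q, k, v):
    """Scaled dot-product attention: q is (Nq, d), k and v are (Nk, d)."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores, axis=-1) @ v


def nested_fusion(feats):
    """Hypothetical two-stage fusion of feats with shape (M, N, d):
    M modalities, N tokens per modality, feature width d."""
    # Stage 1 (intra-modality): self-attention over the N tokens of
    # each modality, computed independently per modality.
    intra = np.stack([attention(f, f, f) for f in feats])    # (M, N, d)
    # Stage 2 (inter-modality): at each token position, the M modality
    # features attend to one another.
    per_pos = intra.transpose(1, 0, 2)                       # (N, M, d)
    inter = np.stack([attention(t, t, t) for t in per_pos])  # (N, M, d)
    # Assumption: collapse the modality axis by simple averaging.
    return inter.mean(axis=1)                                # (N, d)


# Usage: 4 modalities (BraTS provides T1, T1ce, T2 and FLAIR),
# 8 tokens each, width 16. The numbers are arbitrary.
rng = np.random.default_rng(0)
feats = rng.normal(size=(4, 8, 16))
fused = nested_fusion(feats)
print(fused.shape)  # (8, 16)
```

Unlike early fusion by concatenation, the second stage lets the weight given to each modality depend on the features themselves, which is the kind of non-linear inter-modality dependency the abstract refers to.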

Acknowledgments

This work was supported by a grant from the Tianjin Natural Science Foundation (Grant No. 20JCYBJC00960) and the HKU Seed Fund for Basic Research (Project No. 202111159073).

Author information

Authors and Affiliations

Medical College of Tianjin University, Tianjin, China
Zhaohu Xing & Liang Wan
The University of Hong Kong, Hong Kong, China
Lequan Yu
Brain Medical Center of Tianjin University, Huanhu Hospital, Tianjin, China
Tong Han
The Hong Kong University of Science and Technology (Guangzhou), Guangzhou, China
Lei Zhu
The Hong Kong University of Science and Technology, Hong Kong, China
Lei Zhu

Corresponding author

Correspondence to Liang Wan.

Editor information

Editors and Affiliations

Rochester Institute of Technology, Rochester, NY, USA
Linwei Wang
Chinese University of Hong Kong, Hong Kong, Hong Kong
Qi Dou
University of Virginia, Charlottesville, VA, USA
P. Thomas Fletcher
National Center for Tumor Diseases (NCT/UCC), Dresden, Germany
Stefanie Speidel
Case Western Reserve University, Cleveland, OH, USA
Shuo Li

Cite this paper

Xing, Z., Yu, L., Wan, L., Han, T., Zhu, L. (2022). NestedFormer: Nested Modality-Aware Transformer for Brain Tumor Segmentation. In: Wang, L., Dou, Q., Fletcher, P.T., Speidel, S., Li, S. (eds) Medical Image Computing and Computer Assisted Intervention – MICCAI 2022. MICCAI 2022. Lecture Notes in Computer Science, vol 13435. Springer, Cham. https://doi.org/10.1007/978-3-031-16443-9_14

DOI: https://doi.org/10.1007/978-3-031-16443-9_14
Published: 16 September 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-16442-2
Online ISBN: 978-3-031-16443-9
eBook Packages: Computer Science, Computer Science (R0)
