DBTrans: A Dual-Branch Vision Transformer for Multi-Modal Brain Tumor Segmentation

  • Conference paper
Medical Image Computing and Computer Assisted Intervention – MICCAI 2023 (MICCAI 2023)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14223)


Abstract

3D Spatially Aligned Multi-modal MRI Brain Tumor Segmentation (SAMM-BTS) is a crucial task for clinical diagnosis. While Transformer-based models have shown outstanding success in this field thanks to their ability to model global features with the self-attention mechanism, they still face two challenges. First, owing to its high computational complexity and its deficiencies in modeling local features, the traditional self-attention mechanism is ill-suited for SAMM-BTS tasks, which require modeling both global and local volumetric features within acceptable computational overhead. Second, existing models simply stack the spatially aligned multi-modal data along the channel dimension, without any dedicated processing of these multi-channel inputs in the model's internal design. To address these challenges, we propose a Transformer-based model for the SAMM-BTS task, named DBTrans, with dual-branch architectures for both the encoder and decoder. Specifically, the encoder implements two parallel feature-extraction branches: a local branch based on Shifted Window Self-attention and a global branch based on Shuffle Window Cross-attention, capturing both local and global information with linear computational complexity. In addition, we add an extra global branch based on Shifted Window Cross-attention to the decoder, introducing the key and value matrices from the corresponding encoder block so that the segmented target can access a more complete context during up-sampling. Furthermore, the dual-branch designs in both the encoder and decoder are integrated with improved channel-attention mechanisms to fully exploit the contribution of features at different channels. Experimental results demonstrate the superiority of DBTrans in both qualitative and quantitative measures. Code will be released at https://github.com/Aru321/DBTrans.
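The contrast between the local and global branches comes down to how tokens are grouped into attention windows. The following is a minimal NumPy sketch of that idea on a 1D token sequence; it is an illustration under stated assumptions, not the paper's implementation: DBTrans operates on flattened 3D volumes, and the function names, token layout, and window size here are all hypothetical.

```python
import numpy as np

def window_partition(x, w):
    """Split a (L, C) token sequence into (L // w, w, C) non-overlapping
    windows; attention inside each window costs O(w^2), so the total cost
    stays linear in the sequence length L."""
    L, C = x.shape
    assert L % w == 0, "sequence length must be divisible by the window size"
    return x.reshape(L // w, w, C)

def shifted_windows(x, w):
    """Shifted-window (Swin-style) partition: roll the tokens by half a
    window so that, across layers, neighbouring windows exchange information."""
    return window_partition(np.roll(x, shift=-(w // 2), axis=0), w)

def shuffled_windows(x, w):
    """Shuffle-window partition: interleave tokens with a transpose-based
    shuffle (as in ShuffleNet's channel shuffle) so every window gathers
    tokens spread across the whole sequence, approximating a global
    receptive field at the same linear cost."""
    L, _ = x.shape
    idx = np.arange(L).reshape(w, L // w).T.reshape(-1)
    return window_partition(x[idx], w)

# 8 tokens with 2 channels each: token i carries features (2i, 2i + 1).
x = np.arange(16, dtype=float).reshape(8, 2)
print(window_partition(x, 4)[0, :, 0])  # local window: tokens 0..3
print(shuffled_windows(x, 4)[0, :, 0])  # "global" window: tokens 0, 2, 4, 6
```

Within each window, the local branch applies self-attention while the global branch applies cross-attention; the shuffle only changes which tokens each window sees, which is how the design keeps a global view without quadratic cost.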

X. Zeng and P. Zeng contributed equally to this work.



Acknowledgement

This work was supported by the National Natural Science Foundation of China (NSFC 62371325, 62071314), the Sichuan Science and Technology Program (2023YFG0263, 2023YFG0025, 2023NSFSC0497), and the Opening Foundation of the Agile and Intelligent Computing Key Laboratory of Sichuan Province.

Author information

Correspondence to Yan Wang.


Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Zeng, X., Zeng, P., Tang, C., Wang, P., Yan, B., Wang, Y. (2023). DBTrans: A Dual-Branch Vision Transformer for Multi-Modal Brain Tumor Segmentation. In: Greenspan, H., et al. Medical Image Computing and Computer Assisted Intervention – MICCAI 2023. MICCAI 2023. Lecture Notes in Computer Science, vol 14223. Springer, Cham. https://doi.org/10.1007/978-3-031-43901-8_48


  • DOI: https://doi.org/10.1007/978-3-031-43901-8_48


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-43900-1

  • Online ISBN: 978-3-031-43901-8

  • eBook Packages: Computer Science, Computer Science (R0)
