
A Hybrid Network Based on nnU-Net and Swin Transformer for Kidney Tumor Segmentation

  • Conference paper
Kidney and Kidney Tumor Segmentation (KiTS 2023)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14540)

Abstract

Kidney cancer is one of the most common cancers. Precise delineation and localization of the lesion area play a crucial role in the diagnosis and treatment of kidney cancer. Deep learning-based automatic medical image segmentation can help confirm the diagnosis. The traditional 3D nnU-Net, built on convolutional layers, is widely used in medical image segmentation. However, the fixed receptive field of convolutional neural networks introduces an inductive bias that limits their ability to capture long-range spatial information in input images. The Swin Transformer addresses this limitation with the global contextual modeling it gains from self-attention, but it requires a large amount of training data and is weaker at encoding local features. To overcome these limitations, our paper proposes a hybrid network called STransUnet, which combines nnU-Net with the Swin Transformer. STransUnet retains the local feature encoding capability of nnU-Net while introducing the Swin Transformer to capture a broader range of global contextual information, yielding a more powerful model for image segmentation tasks. In the KiTS23 challenge, our average Dice and average Surface Dice on the test set are 0.801 and 0.680, ranking 6th and 8th respectively, and our tumor Dice is 0.687.
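The abstract describes the general recipe (a convolutional U-shaped network whose bottleneck gains global context from transformer blocks) rather than the exact architecture. Below is a minimal PyTorch sketch of that idea. It is not the authors' implementation: the real model builds on 3D nnU-Net with Swin Transformer blocks, whereas here plain multi-head self-attention stands in for shifted-window attention and the network is deliberately tiny. All class names, depths, and hyperparameters are illustrative.

import torch
import torch.nn as nn

class ConvBlock(nn.Module):
    # Two 3x3x3 conv layers with instance norm, as in a typical nnU-Net stage.
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv3d(in_ch, out_ch, 3, padding=1),
            nn.InstanceNorm3d(out_ch), nn.LeakyReLU(inplace=True),
            nn.Conv3d(out_ch, out_ch, 3, padding=1),
            nn.InstanceNorm3d(out_ch), nn.LeakyReLU(inplace=True),
        )

    def forward(self, x):
        return self.net(x)

class TransformerBottleneck(nn.Module):
    # Global self-attention over bottleneck voxels: a simplified stand-in
    # for the Swin Transformer blocks used in the paper.
    def __init__(self, dim, heads=8):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm2 = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(),
                                 nn.Linear(4 * dim, dim))

    def forward(self, x):
        b, c, d, h, w = x.shape
        t = x.flatten(2).transpose(1, 2)          # (B, D*H*W, C) voxel tokens
        q = self.norm1(t)
        t = t + self.attn(q, q, q, need_weights=False)[0]
        t = t + self.mlp(self.norm2(t))
        return t.transpose(1, 2).reshape(b, c, d, h, w)

class HybridSegNet(nn.Module):
    # Tiny U-shaped hybrid: conv encoder/decoder, transformer bottleneck.
    # KiTS23 segments kidney, tumor, and cyst, so 4 classes with background.
    def __init__(self, in_ch=1, n_classes=4, base=16):
        super().__init__()
        self.enc1 = ConvBlock(in_ch, base)
        self.pool = nn.MaxPool3d(2)
        self.enc2 = ConvBlock(base, base * 2)
        self.bottleneck = TransformerBottleneck(base * 2)
        self.up = nn.ConvTranspose3d(base * 2, base, 2, stride=2)
        self.dec = ConvBlock(base * 2, base)       # upsampled + skip features
        self.head = nn.Conv3d(base, n_classes, 1)

    def forward(self, x):
        s1 = self.enc1(x)                          # full-resolution skip
        s2 = self.enc2(self.pool(s1))              # downsampled features
        g = self.bottleneck(s2)                    # inject global context
        u = self.up(g)
        return self.head(self.dec(torch.cat([u, s1], dim=1)))

if __name__ == "__main__":
    net = HybridSegNet()
    ct_patch = torch.randn(1, 1, 16, 32, 32)       # (B, C, D, H, W) CT patch
    print(net(ct_patch).shape)                     # torch.Size([1, 4, 16, 32, 32])

The design intent the sketch illustrates: convolutions keep their strength at local feature encoding in the encoder and decoder, while the attention block at the low-resolution bottleneck (where token counts are affordable) supplies the long-range spatial context that a fixed convolutional receptive field misses.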



Acknowledgment

This work was supported by the National Key Research and Development Program of China (Grant No. 2021YFC2009200).

Author information

Corresponding author

Correspondence to Daidi Zhong.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Qian, L., Luo, L., Zhong, Y., Zhong, D. (2024). A Hybrid Network Based on nnU-Net and Swin Transformer for Kidney Tumor Segmentation. In: Heller, N., et al. Kidney and Kidney Tumor Segmentation. KiTS 2023. Lecture Notes in Computer Science, vol 14540. Springer, Cham. https://doi.org/10.1007/978-3-031-54806-2_5

  • DOI: https://doi.org/10.1007/978-3-031-54806-2_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-54805-5

  • Online ISBN: 978-3-031-54806-2

  • eBook Packages: Computer Science, Computer Science (R0)
