Abstract
A SwinE-UNet3+ model is proposed to address two limitations of convolutional neural networks in tumor segmentation tasks: their limited receptive field prevents them from capturing long-range feature dependencies, and they are insensitive to contour details. Each encoder layer of SwinE-UNet3+ uses two consecutive Swin Transformer blocks to extract features, especially long-range features in images, and Patch Merging is used for down-sampling between encoder layers. The decoder performs progressive up-sampling with Conv2DTranspose and uses a convolution operation to aggregate the up-sampled decoder features with the encoder features passed through skip connections. The proposed model is evaluated on the TipDM Cup rectal cancer dataset and the ISIC-2017 melanoma dermoscopy dataset. Experimental results show that the SwinE-UNet3+ model outperforms the UNet, UNet++ and UNet3+ models in Dice coefficient, IoU and Precision.
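The decoder step described above (transposed-convolution up-sampling, skip concatenation, then convolutional aggregation) can be sketched as a single stage. This is a minimal illustration written in PyTorch, not the authors' implementation: the class name, channel counts, and the BatchNorm/ReLU aggregation head are assumptions for the sketch (the paper's Conv2DTranspose corresponds to `nn.ConvTranspose2d` here).

```python
import torch
import torch.nn as nn


class DecoderStage(nn.Module):
    """One illustrative SwinE-UNet3+-style decoder stage:
    2x transposed-conv up-sampling, concatenation with the
    encoder skip connection, then convolutional aggregation."""

    def __init__(self, in_ch: int, skip_ch: int, out_ch: int):
        super().__init__()
        # learned 2x up-sampling of the deeper decoder features
        self.up = nn.ConvTranspose2d(in_ch, out_ch, kernel_size=2, stride=2)
        # convolution that fuses up-sampled and skip-connected features
        self.fuse = nn.Sequential(
            nn.Conv2d(out_ch + skip_ch, out_ch, kernel_size=3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x: torch.Tensor, skip: torch.Tensor) -> torch.Tensor:
        x = self.up(x)                   # (N, out_ch, 2H, 2W)
        x = torch.cat([x, skip], dim=1)  # skip connection from the encoder
        return self.fuse(x)


# Example: fuse 16x16 decoder features with 32x32 encoder features.
stage = DecoderStage(in_ch=192, skip_ch=96, out_ch=96)
dec = torch.randn(1, 192, 16, 16)  # deeper decoder features
enc = torch.randn(1, 96, 32, 32)   # same-resolution encoder features
out = stage(dec, enc)
print(tuple(out.shape))  # → (1, 96, 32, 32)
```

The channel sizes (192 → 96) mirror the halving that Patch Merging doubles on the encoder side, but any consistent values work.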
Cite this article
Zou, P., Wu, J.S.: SwinE-UNet3+: swin transformer encoder network for medical image segmentation. Prog. Artif. Intell. 12, 99–105 (2023). https://doi.org/10.1007/s13748-023-00300-1