DOI: 10.1145/3604078.3604106
research-article

Semi-supervised 3D Medical Image Segmentation Using Transformer and CNN

Published: 26 October 2023

ABSTRACT

Because labeled data are scarce in medical imaging, semi-supervised learning has become highly valued for image segmentation. Using unlabeled images effectively to guide segmentation is a key to achieving accurate results. In this paper, an uncertainty-aware, consistency-based segmentation method is proposed to fully utilize the power of Transformers and Convolutional Neural Networks (CNNs) in semi-supervised image segmentation. The proposed framework consists of a feature learning module, which is enhanced by a Transformer and a CNN, and a feature guidance module based on consistency perception. In the feature learning module, the Transformer serves as an encoder that extracts multi-scale features from labeled images, and the CNN serves as a decoder that restores the image dimensions. The feature guidance module learns the features of unlabeled images after data perturbation and builds a feature-guided model by averaging network weights. In this way, medical image segmentation based on consistency perception is implemented in a semi-supervised manner, and the proposed method achieves good performance on public benchmark datasets.
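The abstract gives no implementation details, so the following is only a minimal sketch of the two ideas it names: a guidance model built by averaging network weights, and an uncertainty-aware consistency term on unlabeled data. It assumes a mean-teacher-style exponential moving average (EMA) and an entropy threshold as the uncertainty estimate. The function names, the `alpha` and `threshold` parameters, and the use of flat Python lists in place of network tensors are all hypothetical and are not taken from the paper.

```python
import math


def ema_update(teacher, student, alpha=0.99):
    """Move each teacher weight toward the matching student weight (EMA)."""
    return [alpha * t + (1.0 - alpha) * s for t, s in zip(teacher, student)]


def entropy(probs, eps=1e-8):
    """Shannon entropy of one per-voxel class-probability vector."""
    return -sum(p * math.log(p + eps) for p in probs)


def masked_consistency(student_probs, teacher_probs, threshold):
    """Mean squared error over voxels whose teacher entropy is below threshold.

    Assumption: voxels where the teacher prediction is uncertain (high
    entropy) are excluded from the consistency term.
    """
    total, kept = 0.0, 0
    for s, t in zip(student_probs, teacher_probs):
        if entropy(t) < threshold:
            total += sum((si - ti) ** 2 for si, ti in zip(s, t)) / len(s)
            kept += 1
    return total / kept if kept else 0.0


# Toy usage: two "voxels", two classes; the second teacher prediction is
# maximally uncertain and is masked out at threshold 0.5.
teacher_weights = ema_update([0.0, 1.0], [1.0, 0.0], alpha=0.9)
loss = masked_consistency(
    student_probs=[[0.8, 0.2], [0.5, 0.5]],
    teacher_probs=[[0.9, 0.1], [0.5, 0.5]],
    threshold=0.5,
)
```

In a real training loop the EMA step would typically run after each optimizer update of the student, and the consistency term would be added to the supervised loss computed on the labeled images. How the paper weights or schedules these terms is not stated in the abstract.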


    • Published in

      ICDIP '23: Proceedings of the 15th International Conference on Digital Image Processing
      May 2023
      711 pages
      ISBN:9798400708237
      DOI:10.1145/3604078

      Copyright © 2023 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Qualifiers

      • research-article
      • Research
      • Refereed limited