ABSTRACT
Because labeled medical images are scarce, semi-supervised learning has attracted considerable attention in image segmentation. Using unlabeled images effectively to guide segmentation is a key issue in achieving accurate results. This paper proposes an uncertainty-aware, consistency-based segmentation method that combines the strengths of the Transformer and the convolutional neural network (CNN) for semi-supervised image segmentation. The proposed framework consists of a feature learning module built on a Transformer and a CNN, and a feature guidance module based on consistency perception. In the feature learning module, the Transformer serves as an encoder that extracts multi-scale features from labeled images, and the CNN serves as a decoder that restores the image dimensions. The feature guidance module learns features from unlabeled images after data perturbation, and builds a feature-guided model by averaging network weights. In this way, medical image segmentation based on consistency perception is carried out in a semi-supervised manner, and the proposed method achieves good performance on public benchmark datasets.
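The weight-averaging and consistency ideas mentioned in the abstract can be illustrated with a minimal sketch. It assumes a mean-teacher style exponential moving average (EMA) of student weights and a mean-squared-error consistency term between student and guidance-model predictions; the abstract does not state the exact update rule, the decay value `alpha`, or the loss, so all of these, along with the function names `ema_update` and `consistency_loss`, are illustrative assumptions rather than the authors' implementation. Uncertainty weighting is omitted.

```python
# Minimal sketch (assumptions: EMA weight averaging and an MSE consistency term;
# the names and the value of alpha are hypothetical, not taken from the paper).

def ema_update(teacher, student, alpha=0.99):
    """Return averaged weights: alpha * teacher + (1 - alpha) * student."""
    return {
        name: alpha * teacher[name] + (1.0 - alpha) * student[name]
        for name in teacher
    }


def consistency_loss(student_probs, teacher_probs):
    """Mean squared difference between two lists of predicted probabilities."""
    squared = [(s - t) ** 2 for s, t in zip(student_probs, teacher_probs)]
    return sum(squared) / len(squared)


# Toy weights standing in for network parameters.
teacher_weights = {"w": 0.0, "b": 1.0}
student_weights = {"w": 1.0, "b": 1.0}

# One averaging step moves the guidance model slightly toward the student.
teacher_weights = ema_update(teacher_weights, student_weights, alpha=0.9)

# Predictions for the same unlabeled pixel under two different perturbations.
loss = consistency_loss([0.8, 0.2], [0.6, 0.4])
```

In a setup like this, only the student would be trained by gradient descent; the guidance model changes solely through the averaging step, which tends to make its predictions on perturbed unlabeled images more stable targets.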
Index Terms
- Semi-supervised 3D Medical Image Segmentation Using Transformer and CNN