VoxSeP: semi-positive voxels assist self-supervised 3D medical segmentation

Yang, Zijie; Xie, Lingxi; Zhou, Wei; Huo, Xinyue; Wei, Longhui; Lu, Jian; Tian, Qi; Tang, Sheng

doi:10.1007/s00530-022-00977-9

Regular Paper
Published: 23 July 2022

Multimedia Systems, Volume 29, pages 33–48 (2023)

Zijie Yang¹,², Lingxi Xie³, Wei Zhou³, Xinyue Huo⁴, Longhui Wei³, Jian Lu⁵, Qi Tian³ & Sheng Tang¹,²,⁶

825 Accesses
6 Citations

Abstract

Medical image segmentation benefits from understanding 3D context, but 3D networks are prone to over-fitting because annotated data are limited. This paper investigates self-supervised pre-training, i.e., using unlabeled medical data to initialize 3D segmentation networks. We build our system upon contrastive learning, whose dependence on positive and negative samples prevents it from reaching satisfactory performance on medical image datasets with few samples. To alleviate this issue, we present a novel proxy task that exploits the similarity of human bodies across medical scans and defines sub-volumes taken from the same position in different cases as semi-positive samples. Pre-trained on a mixed dataset containing 1254 CT volumes, the proposed approach, VoxSeP, transfers well to 4 downstream datasets with 2 different backbone networks. Under both fully supervised and semi-supervised fine-tuning, VoxSeP achieves favorable average improvements ($2\%$ and $4\%$), surpassing several state-of-the-art counterparts.
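To make the semi-positive idea more concrete, here is a minimal NumPy sketch. It is not the authors' implementation, and the abstract does not specify the actual loss. The sketch assumes that scans are coarsely aligned, crops sub-volumes at the same relative position from two different cases, and evaluates an InfoNCE-style loss in which the cross-case pair counts as a down-weighted positive. The function names, the `semi_weight` factor, and the `temperature` value are hypothetical choices made for illustration only.

```python
import numpy as np


def crop_at_relative_position(volume, rel_center, size):
    """Crop a size^3 sub-volume centered at a relative (0..1) position.

    Relative coordinates let volumes of different shapes be cropped at
    roughly the same body location (assumption: scans are coarsely aligned).
    """
    starts = []
    for dim, rel in zip(volume.shape, rel_center):
        if dim < size:
            raise ValueError("volume is smaller than the requested crop")
        center = int(round(rel * (dim - 1)))
        # Clamp so the crop stays fully inside the volume.
        starts.append(min(max(center - size // 2, 0), dim - size))
    z, y, x = starts
    return volume[z:z + size, y:y + size, x:x + size]


def _unit(v):
    """L2-normalize along the last axis."""
    return v / (np.linalg.norm(v, axis=-1, keepdims=True) + 1e-12)


def info_nce_with_semi_positive(query, positive, semi_positive, negatives,
                                temperature=0.2, semi_weight=0.5):
    """InfoNCE-style loss with one extra, down-weighted semi-positive.

    query, positive, semi_positive: 1D embeddings of shape (d,).
    negatives: 2D array of shape (n, d).
    semi_weight is a hypothetical knob, not a value from the paper.
    """
    q, pos, semi = _unit(query), _unit(positive), _unit(semi_positive)
    neg = _unit(negatives)
    # Cosine similarities of the query to every candidate, scaled.
    logits = np.concatenate(([q @ pos, q @ semi], neg @ q)) / temperature
    # Numerically stable log-softmax over all candidates.
    shifted = logits - logits.max()
    log_prob = shifted - np.log(np.exp(shifted).sum())
    # Full weight on the true positive, partial weight on the semi-positive.
    loss = -(log_prob[0] + semi_weight * log_prob[1]) / (1.0 + semi_weight)
    return float(loss)


# Two synthetic "cases" with different shapes, cropped at the same position.
rng = np.random.default_rng(0)
case_a = rng.normal(size=(40, 64, 64))
case_b = rng.normal(size=(50, 80, 72))
rel = (0.5, 0.25, 0.75)
patch_a = crop_at_relative_position(case_a, rel, 16)
patch_b = crop_at_relative_position(case_b, rel, 16)
print(patch_a.shape, patch_b.shape)
```

In a real pipeline, an encoder network would map `patch_a`, an augmented view of it, and `patch_b` to embeddings before the loss is applied. The sketch leaves that step out so that the sampling rule and the weighting stay easy to see.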

Notes

We use VoxSeP to represent the whole self-supervised pre-training method in Tables 1 to 4, while in Table 5, it refers to the VoxSeP pretext task ${\mathcal {T}}_\mathrm {VoxSeP}$ upon semi-positive contrastive pairs.

Acknowledgements

The research work is supported by the National Natural Science Foundation of China (61871004) and the Project of Chinese Academy of Sciences (E141020).

Author information

Authors and Affiliations

The Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, 100190, China
Zijie Yang & Sheng Tang
The University of Chinese Academy of Sciences, Beijing, 100049, China
Zijie Yang & Sheng Tang
Huawei Cloud, Beijing, China
Lingxi Xie, Wei Zhou, Longhui Wei & Qi Tian
University of Science and Technology of China, Hefei, China
Xinyue Huo
Department of Urology, Peking University Third Hospital, Beijing, China
Jian Lu
Research Institute of Intelligent Computing, Zhejiang Lab, Hangzhou, China
Sheng Tang

Contributions

ZY, LX, WZ, and XH wrote the main manuscript text and prepared the figures and tables. ZY and WZ conducted the fully supervised fine-tuning experiments; ZY and XH conducted the semi-supervised fine-tuning experiments; ZY conducted the ablation study and visualizations. LW helped ZY with the re-implementation of comparison methods. LX revised the manuscript. JL, QT, and ST provided guidance and instruction on the idea and methods. All authors reviewed the manuscript.

Corresponding author

Correspondence to Sheng Tang.

Ethics declarations

Conflict of interest

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Yang, Z., Xie, L., Zhou, W. et al. VoxSeP: semi-positive voxels assist self-supervised 3D medical segmentation. Multimedia Systems 29, 33–48 (2023). https://doi.org/10.1007/s00530-022-00977-9

Received: 23 March 2022
Accepted: 28 June 2022
Published: 23 July 2022
Issue Date: February 2023
DOI: https://doi.org/10.1007/s00530-022-00977-9
