Fuzzy-based cross-image pixel contrastive learning for compact medical image segmentation

Published in: Multimedia Tools and Applications

Abstract

Existing medical image segmentation methods ignore the inter-class similarity and intra-class variability of pixel semantics, and instead pursue deeper and more complex networks for performance gains, leading to insufficient pixel relationship modeling and high computational cost. To overcome these limitations, we propose a novel fuzzy-based cross-image pixel contrastive (CPC) learning regime that exploits discriminative relationships between pixel representations across images globally. CPC pulls each lesion pixel closer to other lesion pixels while pushing it away from background pixels in the representation space, driving the network to discriminate pixel semantics more robustly. Instead of computing or storing all samples, we devise a fuzzy filtering strategy that selects the Top-K samples according to fuzzy membership. Furthermore, considering the speed requirements of medical image segmentation, we propose a compact yet efficient network for rapid and precise segmentation, which models both local and long-range dependencies by fusing Transformer and convolution at a fine granularity. Benefiting from the efficient design of this hybrid module, the proposed network is compact, lightweight, and powerful. We term the efficient hybrid network trained with cross-image pixel contrastive learning CPCNet. Extensive qualitative and quantitative experiments on various image segmentation tasks demonstrate that CPCNet surpasses state-of-the-art approaches.
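To make the cross-image pixel contrast and the fuzzy Top-K filtering described above concrete, the following is a minimal PyTorch sketch of such a loss. It is not the authors' released implementation; the function name `fuzzy_topk_pixel_contrast`, the binary lesion/background labels, and the per-pixel fuzzy membership scores are illustrative assumptions.

```python
# Minimal sketch (not the authors' code) of cross-image pixel contrastive
# learning with fuzzy Top-K sample filtering. Assumes pixel embeddings,
# binary lesion/background labels, and a fuzzy membership score per pixel.
import torch
import torch.nn.functional as F

def fuzzy_topk_pixel_contrast(emb, labels, membership, k=64, tau=0.07):
    """InfoNCE-style pixel contrast: pull same-class pixels together and
    push different-class pixels apart, keeping only the Top-K candidates
    ranked by fuzzy membership instead of storing all pixels."""
    emb = F.normalize(emb, dim=1)                      # (N, D), unit vectors
    losses = []
    for i in range(emb.size(0)):
        pos_mask = (labels == labels[i]).clone()
        pos_mask[i] = False                            # exclude the anchor itself
        neg_mask = labels != labels[i]
        if pos_mask.sum() == 0 or neg_mask.sum() == 0:
            continue
        pos_idx = torch.nonzero(pos_mask).squeeze(1)
        neg_idx = torch.nonzero(neg_mask).squeeze(1)
        # Fuzzy filtering: keep the K most reliable positives / negatives.
        pos_idx = pos_idx[membership[pos_idx].topk(min(k, pos_idx.numel())).indices]
        neg_idx = neg_idx[membership[neg_idx].topk(min(k, neg_idx.numel())).indices]
        pos_sim = emb[pos_idx] @ emb[i] / tau          # (P,) anchor-positive similarities
        neg_sim = emb[neg_idx] @ emb[i] / tau          # (Q,) anchor-negative similarities
        neg_sum = neg_sim.exp().sum()
        # Each positive is contrasted against all selected negatives.
        loss_i = -torch.log(pos_sim.exp() / (pos_sim.exp() + neg_sum)).mean()
        losses.append(loss_i)
    return torch.stack(losses).mean() if losses else emb.new_zeros(())

# Toy usage: 256 pixels, 32-D embeddings, binary labels, random memberships.
emb = torch.randn(256, 32)
labels = torch.randint(0, 2, (256,))
membership = torch.rand(256)
loss = fuzzy_topk_pixel_contrast(emb, labels, membership)
```

In practice the pixels would be sampled across images in a mini-batch (or drawn from a memory bank), so that the contrast is cross-image rather than confined to a single sample.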



Funding

This work was supported in part by the National Key Research and Development Program of China (2021YFA1000102), in part by grants from the National Natural Science Foundation of China (Nos. 62376285, 62272375, 61673396), and in part by the Natural Science Foundation of Shandong Province, China (No. ZR2022MF260).

Author information

Corresponding author

Correspondence to Mingwen Shao.

Ethics declarations

Competing interests

The authors have no competing interests to declare that are relevant to the content of this article.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Wan, Y., Shao, M., Cheng, Y. et al. Fuzzy-based cross-image pixel contrastive learning for compact medical image segmentation. Multimed Tools Appl 83, 30377–30397 (2024). https://doi.org/10.1007/s11042-023-16611-3
