Abstract
Existing medical image segmentation ignore the exploration of inter-class similarity and intra-class variability in pixel semantics, and aim to develop deeper and more complex networks for strength enhancement, leading to insufficient pixel relationship modeling high computational cost. To overcome the aforementioned limitation, we propose a novel fuzzy-based cross-image pixel contrastive learning regime to exploit discriminative relationships between pixel representations across images globally. CPC ensures that the lesion pixel is pulled closer to other lesion pixels while pushed far away from the background pixels in the representation space, thus driving the network to discriminate pixel semantics more robustly. Instead of computing or storing all samples, we devise a fuzzy filtering strategy that selects Top-K samples based on fuzzy membership. Furthermore, considering the speed requirement of medical image segmentation, we propose a compact but efficient network for rapid and precise segmentation, which can model both local and long-range dependencies by microscopically fusing Transformer and convolution. Benefitted from our efficient design of the hybrid module, the proposed network enjoys the properties of being compact, lightweight, and powerful. We term our efficient hybrid network with cross-image pixel contrastive learning as CPCNet. Extensive qualitative and quantitative experiments on various image segmentation tasks demonstrate that our CPCNet surpasses the state-of-the-art approaches.
Similar content being viewed by others
References
Saez A, Serrano C, Acha B (2014) Model-based classification methods of global patterns in dermoscopic images. IEEE Trans Med Imaging 33(5):1137–1147
Ronneberger O, Fischer P, Brox T (2015) U-net: Convolutional networks for biomedical image segmentation. In: International Conference on Medical Image Computing and Computer-assisted Intervention. Springer, p 234-241
Chen J, Lu Y, Yu Q, Luo X, Adeli E, Wang Y, Lu L, Yuille AL, Zhou Y (2021) Transunet: Transformers make strong encoders for medical image segmentation. arXiv:2102.04306
Zhang Y, Liu H, Hu Q(2021) Transfuse: Fusing transformers and cnns for medical image segmentation. In: International Conference on Medical Image Computing and Computer-Assisted Intervention, Springer p 14–24
Liu Y, Zhou J, Liu L, Zhan Z, Hu Y, Fu YQ, Duan H (2022) Fcp-net: A feature-compression-pyramid network guided by game-theoretic interactions for medical image segmentation. IEEE Trans Med Imaging
Aminian M, Khotanlou H (2022) Capsnet-based brain tumor segmentation in multimodal mri images using inhomogeneous voxels in del vector domain. Multimed Tools Appl 81(13):17793–17815
Liu J, Wei X, Li L (2020) Mr image segmentation based on level set method. Multimed Tools Appl 79:11487–11502
Lv T, Yang G, Zhang Y, Yang J, Chen Y, Shu H, Luo L (2019) Vessel segmentation using centerline constrained level set method. Multimed Tools Appl 78:17051–17075
Arora T, Dhir R (2019) A variable region scalable fitting energy approach for human metaspread chromosome image segmentation. Multimed Tools Appl 78:9383–9404
Oktay O, Schlemper J, Folgoc LL, Lee M, Heinrich M, Misawa K, Mori K, McDonagh S, Hammerla NY, Kainz B, et al (2018) Attention u-net: Learning where to look for the pancreas. arXiv:1804.03999
Jha D, Smedsrud PH, Riegler MA, Johansen D, De Lange T, Halvorsen P, Johansen HD (2019) Resunet++: An advanced architecture for medical image segmentation. In: 2019 IEEE International Symposium on Multimedia (ISM). IEEE, p 225–2255
Wang X, Girshick R, Gupta A, He K(2018) Non-local neural networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, p 7794–7803
Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser L, Polosukhin I (2017) Attention is all you need. Adv Neural Inf Process Syst 30
Dosovitskiy A, Beyer L, Kolesnikov A, Weissenborn D, Zhai X, Unterthiner T, Dehghani M, Minderer M, Heigold G, Gelly S, et al (2020) An image is worth 16x16 words: Transformers for image recognition at scale. arXiv:2010.11929
Zheng S, Lu J, Zhao H, Zhu X, Luo Z, Wang Y, Fu Y, Feng J, Xiang T, Torr PH, et al.(2021) Rethinking semantic segmentation from a sequence-to-sequence perspective with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, p 6881–6890
Gao Y, Zhou M, Metaxas DN(2021) Utnet: a hybrid transformer architecture for medical image segmentation. In: International Conference on Medical Image Computing and Computer-Assisted Intervention, Springer, p 61–71
Wang W, Chen C, Ding M, Yu H, Zha S, Li J (2021) Transbts: Multimodal brain tumor segmentation using transformer. In: International Conference on Medical Image Computing and Computer-Assisted Intervention, Springer, p 109–1119
Long J, Shelhamer E, Darrell T (2015) Fully convolutional networks for semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, p 3431–3440
Zamir SW, Arora A, Khan S, Hayat M, Khan FS, Yang M-H, Shao L (2021) Multi-stage progressive image restoration. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, p 14821–14831
Gao Y, Zhou M, Liu D, Metaxas D (2022) A multi-scale transformer for medical image segmentation: Architectures, model efficiency, and benchmarks. arXiv:2203.00131
Zhang K, Li Y, Liang J, Cao J, Zhang Y, Tang H, Timofte R, Van Gool L (2022) Practical blind denoising via swin-conv-unet and data synthesis. arXiv:2203.13278
Xie Y, Zhang J, Xia Y, Shen C (2020) A mutual bootstrapping model for automated skin lesion segmentation and classification. IEEE Trans Med Imaging 39(7):2482–2493
Asadi-Aghbolaghi M, Azad R, Fathy M, Escalera S (2020) Multi-level context gating of embedded collective knowledge for medical image segmentation. arXiv:2003.05056
Srinivasu PN, Rao TS, Balas VE (2020) A systematic approach for identification of tumor regions in the human brain through haris algorithm.In: Deep Learning Techniques for Biomedical and Health Informatics. Elsevier, p 97–118
Srinivasu PN, Balas VE (2021) Self-learning network-based segmentation for real-time brain mr images through haris. PeerJ Computer Science 7:654
Van den Oord A, Li Y, Vinyals O (2018) Representation learning with contrastive predictive coding. 1807
Hjelm RD, Fedorov A, Lavoie-Marchildon S, Grewal K, Bachman P, Trischler A, Bengio Y (2018) Learning deep representations by mutual information estimation and maximization. arXiv:1808.06670
Wu Z, Xiong Y, Yu SX, Lin D (2018) Unsupervised feature learning via non-parametric instance discrimination. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 3733–3742
Chen T, Kornblith S, Norouzi M, Hinton G (2020) A simple framework for contrastive learning of visual representations. In: International Conference on Machine Learning. PMLR, p 1597–1607
Larsson G, Maire M, Shakhnarovich G (2016) Learning representations for automatic colorization. In: European Conference on Computer Vision. Springer, p 577–593
Komodakis N, Gidaris S (2018) Unsupervised representation learning by predicting image rotations. In: International Conference on Learning Representations (ICLR)
Doersch C, Gupta A, Efros AA (2015) Unsupervised visual representation learning by context prediction. In: Proceedings of the IEEE International Conference on Computer Vision. p 1422–1430
Noroozi M, Favaro P (2016) Unsupervised learning of visual representations by solving jigsaw puzzles. In: European Conference on Computer Vision. Springer, p 69–84
Caron M, Misra I, Mairal J, Goyal P, Bojanowski P, Joulin A (2020) Unsupervised learning of visual features by contrasting cluster assignments. Adv Neural Inf Process Syst 33:9912–9924
He K, Fan H, Wu Y, Xie S, Girshick R (2020) Momentum contrast for unsupervised visual representation learning. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, p 9729–9738
Khosla P, Teterwak P, Wang C, Sarna A, Tian Y, Isola P, Maschinot A, Liu C, Krishnan D (2020) Supervised contrastive learning. Adv Neural Inf Process Syst 33:18661–18673
Robinson J, Chuang C-Y, Sra S, Jegelka S (2020) Contrastive learning with hard negative samples. arXiv:2010.04592
Kalantidis Y, Sariyildiz MB, Pion N, Weinzaepfel P, Larlus D (2020) Hard negative mixing for contrastive learning. Adv Neural Inf Process Syst 33:21798–21809
Chen X, Fan H, Girshick R, He K (2020) Improved baselines with momentum contrastive learning. arXiv:2003.04297
Xie Z, Lin Y, Zhang Z, Cao Y, Lin S, Hu H (2021) Propagate yourself: Exploring pixel-level consistency for unsupervised visual representation learning. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. p 16684–16693
Chaitanya K, Erdil E, Karani N, Konukoglu E (2020) Contrastive learning of global and local features for medical image segmentation with limited annotations. Adv Neural Inf Process Syst 33:12546–12558
Wang X, Zhang R, Shen C, Kong T, Li L (2021) Dense contrastive learning for self-supervised visual pre-training. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. p 3024–3033
Hendrycks D, Gimpel K (2016) Gaussian error linear units (gelus). arXiv:1606.08415
Hu J, Shen L, Sun G (2018) Squeeze-and-excitation networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. p 7132–7141
He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. p 770–778
Dosovitskiy A, Springenberg JT, Riedmiller M, Brox T (2014) Discriminative unsupervised feature learning with convolutional neural networks. Adv Neural Inf Process Syst 27
Bachman P, Hjelm RD, Buchwalter W (2019) Learning representations by maximizing mutual information across views. Adv Neural Inf Process Syst 32
Wang W, Zhou T, Yu F, Dai J, Konukoglu E, Van Gool L (2021) Exploring cross-image pixel contrast for semantic segmentation. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, p 7303–7313
Bucher M, Herbin S, Jurie F (2016) Hard negative mining for metric learning based zero-shot classification. In: European Conference on Computer Vision. Springer, p 524–531
Jha D, Smedsrud PH, Riegler MA, Halvorsen P, Lange Td, Johansen D, Johansen HD (2020) Kvasir-seg: A segmented polyp dataset. In: International Conference on Multimedia Modeling. Springer, p 451–462
Silva J, Histace A, Romain O, Dray X, Granado B (2014) Toward embedded detection of polyps in wce images for early diagnosis of colorectal cancer. Int J Comput Assist Radiol Surg 9(2):283–293
Vázquez D, Bernal J, Sánchez FJ, Fernández-Esparrach G, López AM, Romero A, Drozdzal M, Courville A (2017) A benchmark for endoluminal scene segmentation of colonoscopy images. J Healthc Eng 2017
Tajbakhsh N, Gurudu SR, Liang J (2015) Automated polyp detection in colonoscopy videos using shape and context information. IEEE Trans Med Imaging 35(2):630–44
Bernal J, Sánchez FJ, Fernández-Esparrach G, Gil D, Rodríguez C, Vilariño F (2015) Wm-dova maps for accurate polyp highlighting in colonoscopy: Validation vs. saliency maps from physicians. Comput Med Imaging Graph 43:99–111
Fan D-P, Ji G-P, Zhou T, Chen G, Fu H, Shen J, Shao L (2020) Pranet: Parallel reverse attention network for polyp segmentation. In: International Conference on Medical Image Computing and Computerassisted Intervention. Springer, p 263–273
Huang C-H, Wu H-Y, Lin Y-L (2021) Hardnet-mseg: a simple encoderdecoder polyp segmentation neural network that achieves over 0.9 mean dice and 86 fps. arXiv:2101.07172
Codella N, Gutman D, Celebi ME, Helba B, Marchetti MA, Dusza SW, Kalloo A, Liopyris K, Mishra N, Kittler H(2017) Skin lesion analysis toward melanoma detection: A challenge at the 2017 international symposium on biomedical imaging (isbi), hosted by the international skin imaging collaboration (isic)
Codella N, Rotemberg V, Tschandl P, Celebi ME, Dusza S, Gutman D, Helba B, Kalloo A, Liopyris K, Marchetti M, et al.(2019) Skin lesion analysis toward melanoma detection 2018: A challenge hosted by the international skin imaging collaboration (isic). arXiv:1902.03368
Mendonça T, Ferreira PM, Marques JS, Marcal AR, Rozeira J (2013) Ph 2-a dermoscopic image database for research and benchmarking. In: 2013 35th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC). IEEE, p 5437–5440
Al-Dhabyani W, Gomaa M, Khaled H, Fahmy A (2020) Dataset of breast ultrasound images. Data Brief 28:104863
Deng J, Dong W, Socher R, Li L-J, Li K, Fei-Fei L(2009) Imagenet: A large-scale hierarchical image database. In: 2009 IEEE Conference on Computer Vision and Pattern Recognition. Ieee, p 248–255
Zhou Z, Siddiquee MMR, Tajbakhsh N, Liang J (2019) Unet++: Redesigning skip connections to exploit multiscale features in image segmentation. IEEE Trans Med Imaging 39(6):1856–1867
Fang Y, Chen C, Yuan Y, Tong K-Y (2019) Selective feature aggregation network with area-boundary constraints for polyp segmentation. In: International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, p 302–310
Chen L-C, Zhu Y, Papandreou G, Schroff F, Adam H (2018) Encoderdecoder with atrous separable convolution for semantic image segmentation. In: Proceedings of the European Conference on Computer Vision (ECCV), p 801–818
Yang Z, Farsiu S(2023) Directional connectivity-based segmentation of medical images. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, p 11525–11535
Valanarasu JMJ, Oza P, Hacihaliloglu I, Patel VM (2021) Medical transformer: Gated axial-attention for medical image segmentation. In: International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, p 36–46
Lei B, Xia Z, Jiang F, Jiang X, Ge Z, Xu Y, Qin J, Chen S, Wang T, Wang S (2020) Skin lesion segmentation via generative adversarial networks with dual discriminators. Med Image Anal 64:101716
Fu J, Liu J, Tian H, Li Y, Bao Y, Fang Z, Lu H(2019) Dual attention network for scene segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, p 3146–3154
Gu Z, Cheng J, Fu H, Zhou K, Hao H, Zhao Y, Zhang T, Gao S, Liu J (2019) Ce-net: Context encoder network for 2d medical image segmentation. IEEE transactions on medical imaging 38(10):2281–2292
Funding
This work was supported by National Key Research and development Program of China (2021YFA1000102), and in part by the grants from the National Natural Science Foundation of China (Nos. 62376285, 62272375, 61673396), Natural Science Foundation of Shandong Province, China (No. ZR2022MF260).
Author information
Authors and Affiliations
Corresponding author
Ethics declarations
Competing interests
The authors have no competing interests to declare that are relevant to the content of this article.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Wan, Y., Shao, M., Cheng, Y. et al. Fuzzy-based cross-image pixel contrastive learning for compact medical image segmentation. Multimed Tools Appl 83, 30377–30397 (2024). https://doi.org/10.1007/s11042-023-16611-3
Received:
Revised:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s11042-023-16611-3