
Segmentation-based context-aware enhancement network for medical images

  • Original Article
  • Published in International Journal of Machine Learning and Cybernetics

Abstract

Automatic medical image segmentation plays a pivotal role in clinical diagnosis. Over the past decades, medical image segmentation has improved remarkably with the aid of convolutional neural networks (CNNs). However, extracting context information and disease features for dense segmentation remains challenging because of the low contrast between lesions and the background in medical images. To address this issue, we propose a novel enhanced feature fusion scheme. First, we develop a global feature enhancement module, which captures long-range global dependencies in the spatial domain and enhances global feature learning. Second, we propose a channel fusion attention module to extract multi-scale context information and alleviate the incoherence of semantic information among features at different scales. We then combine these two schemes to produce richer context information and enhance feature contrast. In addition, we remove the decoder with its progressive deconvolution operations from classical U-shaped networks and use only the features of the last three layers to generate predictions. We conduct extensive experiments on three public datasets: the polyp segmentation dataset, the ISIC-2018 dataset, and the Synapse multi-organ segmentation dataset. The experimental results demonstrate the superior performance and robustness of our method in comparison with state-of-the-art methods.
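The decoder-free design summarized above can be illustrated with a minimal sketch. Note this is not the paper's exact implementation: the stage shapes, nearest-neighbour upsampling, and the 1×1 prediction head are illustrative assumptions. The idea is that the last three encoder feature maps are brought to a common resolution, concatenated, and projected to per-class logits without a progressive deconvolution decoder.

```python
import numpy as np

def upsample_nearest(x, factor):
    """Nearest-neighbour upsampling for a (C, H, W) feature map."""
    return x.repeat(factor, axis=1).repeat(factor, axis=2)

def conv1x1(x, weight):
    """1x1 convolution: a per-pixel linear map over channels.
    x: (C_in, H, W), weight: (C_out, C_in) -> (C_out, H, W)."""
    c_in, h, w = x.shape
    return (weight @ x.reshape(c_in, h * w)).reshape(-1, h, w)

rng = np.random.default_rng(0)
# Hypothetical last three encoder stages at strides 8, 16, 32 of a 256x256 input.
f3 = rng.standard_normal((256, 32, 32))
f4 = rng.standard_normal((512, 16, 16))
f5 = rng.standard_normal((1024, 8, 8))

# Upsample the deeper stages to the stride-8 resolution and concatenate.
fused = np.concatenate([f3, upsample_nearest(f4, 2), upsample_nearest(f5, 4)], axis=0)

# Project the fused features to per-class logits with a 1x1 convolution.
num_classes = 2
w_head = rng.standard_normal((num_classes, fused.shape[0])) * 0.01
logits = conv1x1(fused, w_head)
print(logits.shape)  # (2, 32, 32); a final upsample restores input resolution
```

In practice the fusion would also pass through the paper's feature-enhancement modules; this sketch only shows why no stage-by-stage deconvolution decoder is required.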


Data availability

In our work, the datasets used are publicly available: the ISIC-2018 dataset (https://challenge2018.isic-archive.com/), the Synapse multi-organ segmentation dataset (https://www.synapse.org/#!Synapse:syn3193805/wiki/217789), and the polyp segmentation benchmark, which comprises five datasets, i.e. ETIS [58], Kvasir-SEG [59], EndoScene [60], CVC-ColonDB [61], and CVC-ClinicDB [62], as described in Sect. 4.1.

References

  1. Richhariya B, Tanveer M, Rashid AH, Initiative ADN et al (2020) Diagnosis of Alzheimer’s disease using Universum support vector machine based recursive feature elimination (usvm-rfe). Biomed Sig Proc Control 59:101903


  2. Tanveer M, Rashid AH, Ganaie M, Reza M, Razzak I, Hua K-L (2021) Classification of Alzheimer’s disease using ensemble of deep neural networks trained through transfer learning. IEEE J Biomed Health Inform 26(4):1453–1463


  3. Beheshti I, Ganaie M, Paliwal V, Rastogi A, Razzak I, Tanveer M (2021) Predicting brain age using machine learning algorithms: A comprehensive evaluation. IEEE J Biomed Health Inf 26(4):1432–1440


  4. Ning Z, Zhong S, Feng Q, Chen W, Zhang Y (2021) Smu-net: saliency-guided morphology-aware u-net for breast lesion segmentation in ultrasound image. IEEE Transact Med Imaging 41(2):476–490


  5. Wang G, Liu X, Li C, Xu Z, Ruan J, Zhu H, Meng T, Li K, Huang N, Zhang S (2020) A noise-robust framework for automatic segmentation of covid-19 pneumonia lesions from ct images. IEEE Transact Med Imag 39(8):2653–2663


  6. Chu X, Yang W, Ouyang W, Ma C, Yuille AL, Wang X (2017) Multi-context attention for human pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1831–1840

  7. Huang Z, Zhong Z, Sun L, Huo Q (2019) Mask r-cnn with pyramid attention network for scene text detection. In: 2019 IEEE Winter Conference on Applications of Computer Vision (WACV), pp. 764–772. IEEE

  8. Gupta A, Agrawal D, Chauhan H, Dolz J, Pedersoli M (2018) An attention model for group-level emotion recognition. In: Proceedings of the 20th ACM International Conference on Multimodal Interaction, pp. 611–615

  9. Liu J, Zhou W, Cui Y, Yu L, Luo T (2022) Gcnet: Grid-like context-aware network for rgb-thermal semantic segmentation. Neurocomputing 506:60–67


  10. Ronneberger O, Fischer P, Brox T (2015) U-net: Convolutional networks for biomedical image segmentation. In: International Conference on Medical Image Computing and Computer-assisted Intervention, pp. 234–241. Springer

  11. Çiçek Ö, Abdulkadir A, Lienkamp SS, Brox T, Ronneberger O (2016) 3d u-net: learning dense volumetric segmentation from sparse annotation. In: International Conference on Medical Image Computing and Computer-assisted Intervention, pp. 424–432. Springer

  12. Zhou Z, Rahman Siddiquee MM, Tajbakhsh N, Liang J (2018) Unet++: A nested u-net architecture for medical image segmentation. Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support. Springer, pp 3–11


  13. Isensee F, Jaeger PF, Kohl SA, Petersen J, Maier-Hein KH (2021) nnu-net: a self-configuring method for deep learning-based biomedical image segmentation. Nat Methods 18(2):203–211


  14. Cheng Z, Li Y, Chen H, Zhang Z, Pan P, Cheng L (2022) Dsgmffn: Deepest semantically guided multi-scale feature fusion network for automated lesion segmentation in abus images. Comput Methods Programs Biomed, 106891

  15. Cao F, Gao C, Ye H (2022) A novel method for image segmentation: two-stage decoding network with boundary attention. Int J Mach Learn Cybernet 13(5):1461–1473


  16. Song K, Zhao Z, Wang J, Qiang Y, Zhao J, Zia MB (2022) Segmentation-based multi-scale attention model for kras mutation prediction in rectal cancer. Int J Mach Learn Cybernet 13(5):1283–1299


  17. Huang H, Lin L, Tong R, Hu H, Zhang Q, Iwamoto Y, Han X, Chen Y-W, Wu J (2020) Unet 3+: A full-scale connected unet for medical image segmentation. In: ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1055–1059. IEEE

  18. Xiao X, Lian S, Luo Z, Li S (2018) Weighted res-unet for high-quality retina vessel segmentation. In: 2018 9th International Conference on Information Technology in Medicine and Education (ITME), pp. 327–331. IEEE

  19. Li X, Chen H, Qi X, Dou Q, Fu C-W, Heng P-A (2018) H-denseunet: hybrid densely connected unet for liver and tumor segmentation from ct volumes. IEEE Transact Med Imag 37(12):2663–2674


  20. Li S, Liu J, Song Z (2022) Brain tumor segmentation based on region of interest-aided localization and segmentation u-net. Int J Mach Learn Cybernet, 1–11

  21. Wang X, Girshick R, Gupta A, He K (2018) Non-local neural networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7794–7803

  22. Huang Z, Wang X, Huang L, Huang C, Wei Y, Liu W (2019) Ccnet: Criss-cross attention for semantic segmentation. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 603–612

  23. Hou Q, Zhou D, Feng J (2021) Coordinate attention for efficient mobile network design. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13713–13722

  24. Sinha A, Dolz J (2020) Multi-scale self-guided attention for medical image segmentation. IEEE J Biomed Health Inform 25(1):121–130


  25. Fan D-P, Ji G-P, Zhou T, Chen G, Fu H, Shen J, Shao L (2020) Pranet: Parallel reverse attention network for polyp segmentation. International Conference on Medical Image Computing and Computer-assisted Intervention. Springer, pp 263–273


  26. Yao C, Tang J, Hu M, Wu Y, Guo W, Li Q, Zhang X-P (2021) Claw u-net: a unet variant network with deep feature concatenation for scleral blood vessel segmentation. In: Artificial Intelligence: First CAAI International Conference, CICAI 2021, Hangzhou, China, June 5–6, 2021, Proceedings, Part II 1, pp. 67–78. Springer

  27. Gu Z, Cheng J, Fu H, Zhou K, Hao H, Zhao Y, Zhang T, Gao S, Liu J (2019) Ce-net: Context encoder network for 2d medical image segmentation. IEEE Transact Med Imag 38(10):2281–2292


  28. Feng S, Zhao H, Shi F, Cheng X, Wang M, Ma Y, Xiang D, Zhu W, Chen X (2020) Cpfnet: Context pyramid fusion network for medical image segmentation. IEEE Transact Med Imag 39(10):3008–3018


  29. Zhao H, Shi J, Qi X, Wang X, Jia J (2017) Pyramid scene parsing network. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2881–2890

  30. Ni J, Wu J, Tong J, Chen Z, Zhao J (2020) Gc-net: Global context network for medical image segmentation. Comput Methods Programs Biomed 190:105121


  31. Chen J, Lu Y, Yu Q, Luo X, Adeli E, Wang Y, Lu L, Yuille AL, Zhou Y (2021) Transunet: Transformers make strong encoders for medical image segmentation. arXiv preprint arXiv:2102.04306

  32. Wang W, Chen C, Ding M, Yu H, Zha S, Li J (2021) Transbts: Multimodal brain tumor segmentation using transformer. In: International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 109–119. Springer

  33. Gao Y, Zhou M, Metaxas DN (2021) Utnet: a hybrid transformer architecture for medical image segmentation. In: Medical Image Computing and Computer Assisted Intervention–MICCAI 2021: 24th International Conference, Strasbourg, France, September 27–October 1, 2021, Proceedings, Part III 24, pp. 61–71. Springer

  34. Wang H, Cao P, Wang J, Zaiane OR (2022) Uctransnet: rethinking the skip connections in u-net from a channel-wise perspective with transformer. Proc. AAAI Conf Artif Intell 36:2441–2449


  35. Wang J, Wei L, Wang L, Zhou Q, Zhu L, Qin J (2021) Boundary-aware transformers for skin lesion segmentation. In: International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 206–216. Springer

  36. Ji Y, Zhang R, Wang H, Li Z, Wu L, Zhang S, Luo P (2021) Multi-compound transformer for accurate biomedical image segmentation. In: Medical Image Computing and Computer Assisted Intervention–MICCAI 2021: 24th International Conference, Strasbourg, France, September 27–October 1, 2021, Proceedings, Part I 24, pp. 326–336. Springer

  37. Cao H, Wang Y, Chen J, Jiang D, Zhang X, Tian Q, Wang M (2021) Swin-unet: Unet-like pure transformer for medical image segmentation. arXiv preprint arXiv:2105.05537

  38. Liu Z, Lin Y, Cao Y, Hu H, Wei Y, Zhang Z, Lin S, Guo B (2021) Swin transformer: Hierarchical vision transformer using shifted windows. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 10012–10022

  39. Lin A, Chen B, Xu J, Zhang Z, Lu G (2021) Ds-transunet: Dual swin transformer u-net for medical image segmentation. arXiv preprint arXiv:2106.06716

  40. Huang X, Deng Z, Li D, Yuan X (2021) Missformer: An effective medical image segmentation transformer. arXiv preprint arXiv:2109.07162

  41. Long J, Shelhamer E, Darrell T (2015) Fully convolutional networks for semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3431–3440

  42. Chen L-C, Papandreou G, Kokkinos I, Murphy K, Yuille AL (2017) Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs. IEEE Transact Patte Anal Mach Intell 40(4):834–848


  43. Milletari F, Navab N, Ahmadi S-A (2016) V-net: Fully convolutional neural networks for volumetric medical image segmentation. In: 2016 Fourth International Conference on 3D Vision (3DV), pp. 565–571. IEEE

  44. Valanarasu JMJ, Sindagi VA, Hacihaliloglu I, Patel VM (2020) Kiu-net: Towards accurate segmentation of biomedical images using over-complete representations. In: International Conference on Medical Image Computing and Computer-assisted Intervention, pp. 363–373. Springer

  45. Jha D, Riegler MA, Johansen D, Halvorsen P, Johansen HD (2020) Doubleu-net: A deep convolutional neural network for medical image segmentation. In: 2020 IEEE 33rd International Symposium on Computer-based Medical Systems (CBMS), pp. 558–564. IEEE

  46. Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser Ł, Polosukhin I (2017) Attention is all you need. Advances in Neural Information Processing Systems 30

  47. Dosovitskiy A, Beyer L, Kolesnikov A, Weissenborn D, Zhai X, Unterthiner T, Dehghani M, Minderer M, Heigold G, Gelly S, et al. (2020) An image is worth 16x16 words: Transformers for image recognition at scale. arXiv preprint arXiv:2010.11929

  48. Zheng S, Lu J, Zhao H, Zhu X, Luo Z, Wang Y, Fu Y, Feng J, Xiang T, Torr PH, et al. (2021) Rethinking semantic segmentation from a sequence-to-sequence perspective with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 6881–6890

  49. Carion N, Massa F, Synnaeve G, Usunier N, Kirillov A, Zagoruyko S (2020) End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229. Springer

  50. Touvron H, Cord M, Douze M, Massa F, Sablayrolles A, Jégou H (2021) Training data-efficient image transformers & distillation through attention. In: International Conference on Machine Learning, pp. 10347–10357. PMLR

  51. Wu H, Chen S, Chen G, Wang W, Lei B, Wen Z (2022) Fat-net: feature adaptive transformers for automated skin lesion segmentation. Med Image Anal 76:102327


  52. Xue Y, Xu T, Zhang H, Long LR, Huang X (2018) Segan: adversarial network with multi-scale l1 loss for medical image segmentation. Neuroinformatics 16(3):383–392


  53. Wang R, Chen S, Ji C, Fan J, Li Y (2022) Boundary-aware context neural network for medical image segmentation. Med Image Anal 78:102395


  54. Huang C-H, Wu H-Y, Lin Y-L (2021) Hardnet-mseg: a simple encoder-decoder polyp segmentation neural network that achieves over 0.9 mean dice and 86 fps. arXiv preprint arXiv:2101.07172

  55. He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778

  56. Woo S, Park J, Lee J-Y, Kweon IS (2018) Cbam: Convolutional block attention module. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 3–19

  57. Chen C-FR, Fan Q, Panda R (2021) Crossvit: Cross-attention multi-scale vision transformer for image classification. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 357–366

  58. Vázquez D, Bernal J, Sánchez FJ, Fernández-Esparrach G, López AM, Romero A, Drozdzal M, Courville A (2017) A benchmark for endoluminal scene segmentation of colonoscopy images. J Healthc Eng 2017:1–9. https://doi.org/10.1155/2017/4037190

  59. Jha D, Smedsrud PH, Riegler MA, Halvorsen P, Lange Td, Johansen D, Johansen HD (2020) Kvasir-seg: A segmented polyp dataset. In: International Conference on Multimedia Modeling, pp. 451–462. Springer

  60. Tajbakhsh N, Gurudu SR, Liang J (2015) Automated polyp detection in colonoscopy videos using shape and context information. IEEE Transact Med Imag 35(2):630–644


  61. Bernal J, Sánchez FJ, Fernández-Esparrach G, Gil D, Rodríguez C, Vilariño F (2015) Wm-dova maps for accurate polyp highlighting in colonoscopy: Validation vs. saliency maps from physicians. Comput Med Imag Graph 43:99–111


  62. Silva J, Histace A, Romain O, Dray X, Granado B (2014) Toward embedded detection of polyps in wce images for early diagnosis of colorectal cancer. Int J Comput Assist Radiol Surg 9(2):283–293


  63. Margolin R, Zelnik-Manor L, Tal A (2014) How to evaluate foreground maps? In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 248–255

  64. Fan D-P, Cheng M-M, Liu Y, Li T, Borji A (2017) Structure-measure: A new way to evaluate foreground maps. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 4548–4557

  65. Fan D-P, Gong C, Cao Y, Ren B, Cheng M-M, Borji A (2018) Enhanced-alignment measure for binary foreground map evaluation. arXiv preprint arXiv:1805.10421

  66. Fu S, Lu Y, Wang Y, Zhou Y, Shen W, Fishman E, Yuille A (2020) Domain adaptive relational reasoning for 3d multi-organ segmentation. In: International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 656–666. Springer

  67. Oktay O, Schlemper J, Folgoc LL, Lee M, Heinrich M, Misawa K, Mori K, McDonagh S, Hammerla NY, Kainz B, et al. (2018) Attention u-net: Learning where to look for the pancreas. arXiv preprint arXiv:1804.03999

  68. Wang H, Xie S, Lin L, Iwamoto Y, Han X-H, Chen Y-W, Tong R (2022) Mixed transformer u-net for medical image segmentation. In: ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 2390–2394. IEEE

  69. Jha D, Smedsrud PH, Riegler MA, Johansen D, De Lange T, Halvorsen P, Johansen HD (2019) Resunet++: An advanced architecture for medical image segmentation. In: 2019 IEEE International Symposium on Multimedia (ISM), pp. 225–2255. IEEE

  70. Badrinarayanan V, Kendall A, Cipolla R (2017) Segnet: A deep convolutional encoder-decoder architecture for image segmentation. IEEE Transact Patt Anal Mach Intell 39(12):2481–2495


  71. Jha D, Ali S, Tomar NK, Johansen HD, Johansen D, Rittscher J, Riegler MA, Halvorsen P (2021) Real-time polyp detection, localization and segmentation in colonoscopy using deep learning. IEEE Access 9:40496–40510


  72. Zhang Z, Liu Q, Wang Y (2018) Road extraction by deep residual u-net. IEEE Geosci Remote Sens Lett 15(5):749–753


  73. Fang Y, Chen C, Yuan Y, Tong K-y (2019) Selective feature aggregation network with area-boundary constraints for polyp segmentation. In: International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 302–310. Springer


Acknowledgements

This work was supported in part by the Provincial Natural Science Foundation of Anhui under Grant 1908085MF217, the Natural Science Research Project of Anhui Provincial Education Department under Grant KJ2019A0022918005 and the National Natural Science Foundation of China under Grant 62276146.

Author information

Correspondence to Hua Bao.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix



Ablation studies on the ISIC2018 dataset: To further validate the effectiveness of our proposed modules, we conduct ablation experiments on the ISIC2018 dataset. As shown in Table 9, removing any component of the proposed module degrades performance.

Table 9 Ablation studies on the ISIC 2018 dataset

Ablation on input resolution on the ISIC2018 dataset: To further evaluate the effectiveness of the proposed model at different resolutions, we carry out ablation studies with two input resolutions, \({512 \times 512}\) and \({256 \times 256}\), on the ISIC2018 dataset. Dermoscopic images are relatively clear RGB images, so high-resolution and low-resolution inputs have little impact on the experimental results. The detailed experimental data are shown in Table 10.

Table 10 Ablation study on the impact of input resolution on the ISIC2018 dataset

Ablation of the pre-trained model: To explore the effectiveness of pre-training in the proposed method, we conduct two comparative experiments: the ResNet-50 encoder with and without pre-trained weights. Figure 11 shows the Dice and loss curves during training. The results show that the network with pre-trained weights is easier to optimize and converges faster, which may benefit from the ability of the pre-trained model to capture useful information quickly and efficiently.
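The Dice coefficient tracked in these training curves measures overlap between a predicted mask A and the ground truth B as 2|A∩B| / (|A| + |B|). A minimal NumPy version for binary masks is sketched below (the smoothing constant `eps` is an assumed implementation detail, not taken from the paper):

```python
import numpy as np

def dice_score(pred, target, eps=1e-6):
    """Dice coefficient for binary masks: 2|A∩B| / (|A| + |B|).
    `eps` (an assumed smoothing constant) avoids division by zero
    when both masks are empty."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)

# Toy example: the prediction covers 2 of the 4 target pixels,
# plus 1 false-positive pixel.
pred = np.array([[1, 1, 0], [1, 0, 0], [0, 0, 0]])
target = np.array([[1, 1, 0], [0, 1, 0], [1, 0, 0]])
print(round(dice_score(pred, target), 4))  # → 0.5714 (i.e. 2*2 / (3+4))
```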

Fig. 11

Comparison between the network with and without pre-trained weights


Notation table: The variables and notation used in the paper are shown in Table 11.

Table 11 The notation explanation of the formula in this paper

Research on the influence of long-range dependencies on low-contrast image segmentation: Images with low contrast between target and background require full use of context information to model the relationship between them, so that the network can identify faint pixel-level differences between the target and the background and achieve more accurate segmentation. In the proposed method, the GFE module is designed to model these long-range dependencies. To verify their influence on low-contrast images, we display the heatmaps of the model with and without the GFE module. As can be seen from Fig. 12, the model with the GFE module performs significantly better on low-contrast images than the model without it.
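The long-range dependency modelling described above is in the spirit of non-local attention [21]. The paper's exact GFE design is given in its method section; the sketch below is only an illustrative self-attention pass over a flattened feature map (the projection sizes and random inputs are assumptions), showing how every spatial position can attend to every other position:

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def global_attention(x, wq, wk, wv):
    """Self-attention over all spatial positions of a (C, H, W) feature map,
    so each pixel aggregates information from every other pixel
    (long-range dependencies)."""
    c, h, w = x.shape
    tokens = x.reshape(c, h * w).T                    # (N, C), N = H*W positions
    q, k, v = tokens @ wq, tokens @ wk, tokens @ wv   # linear projections
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]))    # (N, N) pairwise weights
    out = attn @ v                                    # global aggregation
    return out.T.reshape(-1, h, w)                    # back to (C', H, W)

rng = np.random.default_rng(0)
x = rng.standard_normal((64, 8, 8))        # toy feature map
wq = rng.standard_normal((64, 32)) * 0.1   # illustrative projection sizes
wk = rng.standard_normal((64, 32)) * 0.1
wv = rng.standard_normal((64, 64)) * 0.1
y = global_attention(x, wq, wk, wv)
print(y.shape)  # (64, 8, 8): same spatial size, globally mixed features
```

Because the attention matrix couples all position pairs, faint foreground pixels can borrow evidence from distant, more confident regions, which is the intuition behind the heatmap improvement on low-contrast images.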

Fig. 12

The heatmap visualization comparison between the models with and without the GFE module. The pictures from left to right are the initial image, the corresponding ground truth, the heatmap of the model without the GFE module, and the heatmap of the model with the GFE module


Detailed neural architecture: To enhance reproducibility, we present the detailed configuration of the model. Figure 13 shows the configuration of the pipeline architecture and Fig. 14 shows the configuration of the GFE module.

Fig. 13

The overall neural architecture of the proposed network with detailed parameter settings; c denotes the number of classes

Fig. 14

The architecture of the GFE module with detailed parameter settings

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Bao, H., Li, Q. & Zhu, Y. Segmentation-based context-aware enhancement network for medical images. Int. J. Mach. Learn. & Cyber. 15, 963–983 (2024). https://doi.org/10.1007/s13042-023-01950-2

