ResU-KAN: a medical image segmentation model integrating residual convolutional attention and atrous spatial pyramid pooling


Abstract

With the rapid growth of medical imaging data, precise segmentation and analysis of medical images face unprecedented challenges. Improving the accuracy of early diagnosis from imaging data that is small in sample size, highly variable, and structurally complex has become a key problem in the medical field. This study proposes a Residual U-KAN model (ResU-KAN) to address this challenge and improve medical image segmentation accuracy. First, to remedy the model's weakness in capturing long-range dependencies and to mitigate potential gradient vanishing (or explosion) and overfitting, we introduce a Residual Convolution Attention (RCA) module. Second, to enlarge the model's receptive field while extracting multi-scale features, we introduce an Atrous Spatial Pyramid Pooling (ASPP) module. Finally, experiments on three publicly available medical imaging datasets and comparisons with existing state-of-the-art methods demonstrate the effectiveness of the proposed approach. Project page: https://github.com/Alfreda12/ResU-KAN
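The ASPP module named in the abstract follows the standard atrous-pyramid design popularized by DeepLab: parallel dilated convolutions at several rates plus an image-level pooling branch, concatenated and fused by a 1x1 convolution. The sketch below is a minimal PyTorch illustration of that general pattern, not the authors' implementation (the released code is at the project page above); the dilation rates, channel widths, and normalization choices here are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ASPP(nn.Module):
    """Atrous Spatial Pyramid Pooling (sketch): parallel dilated convolutions
    plus image-level pooling, concatenated and fused by a 1x1 convolution."""

    def __init__(self, in_channels: int, out_channels: int, rates=(6, 12, 18)):
        super().__init__()
        # 1x1 branch captures local context without dilation.
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(in_channels, out_channels, kernel_size=1, bias=False),
                nn.BatchNorm2d(out_channels),
                nn.ReLU(inplace=True),
            )
        ])
        # Parallel 3x3 atrous branches enlarge the receptive field at several scales.
        for r in rates:
            self.branches.append(
                nn.Sequential(
                    nn.Conv2d(in_channels, out_channels, kernel_size=3,
                              padding=r, dilation=r, bias=False),
                    nn.BatchNorm2d(out_channels),
                    nn.ReLU(inplace=True),
                )
            )
        # Image-level branch: global average pooling followed by a 1x1 conv.
        self.image_pool = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(in_channels, out_channels, kernel_size=1, bias=False),
            nn.BatchNorm2d(out_channels),
            nn.ReLU(inplace=True),
        )
        # Fuse all branches back to `out_channels`.
        self.project = nn.Sequential(
            nn.Conv2d(out_channels * (len(rates) + 2), out_channels,
                      kernel_size=1, bias=False),
            nn.BatchNorm2d(out_channels),
            nn.ReLU(inplace=True),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h, w = x.shape[-2:]
        feats = [branch(x) for branch in self.branches]
        # Upsample the pooled branch back to the input resolution before fusion.
        pooled = F.interpolate(self.image_pool(x), size=(h, w),
                               mode="bilinear", align_corners=False)
        feats.append(pooled)
        return self.project(torch.cat(feats, dim=1))


if __name__ == "__main__":
    # Hypothetical shape check: a (2, 512, 32, 32) feature map maps to (2, 256, 32, 32).
    aspp = ASPP(in_channels=512, out_channels=256)
    print(aspp(torch.randn(2, 512, 32, 32)).shape)
```

The dilation rates (6, 12, 18) follow the common DeepLab convention; the rates, placement in the network, and fusion details actually used in ResU-KAN may differ and are documented in the released code.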


Data Availability

https://github.com/Alfreda12/ResU-KAN/tree/main/inputs

Code Availability

https://github.com/Alfreda12/ResU-KAN


Funding

This work was supported in part by Shijiazhuang Introducing High-level Talents’ Startup Funding Project (248790067A), the Startup Foundation for PhD of Hebei GEO University (No. BQ201322), Natural Science Foundation of Hebei Province (H2024403001), and Science Research Project Funding from Hebei Provincial Department of Education (BJK2024099).

Author information


Contributions

Haibin Wang: Conceptualization, Funding Acquisition, Resources, Supervision, Writing - Review & Editing; Zhenfeng Zhao: Conceptualization, Investigation, Methodology, Data Curation, Visualization, Formal Analysis, Writing - Original Draft; Qi Liu: Resources, Supervision.

Corresponding author

Correspondence to Shenwen Wang.

Ethics declarations

Conflict of Interest

The authors have no competing interests to declare that are relevant to the content of this article.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Wang, H., Zhao, Z., Liu, Q. et al. ResU-KAN: a medical image segmentation model integrating residual convolutional attention and atrous spatial pyramid pooling. Appl Intell 55, 568 (2025). https://doi.org/10.1007/s10489-025-06467-5
