Intelligent mask image reconstruction for cardiac image segmentation through local–global fusion

Applied Intelligence

Abstract

Accurate segmentation of cardiac structures in magnetic resonance imaging (MRI) is essential for reliable diagnosis and management of cardiovascular disease. Although numerous robust models have been proposed, no single segmentation model consistently outperforms others across all cases, and models that excel on one dataset may not achieve similar accuracy on others or when the same dataset is expanded. This study introduces FCTransNet, an ensemble-based computer-aided diagnosis system that leverages the complementary strengths of Vision Transformer (ViT) models (specifically TransUNet, SwinUNet, and SegFormer) to address these challenges. To achieve this, we propose a novel pixel-level fusion technique, the Intelligent Weighted Summation Technique (IWST), which reconstructs the final segmentation mask by integrating the outputs of the ViT models and accounting for their diversity. First, a dedicated U-Net module isolates the region of interest (ROI) from cine MRI images, which is then processed by each ViT to generate preliminary segmentation masks. The IWST subsequently fuses these masks to produce a refined final segmentation. By using a local window around each pixel, IWST captures specific neighborhood details while incorporating global context to enhance segmentation accuracy. Experimental validation on the ACDC dataset shows that FCTransNet significantly outperforms individual ViTs and other deep learning-based methods, achieving a Dice Score (DSC) of 0.985 and a mean Intersection over Union (IoU) of 0.914 in the end-diastolic phase. In addition, FCTransNet maintains high accuracy in the end-systolic phase with a DSC of 0.989 and an IoU of 0.908. These results underscore FCTransNet’s ability to improve cardiac MRI segmentation accuracy.
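The abstract describes IWST only at a high level: a pixel-level weighted summation of the ViT masks that combines a local window around each pixel with global context. The exact weighting formula is not given here, so the following is a minimal sketch under stated assumptions, not the authors' algorithm. It assumes binary masks, measures each model's agreement with the soft vote of all models, and blends a windowed (local) agreement term with a whole-image (global) one. The names `fuse_masks`, `local_mean`, `dice_and_iou`, and the parameters `window`, `alpha`, and `threshold` are hypothetical and chosen for illustration only.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def local_mean(x, window):
    """Mean of x over a window x window neighborhood of each pixel (edge-padded)."""
    if window % 2 == 0:
        raise ValueError("window must be odd")
    pad = window // 2
    padded = np.pad(x, pad, mode="edge")
    return sliding_window_view(padded, (window, window)).mean(axis=(-2, -1))


def fuse_masks(masks, window=3, alpha=0.5, threshold=0.5):
    """Fuse K binary masks of shape (K, H, W) into one (H, W) binary mask.

    Illustrative weighting (an assumption, not the published IWST):
      local term  = agreement with the soft vote inside a window around the pixel
      global term = agreement with the soft vote over the whole image
    """
    masks = np.asarray(masks, dtype=float)
    consensus = masks.mean(axis=0)                # soft vote in [0, 1]
    agreement = 1.0 - np.abs(masks - consensus)   # (K, H, W); 1 means full agreement

    local_w = np.stack([local_mean(a, window) for a in agreement])
    global_w = agreement.mean(axis=(1, 2))[:, None, None]

    weights = alpha * local_w + (1.0 - alpha) * global_w
    weights /= weights.sum(axis=0, keepdims=True)  # normalize across models

    fused = (weights * masks).sum(axis=0)          # per-pixel weighted summation
    return (fused >= threshold).astype(np.uint8)


def dice_and_iou(pred, target):
    """Dice score and IoU for two binary masks (both 1.0 when both masks are empty)."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    inter = np.logical_and(pred, target).sum()
    total = pred.sum() + target.sum()
    union = total - inter
    dice = 2.0 * inter / total if total else 1.0
    iou = inter / union if union else 1.0
    return float(dice), float(iou)


if __name__ == "__main__":
    # Toy example: two models agree on a 3x3 square, a third predicts nothing.
    square = np.zeros((5, 5), dtype=np.uint8)
    square[1:4, 1:4] = 1
    empty = np.zeros_like(square)
    fused = fuse_masks([square, square, empty])
    print(dice_and_iou(fused, square))
```

In this sketch, `alpha` trades off the local and global terms: `alpha=1.0` uses only the windowed agreement and `alpha=0.0` reduces to a single scalar weight per model. In the pipeline described above, the three input masks would come from TransUNet, SwinUNet, and SegFormer applied to the U-Net-cropped ROI.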

Data availability

These data were obtained from https://www.creatis.insa-lyon.fr/Challenge/acdc/databases.html

Acknowledgements

We are grateful to the Direction Generale de la Recherche Scientifique et du Developpement Technologique (DGRSDT) which kindly supported this research, as well as to the Laboratoire de Gestion Electronique du Document (LabGED) where this study was conducted.

Funding

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Author information

Corresponding author

Correspondence to Assia Boukhamla.

Ethics declarations

Conflict of interest

The authors declare no conflicts of interest regarding the publication of this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Boukhamla, A., Azizi, N. & Belhaouari, S.B. Intelligent mask image reconstruction for cardiac image segmentation through local–global fusion. Appl Intell 55, 257 (2025). https://doi.org/10.1007/s10489-024-06085-7

  • DOI: https://doi.org/10.1007/s10489-024-06085-7
