Enhanced Stroke Risk Prediction: A Fusion of Machine Learning Models for Improved Healthcare Strategies

  • Original Research
SN Computer Science

Abstract

Stroke is a serious and potentially fatal medical condition caused by a sudden loss of blood supply to large portions of the brain. Given the rising prevalence of strokes, it is critical to understand the many factors that contribute to them and to develop a robust prediction framework that identifies a person's risk for stroke. This study investigates the effectiveness of several machine learning (ML) techniques, namely Decision Trees (DT), Extra Trees (ET), Random Forest (RF), and Voting Classifiers (VC), in predicting the risk of stroke. It also shows that some factors, such as age, gender, and smoking status, have a large impact, whereas others, such as place of residence, have little effect and can be handled through careful feature selection. Principal component analysis (PCA), an approach for reducing dimensionality, is particularly effective when combined with class-balancing methods such as the Synthetic Minority Oversampling Technique (SMOTE). Class balancing is required for imbalanced datasets, such as those in which only 5% of cases indicate stroke risk and 95% are non-stroke cases. SMOTE corrects this skew by generating synthetic minority-class samples, each interpolated between an existing minority sample and one of its nearest minority-class neighbours. We examine each algorithm's Receiver Operating Characteristic (ROC) curve and find that ET, RF, and VC have areas under the curve larger than 0.95. After a thorough analysis that considers performance criteria such as recall, accuracy, F1 score, and precision, the Voting Ensemble approach is found to be a better option than current stroke detection methods. Hypertension is identified as a key risk factor, with most hypertensive persons being at risk for stroke. There is also a strong correlation between cardiovascular disease and stroke, with most stroke cases occurring in people who already have a heart condition. Notably, 5% of people with heart disease have a stroke, whereas 95% of those without a cardiac condition never do.
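The oversampling step mentioned in the abstract can be illustrated with a minimal sketch of SMOTE-style interpolation as the technique is commonly defined: pick a minority sample, pick one of its nearest minority neighbours, and place a new point somewhere on the segment between them. This is not the authors' implementation; the function name `smote_oversample`, the parameters `k` and `seed`, and the toy data are all hypothetical and chosen only for illustration.

```python
import math
import random


def smote_oversample(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority points by SMOTE-style interpolation.

    Hypothetical helper for illustration; not the paper's code.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest minority neighbours of `base`, excluding itself.
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # position along the segment from base to other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Toy, made-up minority class with two scaled features; not data from the paper.
minority = [[0.70, 0.80], [0.75, 0.90], [0.80, 0.85], [0.90, 0.95]]
new_points = smote_oversample(minority, n_new=6, k=2, seed=42)
for point in new_points:
    print([round(v, 3) for v in point])
```

Because every synthetic point lies between two existing minority samples, the new points stay inside the region the minority class already occupies rather than duplicating records exactly. In practice, oversampling of this kind is usually applied to the training split only, so that evaluation data contain no synthetic cases.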

Data availability

Data is available on Kaggle (https://www.kaggle.com/datasets/fedesoriano/stroke-prediction-dataset/data).
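As a rough sketch, the class imbalance described in the abstract could be checked on a downloaded copy of the data along the following lines. The label column name `stroke` and the helper `class_shares` are assumptions made for illustration, and the rows below are made up to mimic a 5%/95% split; they are not taken from the dataset.

```python
import csv
import io
from collections import Counter


def class_shares(csv_stream, label_column="stroke"):
    """Return the fraction of rows per label value.

    `label_column` is an assumed column name, not confirmed by the article.
    """
    reader = csv.DictReader(csv_stream)
    counts = Counter(row[label_column] for row in reader)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no rows found")
    return {label: n / total for label, n in counts.items()}


# Made-up rows: 19 non-stroke cases and 1 stroke case.
rows = ["age,hypertension,stroke"] + ["50,0,0"] * 19 + ["70,1,1"]
sample = io.StringIO("\n".join(rows))
shares = class_shares(sample)
print(shares)
```

For a real file, the same function can be given an open file handle (opened with `newline=""`, as the `csv` module recommends) instead of the in-memory sample.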

Author information

Contributions

Conceptualization: Rafeeq Ahmed; Methodology: Rafeeq Ahmed, Zubair Ashraf; Formal analysis and data curation: Rafeeq Ahmed, Anmol Varshney; Writing – original draft preparation: Rafeeq Ahmed; Writing – review and editing: Rafeeq Ahmed, Anmol Varshney, and Zubair Ashraf; Supervision: Rafeeq Ahmed.

Corresponding author

Correspondence to Rafeeq Ahmed.

Ethics declarations

Conflict of Interest

The authors declare no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Ahmed, R., Varshney, A., Ashraf, Z. et al. Enhanced Stroke Risk Prediction: A Fusion of Machine Learning Models for Improved Healthcare Strategies. SN COMPUT. SCI. 5, 1078 (2024). https://doi.org/10.1007/s42979-024-03389-w

  • DOI: https://doi.org/10.1007/s42979-024-03389-w
