Abstract
Stroke is a serious, potentially fatal medical condition caused by a sudden loss of blood supply to large portions of the brain. Given the rising prevalence of stroke, it is critical to understand the many factors that contribute to it and to develop a robust prediction framework for identifying a person's stroke risk. This study investigates the effectiveness of several machine learning (ML) techniques, namely Decision Trees (DT), Extra Trees (ET), Random Forest (RF), and Voting Classifiers (VC), in predicting the risk of stroke. It also shows that whereas certain factors, such as age, gender, and smoking status, have a large impact, others, such as place of residence, have little effect and can be handled through careful feature selection. Principal component analysis (PCA), a dimensionality-reduction technique, is particularly effective when combined with a class-balancing method such as the Synthetic Minority Oversampling Technique (SMOTE), which is required for imbalanced datasets such as those in which only 5% of cases indicate stroke risk and 95% are non-stroke cases. SMOTE corrects this skew by generating synthetic minority-class samples from nearby minority samples rather than simply duplicating existing ones. We examine each algorithm's Receiver Operating Characteristic (ROC) scores and find that ET, RF, and VC achieve areas under the curve (AUC) greater than 0.95. After a thorough analysis across several performance metrics (accuracy, precision, recall, and F1 score), the voting ensemble is found to be a better option than current stroke detection methods. Interestingly, hypertension is identified as a key risk factor, with most hypertensive persons being at risk for stroke. There is also a strong correlation between cardiovascular disease and stroke, with most stroke cases occurring in people who already have a heart condition. Notably, whereas 5% of people with heart disease have a stroke, 95% of those without cardiac conditions never have one.
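To make the oversampling step in the abstract concrete, the following is a minimal sketch of the core SMOTE idea: a synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its k nearest minority neighbors. It is an illustration under stated assumptions, not the authors' implementation; the function name smote_sketch, its parameters, and the example values are hypothetical, and only the Python standard library is used.

```python
import math
import random


def smote_sketch(minority, n_new, k=5, seed=0):
    """Generate n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbors (core SMOTE idea)."""
    rng = random.Random(seed)
    n = len(minority)
    if n < 2:
        raise ValueError("need at least two minority samples")
    k = max(1, min(k, n - 1))  # cannot use more neighbors than exist
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(n)
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [j for j in range(n) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbor = minority[rng.choice(others[:k])]
        # Pick a random position on the segment between base and neighbor.
        gap = rng.random()
        synthetic.append(
            tuple(b + gap * (nb - b) for b, nb in zip(base, neighbor))
        )
    return synthetic


# Made-up (age, average glucose level) pairs standing in for stroke cases.
minority = [(60.0, 200.0), (70.0, 110.0), (50.0, 170.0), (80.0, 175.0)]
new_points = smote_sketch(minority, n_new=6, k=2)
print(len(new_points))
```

Because every synthetic point is a convex combination of two real minority samples, it always lies within the range spanned by the minority class. In practice, oversampling of this kind is normally applied to the training split only, so that the test set keeps the original class distribution.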
Data availability
Data is available on Kaggle (https://www.kaggle.com/datasets/fedesoriano/stroke-prediction-dataset/data).
Author information
Contributions
Conceptualization: Rafeeq Ahmed; Methodology: Rafeeq Ahmed, Zubair Ashraf; Formal analysis and data curation: Rafeeq Ahmed, Anmol Varshney; Writing (original draft preparation): Rafeeq Ahmed; Writing (review and editing): Rafeeq Ahmed, Anmol Varshney, Zubair Ashraf; Supervision: Rafeeq Ahmed.
Ethics declarations
Conflict of Interest
The authors declare no conflict of interest.
About this article
Cite this article
Ahmed, R., Varshney, A., Ashraf, Z. et al. Enhanced Stroke Risk Prediction: A Fusion of Machine Learning Models for Improved Healthcare Strategies. SN COMPUT. SCI. 5, 1078 (2024). https://doi.org/10.1007/s42979-024-03389-w
DOI: https://doi.org/10.1007/s42979-024-03389-w