Intelligible Models for HealthCare: Predicting the Probability of 6-Month Unfavorable Outcome in Patients with Ischemic Stroke

  • Original Article
  • Neuroinformatics

Abstract

Early prediction of unfavorable outcome after ischemic stroke is important for clinical management. Machine learning, as a novel computational modeling technique, could help clinicians address this challenge. We aim to investigate the applicability of machine learning models for individualized prediction in patients with ischemic stroke and to demonstrate the utility of several model-agnostic techniques for explaining machine learning predictions. A total of 499 consecutive patients with unfavorable [modified Rankin Scale (mRS) score 3–6, n = 140] or favorable (mRS score 0–2, n = 359) outcome 6 months after ischemic stroke were enrolled in this study. Four machine learning models, Random Forest (RF), eXtreme Gradient Boosting (XGBoost), Adaptive Boosting (AdaBoost) and Support Vector Machine (SVM), achieved areas under the curve (AUC) of (90.20 ± 0.22)%, (86.91 ± 1.05)%, (86.49 ± 2.35)% and (81.89 ± 2.40)%, respectively. Three global interpretability techniques and one local interpretability technique were applied and presented through visualization. The global techniques were Feature Importance, which shows the contribution of the selected features; Partial Dependence Plot, which visualizes the average effect of a feature on the predicted probability of unfavorable outcome; and Feature Interaction, which detects the change in the prediction that occurs when features are varied, after accounting for the individual feature effects. The local technique was the Shapley Value, which indicates the probability of unfavorable outcome for different instances. The current study therefore contributes to a better understanding of intelligible healthcare analytics through explanations of predictions at the local and global levels, and could potentially help reduce the mortality of patients with ischemic stroke by assisting clinicians in the decision-making process.
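As an illustration of the workflow summarized above, the following is a minimal sketch, not the authors' implementation. It assumes scikit-learn and uses synthetic data generated to mimic only the cohort size and class balance (499 samples, roughly 28% unfavorable); the feature names, hyperparameters and train/test split are hypothetical. It fits a Random Forest, estimates a cross-validated AUC, and applies two of the global techniques named in the abstract (feature importance, here computed by permutation, and partial dependence). Feature interaction and Shapley values are not covered.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import partial_dependence, permutation_importance
from sklearn.model_selection import cross_val_score, train_test_split

# Synthetic stand-in for the cohort (assumption): 499 samples, ~28% positive class.
X, y = make_classification(
    n_samples=499, n_features=6, n_informative=3, n_redundant=0,
    weights=[0.72, 0.28], random_state=0,
)
feature_names = [f"feature_{i}" for i in range(X.shape[1])]  # hypothetical names

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Hyperparameters are illustrative, not those of the study.
model = RandomForestClassifier(n_estimators=200, random_state=0)

# Discrimination: 5-fold cross-validated AUC on the training split.
cv_auc = cross_val_score(model, X_train, y_train, cv=5, scoring="roc_auc")
print(f"AUC: {cv_auc.mean():.3f} +/- {cv_auc.std():.3f}")

model.fit(X_train, y_train)

# Global technique 1: permutation feature importance, i.e. the drop in AUC
# on held-out data when one feature's values are shuffled.
importance = permutation_importance(
    model, X_test, y_test, scoring="roc_auc", n_repeats=10, random_state=0
)
ranking = np.argsort(importance.importances_mean)[::-1]
for idx in ranking:
    print(f"{feature_names[idx]}: {importance.importances_mean[idx]:.3f}")

# Global technique 2: partial dependence of the predicted probability of the
# positive class on the top-ranked feature, averaged over the test samples.
top_feature = int(ranking[0])
pd_result = partial_dependence(
    model, X_test, features=[top_feature], kind="average", grid_resolution=20
)
pd_curve = pd_result["average"][0]
print(f"Partial dependence range for {feature_names[top_feature]}: "
      f"{pd_curve.min():.3f} to {pd_curve.max():.3f}")
```

Permutation importance is used here because it is model-agnostic, so the same calls would apply unchanged to a boosting model or an SVM with probability outputs; whether it matches the importance measure used in the study is not stated in the abstract.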

Acknowledgments

This work was supported by the Double-Class University project [grant number CPU2018GY19]; the National Natural Science Foundation of China [grant numbers 81473274, 81673511]; and the Jiangsu Key Research and Development Plan [grant number BE2017613].

Information Sharing Statement

We used our own dataset in the present study to benchmark four machine learning methods. Raw data associated with any of the figures can be provided upon request from Dr. Jun Liao (Email: liaojun@cpu.edu.cn). There are no restrictions on data availability. The Python code for the proposed methods is available at https://github.com/cpufxb/Neuroinformatics_ML.

Author information

Corresponding author

Correspondence to Jun Liao.

Cite this article

Feng, X., Hua, Y., Zou, J. et al. Intelligible Models for HealthCare: Predicting the Probability of 6-Month Unfavorable Outcome in Patients with Ischemic Stroke. Neuroinform 20, 575–585 (2022). https://doi.org/10.1007/s12021-021-09535-6
