
Interpretability, personalization and reliability of a machine learning based clinical decision support system

Published in: Data Mining and Knowledge Discovery

Abstract

Artificial intelligence (AI) has achieved notable performance in many fields, and its research impact in healthcare is unquestionable. Nevertheless, the deployment of such computational models in clinical practice remains limited. Among the major barriers to successful real-world machine learning applications are the lack of transparency, reliability, and personalization. These aspects are decisive not only for patient safety but also for securing the confidence of clinical professionals. Explainable AI seeks to address these transparency and reliability concerns, making it possible to better understand and trust a model and to justify its outcomes, thereby effectively assisting clinicians in rationalizing the model's predictions. This work proposes an innovative machine learning based approach that implements a hybrid scheme combining, in a systematic way, knowledge-driven and data-driven techniques. In a first step, a global set of interpretable rules is generated, founded on clinical evidence. In a second phase, a machine learning model is trained to select, from the global set of rules, the subset most appropriate for a given patient, according to that patient's particular characteristics. This approach simultaneously addresses three of the central requirements of explainable AI (interpretability, personalization, and reliability) without impairing the accuracy of the model's predictions. The scheme was validated on a real dataset provided by two Portuguese hospitals, Santa Cruz Hospital (Lisbon) and Santo André Hospital (Leiria), comprising a total of N = 1111 patients who suffered an acute coronary syndrome event and whose 30-day mortality was assessed. Compared with standard black-box structures (e.g., a feedforward neural network), the proposed scheme achieves similar performance while simultaneously ensuring clinical interpretability and personalization of the model and providing a level of reliability for the estimated mortality risk.
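To make the two-stage hybrid scheme described above concrete, the following is a minimal Python sketch of the general idea. The rule thresholds, the feature names (age, sbp, hr), the choice of a random forest as the per-rule selector, and the surrogate criterion for rule "appropriateness" are illustrative assumptions made only for this sketch; they are not the authors' actual rule set, selection model, or training procedure.

import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Stage 1: a global set of interpretable, clinically inspired rules.
# Each rule maps a patient record (a dict of clinical features) to a 0/1 risk vote.
# These thresholds are illustrative, not the rules generated in the paper.
GLOBAL_RULES = [
    ("age > 75 years",         lambda p: p["age"] > 75),
    ("systolic BP < 100 mmHg", lambda p: p["sbp"] < 100),
    ("heart rate > 110 bpm",   lambda p: p["hr"] > 110),
]

def rule_votes(patient):
    # Evaluate every global rule on one patient; returns a 0/1 vector.
    return np.array([int(rule(patient)) for _, rule in GLOBAL_RULES])

# Stage 2: models that learn, from a patient's characteristics, which subset of
# rules is most appropriate for that patient (here, one selector per rule).
selectors = [RandomForestClassifier(n_estimators=100, random_state=0)
             for _ in GLOBAL_RULES]

def fit_selectors(X_feats, patients, y_outcome):
    # A rule is treated as "appropriate" for a patient when its vote matches
    # the observed 30-day outcome (an assumed surrogate criterion).
    votes = np.array([rule_votes(p) for p in patients])
    for j, sel in enumerate(selectors):
        sel.fit(X_feats, (votes[:, j] == y_outcome).astype(int))

def predict_risk(x_feat, patient):
    # Combine only the rules that the selectors deem appropriate for this
    # particular patient; fall back to all rules if none is selected.
    votes = rule_votes(patient)
    keep = np.array([sel.predict(x_feat.reshape(1, -1))[0] for sel in selectors])
    return votes[keep == 1].mean() if keep.any() else votes.mean()

In this sketch, each selector learns whether a given rule has tended to agree with the observed outcome for patients with similar characteristics, which is one plausible way to personalize the subset of rules applied to a new patient.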





Acknowledgements

This work was supported by the lookAfterRisk research project (POCI-01-0145-FEDER-030290). The authors would also like to thank the Santo André Hospital/Leiria Hospital Centre and the Santa Cruz Hospital, Lisbon, for providing the clinical datasets used in this study.

Author information


Corresponding author

Correspondence to F. Valente.

Additional information

Communicated by Martin Atzmueller, Johannes Fürnkranz, Tomáš Kliegr and Ute Schmid.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Valente, F., Paredes, S., Henriques, J. et al. Interpretability, personalization and reliability of a machine learning based clinical decision support system. Data Min Knowl Disc 36, 1140–1173 (2022). https://doi.org/10.1007/s10618-022-00821-8

