Explaining the Predictions of an Arbitrary Prediction Model: Feature Contributions and Quasi-nomograms

Human and Machine Learning

Part of the book series: Human–Computer Interaction Series ((HCIS))

Abstract

Acquisition of knowledge from data is the quintessential task of machine learning. The knowledge extracted this way might not be suitable for immediate use, so one or more data postprocessing methods may need to be applied as well. Data postprocessing includes the integration, filtering, evaluation, and explanation of acquired knowledge. Nomograms, graphical devices for approximate calculations of functions, are a useful tool for visualising and comparing prediction models. It is well known that any generalised additive model can be represented by a quasi-nomogram, that is, a nomogram that requires the reader to perform some summation by hand. Nomograms of this type are widely used, especially in medical prognostics. Methods for constructing such a nomogram were developed for specific types of prediction models and therefore assume that the structure of the model is known. In this chapter we extend our previous work on a general method for explaining arbitrary prediction models (classification or regression) to a general methodology for constructing a quasi-nomogram for a black-box prediction model. We show that for an additive model, such a quasi-nomogram is equivalent to the one we would construct if the structure of the model were known.
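
The equivalence claim for additive models can be illustrated with a minimal sketch. This is not the chapter's algorithm: it uses a simple averaging scheme (fix one feature, average the model's predictions over a background set, subtract the mean prediction), and the names `black_box`, `effect`, and `background` are hypothetical. For an additive model, the estimate obtained by querying the model as a black box matches the centred effect function one would read off if the structure were known.

```python
from statistics import mean

def black_box(x):
    """Hypothetical additive model; the code below only calls it, never inspects it."""
    return 2.0 * x[0] + x[1] ** 2

def effect(model, background, feature, value):
    """Average change in prediction when `feature` is fixed to `value`."""
    baseline = mean(model(row) for row in background)
    fixed = mean(
        model(row[:feature] + [value] + row[feature + 1:]) for row in background
    )
    return fixed - baseline

# Assumed background data: a small grid over both features.
background = [[a, b] for a in (0.0, 1.0, 2.0) for b in (-1.0, 0.0, 1.0)]

# Black-box estimates of the per-feature effects at chosen values.
est_0 = effect(black_box, background, 0, 1.5)
est_1 = effect(black_box, background, 1, 2.0)

# Centred effects computed from the known structure f_0(v) = 2v, f_1(v) = v^2.
known_0 = 2.0 * 1.5 - mean(2.0 * row[0] for row in background)
known_1 = 2.0 ** 2 - mean(row[1] ** 2 for row in background)

print(est_0, known_0)
print(est_1, known_1)
```

Each pair of printed values should agree up to floating-point error. Plotting `effect` over a range of values for each feature would give one axis of a quasi-nomogram, with the per-feature readings summed by the reader.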

Notes

  1. www.sciencedirect.com currently lists 1393 research papers that feature the word “nomogram” in the title, keywords, or abstract and were published between 2006 and 2015. Most of them are from the medical field.

  2. Linear regression is, of course, just a special case of a generalised additive model with an identity link function and linear effect functions.
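
Note 2 can be written out explicitly. The symbols below (link function g, intercept β_0, effect functions f_i, coefficients β_i) follow common textbook notation and are assumptions, not taken from the chapter. Choosing the identity link and linear effect functions reduces the generalised additive model on the left to ordinary linear regression on the right:

```latex
g\bigl(\mathbb{E}[y \mid x_1,\dots,x_n]\bigr) = \beta_0 + \sum_{i=1}^{n} f_i(x_i)
\quad\xrightarrow{\; g(\mu)=\mu,\;\; f_i(x_i)=\beta_i x_i \;}\quad
\mathbb{E}[y \mid x_1,\dots,x_n] = \beta_0 + \sum_{i=1}^{n} \beta_i x_i
```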

Author information

Corresponding author

Correspondence to Erik Štrumbelj.

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this chapter

Cite this chapter

Štrumbelj, E., Kononenko, I. (2018). Explaining the Predictions of an Arbitrary Prediction Model: Feature Contributions and Quasi-nomograms. In: Zhou, J., Chen, F. (eds) Human and Machine Learning. Human–Computer Interaction Series. Springer, Cham. https://doi.org/10.1007/978-3-319-90403-0_8

  • DOI: https://doi.org/10.1007/978-3-319-90403-0_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-90402-3

  • Online ISBN: 978-3-319-90403-0

  • eBook Packages: Computer Science (R0)