
Constructing Interpretable Decision Trees Using Parallel Coordinates

  • Conference paper
  • Book: Artificial Intelligence and Soft Computing (ICAISC 2020)
  • Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12416)

Abstract

Interest in interpretable models that are not only accurate but also understandable is growing rapidly, often leading the machine-learning community to turn to decision-tree classifiers. Many techniques for growing decision trees use oblique rules to increase the accuracy of the tree and decrease its overall size, but such rules severely limit understandability for a human user. We propose a new type of oblique rule for decision-tree classifiers that remains interpretable to human users. We use the parallel-coordinates system of visualisation to display both the dataset and the rule to the user in an intuitive way. We propose the use of an evolutionary algorithm to learn this new type of rule and show that it produces significantly smaller trees than trees created with axis-parallel rules, with minimal loss in accuracy.
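To make the contrast in the abstract concrete, the sketch below compares an axis-parallel test with a pairwise oblique test of the kind that can be drawn between two adjacent parallel-coordinate axes. The rule form and the names `w1`, `w2`, `t` are illustrative assumptions, not the paper's exact rule encoding.

```python
# Illustrative sketch: axis-parallel vs. pairwise oblique decision-tree tests.
# The oblique form below (w1*x[i] + w2*x[j] <= t) is an assumed example of a
# linear test over two attributes, not the paper's exact rule encoding.

def axis_parallel_rule(x, attr, threshold):
    """Classic decision-tree test: route left iff x[attr] <= threshold."""
    return x[attr] <= threshold

def oblique_rule(x, i, j, w1, w2, t):
    """Linear test over two attributes: route left iff w1*x[i] + w2*x[j] <= t."""
    return w1 * x[i] + w2 * x[j] <= t

sample = [1.0, 4.0, 2.5]
print(axis_parallel_rule(sample, 1, 5.0))           # True  (4.0 <= 5.0)
print(oblique_rule(sample, 1, 2, 1.0, -1.0, 1.0))   # False (4.0 - 2.5 = 1.5 > 1.0)
```

A single oblique test of this kind can replace a cascade of axis-parallel tests, which is why oblique splits tend to yield smaller trees.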


Notes

  1. We use only adjacent attributes in the visualisation as pairs, due to the computational cost of evaluating every possible attribute pair and of maintaining the user's mental map. This reduces the complexity from \(O(d^{2})\) to \(O(d)\), and the user can still permute attributes at will to create pairs.

  2. We run C4.5 and J48 only once on the full dataset when measuring the size and the depth of the deepest leaf, since these algorithms are completely deterministic.

  3. We make the source code for our algorithm available here.

  4. We thank the authors of the OC1-DE algorithm [33] for providing their source code.
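The complexity reduction described in note 1 can be sketched as follows; the function names are illustrative, not taken from the paper.

```python
# Sketch of note 1: restricting candidate attribute pairs to axes that are
# adjacent in the parallel-coordinates layout. Evaluating every pair costs
# O(d^2); evaluating only adjacent pairs costs O(d). Names are illustrative.

def all_pairs(d):
    """Every unordered attribute pair: d*(d-1)/2, i.e. O(d^2) candidates."""
    return [(i, j) for i in range(d) for j in range(i + 1, d)]

def adjacent_pairs(axis_order):
    """Only attributes drawn next to each other: O(d) candidates.
    The user can permute axis_order to bring any desired pair together."""
    return list(zip(axis_order, axis_order[1:]))

d = 6
print(len(all_pairs(d)))                    # 15 candidate pairs
print(len(adjacent_pairs(list(range(d)))))  # 5 candidate pairs
```

Because the user controls the axis ordering, any specific pair can still be exposed for evaluation by dragging two axes next to each other.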

References

  1. Adadi, A., Berrada, M.: Peeking inside the black-box: a survey on explainable artificial intelligence (XAI). IEEE Access 6, 52138–52160 (2018). https://doi.org/10.1109/ACCESS.2018.2870052

  2. Ala-Pietilä, P., et al.: European Union: General Data Protection Regulation (EU) 2016/679. Technical report, European Commission, B-1049 Brussels (8 April 2016)

  3. Arlot, S., Celisse, A.: A survey of cross-validation procedures for model selection. Stat. Surv. 4, 40–79 (2010)

  4. Blanco-Justicia, A., Domingo-Ferrer, J.: Machine learning explainability through comprehensible decision trees. In: Holzinger, A., Kieseberg, P., Tjoa, A.M., Weippl, E. (eds.) CD-MAKE 2019. LNCS, vol. 11713, pp. 15–26. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-29726-8_2

  5. Breiman, L., Friedman, J., Olshen, R., Stone, C.: Classification and Regression Trees. Wadsworth and Brooks, Monterey (1984)

  6. Calvo, B., Santafé, G.: scmamp: statistical comparison of multiple algorithms in multiple problems. The R J. 8(1) (August 2016)

  7. Cantú-Paz, E., Kamath, C.: Inducing oblique decision trees with evolutionary algorithms. IEEE Trans. Evol. Comput. 7(1), 54–68 (2003)

  8. Chaudhuri, S., Dayal, U.: An overview of data warehousing and OLAP technology. SIGMOD Rec. 26(1), 65–74 (1997). https://doi.org/10.1145/248603.248616

  9. Chaudhuri, S., Dayal, U., Narasayya, V.: An overview of business intelligence technology. Commun. ACM 54(8), 88–98 (2011). https://doi.org/10.1145/1978542.1978562

  10. Few, S.: Multivariate analysis using parallel coordinates. Perceptual Edge (12 September 2006). www.perceptualedge.com. Accessed 5 Nov 2019

  11. Forman, G., Scholz, M.: Apples-to-apples in cross-validation studies: pitfalls in classifier performance measurement. SIGKDD Explor. Newsl. 12(1), 49–57 (2010). https://doi.org/10.1145/1882471.1882479

  12. Freitas, A.A.: Comprehensible classification models: a position paper. SIGKDD Explor. 15(1), 1–10 (2013)

  13. Friedman, M.: The use of ranks to avoid the assumption of normality implicit in the analysis of variance. J. Am. Stat. Assoc. 32, 675–701 (1937)

  14. Heath, D., Kasif, S., Salzberg, S.: Induction of oblique decision trees. In: 13th International Joint Conference on Artificial Intelligence, pp. 1002–1007. Morgan Kaufmann (1993)

  15. Holzinger, A.: Interactive machine learning for health informatics: when do we need the human-in-the-loop? Brain Inform. 3(2), 119–131 (2016). https://doi.org/10.1007/s40708-016-0042-6

  16. Hommel, G.: A stagewise rejective multiple test procedure based on a modified Bonferroni test. Biometrika 75(2), 383–386 (1988). https://doi.org/10.1093/biomet/75.2.383

  17. Hutson, M.: Artificial intelligence faces reproducibility crisis. Science 359(6377), 725–726 (2018). https://doi.org/10.1126/science.359.6377.725

  18. Iman, R., Davenport, J.: Approximations of the critical region of the Friedman statistic. Commun. Stat. Theor. Meth. 9(6), 571–595 (1980)

  19. Inselberg, A.: Parallel Coordinates: Visual Multidimensional Geometry and Its Applications. Springer, New York (2009)

  20. Inselberg, A., Avidan, T.: Classification and visualization for high-dimensional data. In: 6th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 370–374. ACM, Boston, MA, USA (2000)

  21. Johansson, J., Forsell, C., Lind, M., Cooper, M.: Perceiving patterns in parallel coordinates: determining thresholds for identification of relationships. Inf. Vis. 7(2), 152–162 (2008). https://doi.org/10.1057/palgrave.ivs.9500166

  22. Lavrač, N.: Selected techniques for data mining in medicine. Artif. Intell. Med. 16(1), 3–23 (1999)

  23. Lichman, M.: UCI machine learning repository (2013). https://archive.ics.uci.edu/ml

  24. Mitchell, T.M.: Machine Learning. McGraw-Hill, New York (1997)

  25. Monroe, D.: AI, explain yourself. Commun. ACM 61(11), 11–13 (2018). https://doi.org/10.1145/3276742

  26. Moore, A., Murdock, V., Cai, Y., Jones, K.: Transparent tree ensembles. In: 41st International ACM SIGIR Conference on Research & Development in Information Retrieval (SIGIR 2018), pp. 1241–1244. ACM, New York (2018). https://doi.org/10.1145/3209978.3210151

  27. Mues, C., Huysmans, J., Vanthienen, J., Baesens, B.: Comprehensible credit-scoring knowledge visualization using decision tables and diagrams. In: Enterprise Information Systems VI, pp. 109–115. Springer (2006). https://doi.org/10.1007/1-4020-3675-2_13

  28. Murthy, S.K., Kasif, S., Salzberg, S.: A system for induction of oblique decision trees. J. Artif. Intell. Res. 2(1), 1–32 (1994)

  29. Murthy, S., Kasif, S., Salzberg, S., Beigel, R.: OC1: randomized induction of oblique decision trees. In: 11th National Conference on Artificial Intelligence (AAAI 1993), pp. 322–327. AAAI Press (1993)

  30. Pouyanfar, S., et al.: A survey on deep learning: algorithms, techniques, and applications. ACM Comput. Surv. 51(5), 1–36 (2018). https://doi.org/10.1145/3234150

  31. Quinlan, J.R., Rivest, R.L.: Inferring decision trees using the minimum description length principle. Inf. Comput. 80(3), 227–248 (1989). https://doi.org/10.1016/0890-5401(89)90010-2

  32. Quinlan, J.R.: C4.5: Programs for Machine Learning. Morgan Kaufmann, San Mateo, CA (1993)

  33. Rivera-Lopez, R., Canul-Reich, J., Gámez, J.A., Puerta, J.M.: OC1-DE: a differential evolution based approach for inducing oblique decision trees. In: Rutkowski, L., Korytkowski, M., Scherer, R., Tadeusiewicz, R., Zadeh, L.A., Zurada, J.M. (eds.) ICAISC 2017. LNCS (LNAI), vol. 10245, pp. 427–438. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-59063-9_38

  34. Storn, R., Price, K.: Differential evolution: a simple and efficient heuristic for global optimization over continuous spaces. J. Glob. Optim. 11(4), 341–359 (1997)

  35. Verbeke, W., Martens, D., Mues, C., Baesens, B.: Building comprehensible customer churn prediction models with advanced rule induction techniques. Expert Syst. Appl. 38(3), 2354–2364 (2011)

  36. Ware, M., Frank, E., Holmes, G., Hall, M.A., Witten, I.H.: Interactive machine learning: letting users build classifiers. Int. J. Hum. Comput. Stud. 55(3), 281–292 (2001)

  37. Wegman, E.J.: Hyperdimensional data analysis using parallel coordinates. J. Am. Stat. Assoc. 85(411), 664–675 (1990)

  38. Wilcoxon, F.: Individual comparisons by ranking methods. In: Breakthroughs in Statistics: Methodology and Distribution, pp. 196–202. Springer, New York (1992). https://doi.org/10.1007/978-1-4612-4380-9_16

  39. Yang, Y., Morillo, I.G., Hospedales, T.M.: Deep neural decision trees. In: ICML Workshop on Human Interpretability in Machine Learning (WHI 2018) (2018)


Author information

Correspondence to Vladimir Estivill-Castro.


Copyright information

© 2020 Springer Nature Switzerland AG

About this paper


Cite this paper

Estivill-Castro, V., Gilmore, E., Hexel, R. (2020). Constructing Interpretable Decision Trees Using Parallel Coordinates. In: Rutkowski, L., Scherer, R., Korytkowski, M., Pedrycz, W., Tadeusiewicz, R., Zurada, J.M. (eds) Artificial Intelligence and Soft Computing. ICAISC 2020. Lecture Notes in Computer Science(), vol 12416. Springer, Cham. https://doi.org/10.1007/978-3-030-61534-5_14


  • DOI: https://doi.org/10.1007/978-3-030-61534-5_14

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-61533-8

  • Online ISBN: 978-3-030-61534-5

  • eBook Packages: Computer Science; Computer Science (R0)
