Abstract
Interest in models that are not only accurate but also understandable is rapidly increasing, and the machine-learning community frequently turns to decision-tree classifiers as a result. Many techniques for growing decision trees use oblique rules to increase accuracy and decrease overall tree size, but such rules severely limit understandability for a human user. We propose a new type of oblique rule for decision-tree classifiers that remains interpretable to human users. We use the parallel-coordinates system of visualisation to display both the dataset and the rule to the user in an intuitive way. We propose an evolutionary algorithm to learn this new type of rule and show that it produces significantly smaller trees than trees built with axis-parallel rules, with minimal loss of accuracy.
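To illustrate the distinction the abstract draws, here is a minimal sketch (all function names, weights, and thresholds are illustrative, not the authors' implementation) contrasting a classic axis-parallel test with an oblique test over two attributes:

```python
# Sketch: axis-parallel vs. oblique split tests for a two-attribute sample.
# All names and constants here are hypothetical, chosen for illustration only.

def axis_parallel_split(x, threshold=2.5, attr=0):
    """Univariate test: compares a single attribute to a threshold.
    Easy for a human to read off a tree node."""
    return x[attr] <= threshold

def oblique_split(x, weights=(0.6, -0.8), bias=0.1):
    """Oblique test: a linear combination of attributes compared to zero.
    More expressive, but the weight vector is hard to interpret directly."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias <= 0

sample = (2.0, 1.5)
print(axis_parallel_split(sample))  # True: 2.0 <= 2.5
print(oblique_split(sample))        # 0.6*2.0 - 0.8*1.5 + 0.1 = 0.1 <= 0 -> False
```

The paper's contribution restricts oblique rules to a form that can be drawn and understood on a parallel-coordinates plot, rather than an arbitrary weight vector as above.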
Notes
1. We use only adjacent attributes in the visualisation as pairs, both because of the computational cost of evaluating every possible attribute pair and to maintain the user's mental map. This reduces the complexity from \(O(d^{2})\) to \(O(d)\), and the user can still permute attributes at will to create pairs.
2. We run C4.5 and J48 only once on the full dataset when measuring tree size and the depth of the deepest leaf, since these algorithms are completely deterministic.
3. We make the source code for our algorithm available here.
4. We thank the authors of the OC1-DE algorithm [33] for providing their source code.
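Note 1's restriction to adjacent attribute pairs can be sketched briefly (attribute names hypothetical): instead of evaluating all \(\binom{d}{2}\) pairs, only the \(d-1\) pairs of neighbouring axes in the current parallel-coordinates ordering are candidates.

```python
from itertools import combinations

def all_pairs(attrs):
    """Every attribute pair: O(d^2) candidates."""
    return list(combinations(attrs, 2))

def adjacent_pairs(attrs):
    """Only neighbours in the parallel-coordinates axis order: O(d) candidates."""
    return list(zip(attrs, attrs[1:]))

axes = ["a", "b", "c", "d", "e"]  # current axis ordering (illustrative)
print(len(all_pairs(axes)))       # 10 candidate pairs
print(len(adjacent_pairs(axes)))  # 4 candidate pairs

# The user may still permute the axis order to bring any two attributes together:
permuted = ["c", "a", "e", "b", "d"]
print(adjacent_pairs(permuted))
```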
References
Adadi, A., Berrada, M.: Peeking inside the black-box: a survey on explainable artificial intelligence (xai). IEEE Access 6, 52138–52160 (2018). https://doi.org/10.1109/ACCESS.2018.2870052
Ala-Pietilä, P., et al.: European union: general data protection regulation (EU) 2016/679. Technical report, European Commission, B-1049 Brussels (8 April 2016)
Arlot, S., Celisse, A., et al.: A survey of cross-validation procedures for model selection. Stat. Surv. 4, 40–79 (2010)
Blanco-Justicia, A., Domingo-Ferrer, J.: Machine learning explainability through comprehensible decision trees. In: Holzinger, A., Kieseberg, P., Tjoa, A.M., Weippl, E. (eds.) CD-MAKE 2019. LNCS, vol. 11713, pp. 15–26. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-29726-8_2
Breiman, L., Friedman, J., Olshen, R., Stone, C.: Classification and Regression Trees. Wadsworth and Brooks, Monterrey (1984)
Calvo, B., Santafé, G.: scmamp: statistical comparison of multiple algorithms in multiple problems. R J. 8(1) (2016)
Cantú-Paz, E., Kamath, C.: Inducing oblique decision trees with evolutionary algorithms. IEEE Trans. Evol. Comput. 7(1), 54–68 (2003)
Chaudhuri, S., Dayal, U.: An overview of data warehousing and OLAP technology. SIGMOD Rec. 26(1), 65–74 (1997). https://doi.org/10.1145/248603.248616
Chaudhuri, S., Dayal, U., Narasayya, V.: An overview of business intelligence technology. Commun. ACM 54(8), 88–98 (2011). https://doi.org/10.1145/1978542.1978562
Few, S.: Multivariate analysis using parallel coordinates. Perceptual Edge (September 12th 2006). www.perceptualedge.com. Accessed 5 Nov 2019
Forman, G., Scholz, M.: Apples-to-apples in cross-validation studies: Pitfalls in classifier performance measurement. SIGKDD Explor. Newsl. 12(1), 49–57 (2010). https://doi.org/10.1145/1882471.1882479
Freitas, A.A.: Comprehensible classification models: a position paper. SIGKDD Explor. 15(1), 1–10 (2013)
Friedman, M.: The use of ranks to avoid the assumption of normality implicit in the analysis of variance. J. Am. Stat. Assoc. 32, 675–701 (1937)
Heath, D., Kasif, S., Salzberg, S.: Induction of oblique decision trees. In: 13th International Joint Conference on Artificial Intelligence, pp. 1002–1007. Morgan Kaufmann (1993)
Holzinger, A.: Interactive machine learning for health informatics: when do we need the human-in-the-loop? Brain Inform. 3(2), 119–131 (2016). https://doi.org/10.1007/s40708-016-0042-6
Hommel, G.: A stagewise rejective multiple test procedure based on a modified Bonferroni test. Biometrika 75(2), 383–386 (1988). https://doi.org/10.1093/biomet/75.2.383
Hutson, M.: Artificial intelligence faces reproducibility crisis. Science 359(6377), 725–726 (2018). https://doi.org/10.1126/science.359.6377.725
Iman, R., Davenport, J.: Approximations of the critical region of the Friedman statistic. Commun. Stat. Theor. Meth. 9(6), 571–595 (1980)
Inselberg, A.: Parallel Coordinates: Visual Multidimensional Geometry and its Applications. Springer, New York (2009)
Inselberg, A., Avidan, T.: Classification and visualization for high-dimensional data. In: 6th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 20th–23rd August, pp. 370–374. ACM, Boston, MA, USA (2000)
Johansson, J., Forsell, C., Lind, M., Cooper, M.: Perceiving patterns in parallel coordinates: determining thresholds for identification of relationships. Inf. Vis. 7(2), 152–162 (2008). https://doi.org/10.1057/palgrave.ivs.9500166
Lavrac̆, N.: Selected techniques for data mining in medicine. Artif. Intell. Med. 16(1), 3–23 (1999)
Lichman, M.: UCI machine learning repository (2013). https://archive.ics.uci.edu/ml
Mitchell, T.M.: Machine Learning. McGraw-Hill, New York (1997)
Monroe, D.: AI, explain yourself. Commun. ACM 61(11), 11–13 (2018). https://doi.org/10.1145/3276742
Moore, A., Murdock, V., Cai, Y., Jones, K.: Transparent tree ensembles. In: 41st International ACM SIGIR Conference on Research & Development in Information Retrieval (SIGIR 2018), pp. 1241–1244. ACM, New York (2018). https://doi.org/10.1145/3209978.3210151
Mues, C., Huysmans, J., Vanthienen, J., Baesens, B.: Comprehensible credit-scoring knowledge visualization using decision tables and diagrams. In: Enterprise Information Systems VI, pp. 109–115. Springer (2006). https://doi.org/10.1007/1-4020-3675-2_13
Murthy, S.K., Kasif, S., Salzberg, S.: A system for induction of oblique decision trees. J. Artif. Int. Res. 2(1), 1–32 (1994)
Murthy, S., Kasif, S., Salzberg, S., Beigel, R.: OC1: randomized induction of oblique decision trees. In: 11th National Conference on Artificial Intelligence (AAAI 1993), pp. 322–327. AAAI Press (1993)
Pouyanfar, S., et al.: A survey on deep learning: algorithms, techniques, and applications. ACM Comput. Surv. 51(5), 1–36 (2018). https://doi.org/10.1145/3234150
Quinlan, J.R., Rivest, R.L.: Inferring decision trees using the minimum description length principle. Inf. Comput. 80(3), 227–248 (1989). https://doi.org/10.1016/0890-5401(89)90010-2
Quinlan, J.R.: C4.5: Programs for Machine Learning. Morgan Kaufmann Publishers, San Mateo, CA (1993)
Rivera-Lopez, R., Canul-Reich, J., Gámez, J.A., Puerta, J.M.: OC1-DE: a differential evolution based approach for inducing oblique decision trees. In: Rutkowski, L., Korytkowski, M., Scherer, R., Tadeusiewicz, R., Zadeh, L.A., Zurada, J.M. (eds.) ICAISC 2017. LNCS (LNAI), vol. 10245, pp. 427–438. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-59063-9_38
Storn, R., Price, K.: Differential evolution-a simple and efficient heuristic for global optimization over continuous spaces. J. Glob. Optim. 11(4), 341–359 (1997)
Verbeke, W., Martens, D., Mues, C., Baesens, B.: Building comprehensible customer churn prediction models with advanced rule induction techniques. Expert Syst. Appl. 38(3), 2354–2364 (2011)
Ware, M., Frank, E., Holmes, G., Hall, M.A., Witten, I.H.: Interactive machine learning: letting users build classifiers. Int. J. Hum. Comput. Stud. 55(3), 281–292 (2001)
Wegman, E.J.: Hyperdimensional data analysis using parallel coordinates. J. Am. Stat. Assoc. 85(411), 664–675 (1990)
Wilcoxon, F.: Individual comparisons by ranking methods. In: Breakthroughs in Statistics: Methodology and Distribution, pp. 196–202. Springer, NY (1992). https://doi.org/10.1007/978-1-4612-4380-9_16
Yang, Y., Morillo, I.G., Hospedales, T.M.: Deep neural decision trees. In: ICML Workshop on Human Interpretability in Machine Learning (WHI 2018) (2018)
Copyright information
© 2020 Springer Nature Switzerland AG
Cite this paper
Estivill-Castro, V., Gilmore, E., Hexel, R. (2020). Constructing Interpretable Decision Trees Using Parallel Coordinates. In: Rutkowski, L., Scherer, R., Korytkowski, M., Pedrycz, W., Tadeusiewicz, R., Zurada, J.M. (eds) Artificial Intelligence and Soft Computing. ICAISC 2020. Lecture Notes in Computer Science(), vol 12416. Springer, Cham. https://doi.org/10.1007/978-3-030-61534-5_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-61533-8
Online ISBN: 978-3-030-61534-5
eBook Packages: Computer Science (R0)