
Information Fusion

Volume 61, September 2020, Pages 124-138

Explainable decision forest: Transforming a decision forest into an interpretable tree

https://doi.org/10.1016/j.inffus.2020.03.013

Highlights

  • Decision forests can be transformed into single decision trees.

  • The generated tree often approximates the accuracy of its source forest.

  • The new tree provides interpretable classifications, as opposed to a random forest.

  • The generated tree outperforms similar existing approaches.

  • The resulting tree is usually more complex than existing tree models.

Abstract

Decision forests are considered the best practice in many machine learning challenges, mainly due to their superior predictive performance. However, simple models like decision trees may be preferred over decision forests in cases in which the generated predictions must be efficient or interpretable (e.g. in insurance or health-related use cases). This paper presents a novel method for transforming a decision forest into an interpretable decision tree, which aims at preserving the predictive performance of decision forests while enabling efficient classifications that can be understood by humans. This is done by creating a set of rule conjunctions that represent the original decision forest; the conjunctions are then hierarchically organized to form a new decision tree. We evaluate the proposed method on 33 UCI datasets and show that the resulting model usually approximates the ROC AUC gained by random forest while providing an interpretable decision path for each classification.

Introduction

Decision forest is an umbrella term for ensemble methods that combine multiple decision trees in supervised machine learning tasks. Their ability to aggregate different hypotheses rather than search for a single local optimum, along with their robustness to different sample sizes and feature spaces, makes them popular in many data science challenges [1], [2], [3]. However, despite decision forests' high degree of accuracy, other models may be preferable for two main reasons. First, classifications of decision forests are usually inefficient compared to single-classifier models, as many decision trees must be applied to generate a single classification. This attribute becomes a serious vulnerability in real-time predictive systems [4], [5]. Second, it is not easy to intuitively explain the rationale behind the classifications of decision forests, as each classification consists of the results of many trees. This issue usually prevents the use of decision forests in domains that require a clear explanation for individual decisions (e.g., medicine, insurance, etc.) [6].

Previous studies that addressed the abovementioned vulnerabilities of ensemble models can be categorized into two main approaches: ensemble pruning methods and ensemble-derived models. The objective of ensemble pruning methods is to search for a subset of ensemble members that performs at least as well as the source ensemble [7]. These methods were shown to significantly improve ensemble performance in terms of complexity and accuracy. The problem of interpretability nevertheless remains unsolved when using such methods, as the resulting ensemble still cannot be interpreted. The notion of deriving a single intelligible model from a given decision forest was tested in a few studies as well. One approach is to train a simple model, using a large set of synthetic or unlabeled data that was classified by a previously trained decision forest [8], [9]. However, this approach depends on unlabeled data, which limits its usage to cases in which unlabeled data is available or an unbiased procedure for generating a synthetic dataset exists. Another approach for transforming a decision forest into a single intelligible classifier is to add a post-processing step in which a decision tree is derived from the structure of the given decision forest [10], [11]. A substantial limitation of existing post-processing methods is their high complexity, which prevents their application to large decision forests. In addition, many hyperparameters must be tuned in order to find a suitable setting for a given case.
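To make the derived-model approach of [8], [9] concrete, the following minimal sketch (in Python with scikit-learn; the dataset, pool, and model sizes are illustrative assumptions rather than the setups used in the cited studies) lets a trained forest label a pool of surrogate-unlabeled samples and fits a single mimic tree on those labels:

```python
# Minimal sketch of deriving a simple model from a forest via unlabeled data
# ([8], [9]): the forest labels a pool of unlabeled samples, and a single
# decision tree is trained to mimic those labels. All sizes are illustrative.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=3000, n_features=20, random_state=0)
X_train, y_train = X[:1000], y[:1000]    # labeled training data
X_pool = X[1000:]                        # stands in for unlabeled data

forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)

# The forest acts as a labeling oracle; the tree learns the forest's
# behavior, not the (assumed unavailable) ground truth of the pool.
mimic_tree = DecisionTreeClassifier(max_depth=6, random_state=0)
mimic_tree.fit(X_pool, forest.predict(X_pool))
```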

This paper presents a scalable method for transforming a decision forest into a single decision tree. The resulting decision tree approximates the predictive performance of the original decision forest while providing intelligible and faster predictions. A decision tree was selected as the output model since it has been shown to be interpretable both in terms of its graphical model structure and its decomposability, i.e., each node and decision path corresponds to a plain textual description [12]. As opposed to similar methods, the proposed method is suitable for forests of any size and does not require complex hyperparameter tuning. The method includes two main stages. In the first stage, we create a conjunction set that represents the original decision forest. In the second stage, we build a decision tree that organizes the conjunction set in a tree structure. The remainder of the paper is structured as follows: In Section 2 we provide the scientific background and describe related work. In Section 3, we present the developed method. Section 4 presents an experimental evaluation and discusses its results. Section 5 concludes and suggests future research directions.
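As a rough illustration of the first stage, the sketch below (a simplification in Python with scikit-learn; tree_to_conjunctions is a hypothetical helper, and the paper's merging of conjunctions across trees is omitted) enumerates each tree's root-to-leaf paths as conjunctions of threshold conditions:

```python
# Simplified sketch of stage 1: represent every root-to-leaf path of every
# tree in the forest as a conjunction of (feature, op, threshold) conditions.
# The paper's cross-tree merging of conjunctions is omitted here.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
forest = RandomForestClassifier(n_estimators=10, random_state=0).fit(X, y)

def tree_to_conjunctions(estimator):
    """One (conditions, class_distribution) pair per leaf of a fitted tree."""
    t = estimator.tree_
    conjunctions = []

    def walk(node, conditions):
        if t.children_left[node] == -1:              # leaf node
            conjunctions.append((conditions, t.value[node][0]))
            return
        f, thr = t.feature[node], t.threshold[node]
        walk(t.children_left[node], conditions + [(f, "<=", thr)])
        walk(t.children_right[node], conditions + [(f, ">", thr)])

    walk(0, [])
    return conjunctions

conjunction_set = [c for est in forest.estimators_
                   for c in tree_to_conjunctions(est)]
print(len(conjunction_set), "path conjunctions extracted")
```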

Section snippets

Background

Ensemble models and specifically decision forests are considered the best practice in many supervised machine-learning tasks, mainly due to their superior predictive performance compared to other models [1], [2], [13]. Nonetheless, simple models like decision trees might be preferred over decision forests under some circumstances [6], [14], [15]. Building an interpretable decision tree that approximates the predictive performance of a given decision forest is the subject of this work. The …

Forest based tree (FBT)

This section presents a method that uses a trained decision forest to generate a single decision tree in a post-processing manner. We name the new model forest based tree (FBT). The main contribution of FBT is in expanding the range of models that can be used in cases where there is a trade-off between predictive performance and prediction time or prediction interpretability. In addition, in contrast to existing methods, this method can be applied to large decision forests without requiring …
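Continuing the stage-1 sketch above, the second stage can be pictured as recursively organizing the extracted conjunction set into a tree. The heuristic below, splitting on the most frequent (feature, threshold) test, is a stand-in assumption and not FBT's actual splitting criterion:

```python
# Hypothetical sketch of stage 2: organize a conjunction set (as produced by
# the stage-1 sketch) into a tree by recursively splitting on a threshold
# test. The most-frequent-test heuristic is a placeholder; FBT's actual
# splitting criterion differs.
from collections import Counter

def violates(conjunction, f, thr, branch):
    """True if the conjunction contradicts the branch condition on feature f."""
    conditions, _ = conjunction
    for cf, op, cthr in conditions:
        if cf != f:
            continue
        if branch == "<=" and op == ">" and cthr >= thr:
            return True                  # requires x_f > cthr >= thr
        if branch == ">" and op == "<=" and cthr <= thr:
            return True                  # requires x_f <= cthr <= thr
    return False

def build_node(conjunctions, used=frozenset(), min_size=2):
    # Candidate tests are (feature, threshold) pairs not yet used on this path.
    counts = Counter((f, thr) for conds, _ in conjunctions
                     for f, _, thr in conds if (f, thr) not in used)
    if len(conjunctions) <= min_size or not counts:
        return {"leaf": conjunctions}
    (f, thr), _ = counts.most_common(1)[0]
    used = used | {(f, thr)}
    # A conjunction follows every branch it does not contradict.
    left = [c for c in conjunctions if not violates(c, f, thr, "<=")]
    right = [c for c in conjunctions if not violates(c, f, thr, ">")]
    return {"test": (f, thr),
            "left": build_node(left, used, min_size),
            "right": build_node(right, used, min_size)}

fbt_like_tree = build_node(conjunction_set)   # conjunction_set from stage 1
```

Note that in this sketch a conjunction that does not constrain the chosen feature follows both branches, which is one reason such a tree can grow larger than any single source tree (cf. the highlight on model complexity above).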

Experimental evaluation

The effectiveness of the proposed forest based tree was evaluated by carrying out an experimental study as described below. The experimental study compared different variations of the forest based tree with several benchmark classifiers by considering two evaluation criteria: predictive performance and classification complexity. The predictive performance was assessed using the multiclass extension of the ROC AUC measure that aggregates the ROC AUC values over each pair of classes [66]. ROC AUC …
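For reference, the pairwise (one-vs-one) multiclass ROC AUC described here is exposed directly by scikit-learn; the dataset and model below are placeholders for the paper's actual experimental setup on the 33 UCI datasets:

```python
# One-vs-one multiclass ROC AUC (averaged over class pairs, [66]) as exposed
# by scikit-learn. Dataset and model are placeholders, not the paper's setup.
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

X, y = load_wine(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

clf = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)
proba = clf.predict_proba(X_te)              # shape (n_samples, n_classes)
auc = roc_auc_score(y_te, proba, multi_class="ovo", average="macro")
print(f"one-vs-one macro ROC AUC: {auc:.3f}")
```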

Conclusion and future work

In this paper we presented a novel method for building an intelligible decision tree based on a given decision forest. The resulting tree often approximates the predictive performance obtained by the source forest while significantly reducing its prediction complexity. The new tree also provides a decision path as an explanatory mechanism for its classifications. As opposed to existing methods that aim to achieve the same objective, the proposed method does not require the availability of …

CRediT authorship contribution statement

Omer Sagi: Conceptualization, Methodology, Software, Formal analysis, Investigation, Resources, Visualization, Writing - original draft, Data curation, Validation. Lior Rokach: Conceptualization, Supervision, Writing - review & editing, Validation.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References

  • N.C. Oza et al.

    Classifier ensembles: select real-world applications

    Inf. Fusion

    (2008)
  • Z.-H. Zhou et al.

    Selective ensemble of decision trees

    Rough sets, fuzzy sets, data mining, and granular computing

    (2003)
  • R.C. Fong et al.

    Interpretable explanations of black boxes by meaningful perturbation

    Proceedings of the IEEE International Conference on Computer Vision

    (2017)
  • T. Chen et al.

XGBoost: a scalable tree boosting system

Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

    (2016)
  • I. Partalas et al.

    Focused ensemble selection: a diversity-based method for greedy ensemble selection.

    ECAI

    (2008)
  • A.A. Freitas

    Comprehensible classification models: a position paper

    ACM SIGKDD Explor. Newslett.

    (2014)
  • Y. Zhang et al.

    Ensemble pruning via semi-definite programming

    J. Mach. Learn. Res.

    (2006)
• C. Buciluǎ et al.

    Model compression

    Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

    (2006)
  • A. Van Assche et al.

    Seeing the forest through the trees: Learning a comprehensible model from an ensemble

    European Conference on Machine Learning

    (2007)
  • G. Vandewiele et al.

GENESIM: genetic extraction of a single, interpretable model

NIPS 2016, the 30th Conference on Neural Information Processing Systems

    (2016)
  • Z.C. Lipton, The mythos of model interpretability,...
  • T.G. Dietterich

    Ensemble methods in machine learning

    International Workshop on Multiple Classifier Systems

    (2000)
  • B. Letham et al.

    Interpretable classifiers using rules and Bayesian analysis: building a better stroke prediction model

    Ann. Appl. Stat.

    (2015)
  • M.A. Ahmad et al.

    Interpretable machine learning in healthcare

    Proceedings of the 2018 ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics

    (2018)
  • R. Guidotti et al.

    A survey of methods for explaining black box models

    ACM Comput. Surv.

    (2018)
  • M.T. Ribeiro et al.

Why should I trust you?: Explaining the predictions of any classifier

    Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

    (2016)
  • I. Bratko

    Machine learning: between accuracy and interpretability

    Learning, Networks and Statistics

    (1997)
  • D. Gunning

Explainable artificial intelligence (XAI)

Defense Advanced Research Projects Agency (DARPA)

    (2017)
  • A. Adadi et al.

Peeking inside the black-box: a survey on explainable artificial intelligence (XAI)

    IEEE Access

    (2018)
  • T.Z. Zarsky

    Incompatible: the GDPR in the age of big data

    Seton Hall Law Rev.

    (2016)
  • B. Goodman et al.

European Union regulations on algorithmic decision-making and a “right to explanation”

    AI Mag.

    (2017)
  • S.M. Lundberg et al.

    A unified approach to interpreting model predictions

    Advances in Neural Information Processing Systems

    (2017)
  • J. Bien et al.

    Prototype selection for interpretable classification

    Ann. Appl. Stat.

    (2011)
  • E. Štrumbelj et al.

    Explaining prediction models and individual predictions with feature contributions

    Knowl. Inf. Syst.

    (2014)
  • F. Wang et al.

    Falling rule lists

    Artificial Intelligence and Statistics

    (2015)
  • H. Lakkaraju, E. Kamar, R. Caruana, J. Leskovec, Interpretable & explorable approximations of black box models,...