Explainable decision forest: Transforming a decision forest into an interpretable tree
Introduction
Decision forest is an umbrella term for ensemble methods that combine multiple decision trees in supervised machine learning tasks. Their ability to aggregate different hypotheses rather than search for a local optimum, along with their robustness to different sample sizes and feature spaces, makes them popular in many data science challenges [1], [2], [3]. However, despite the high accuracy of decision forests, other models may be preferable for two main reasons. First, classification with a decision forest is usually inefficient compared to a single-classifier model, as many decision trees must be applied to generate a single classification. This attribute becomes a serious vulnerability in real-time predictive systems [4], [5]. Second, it is not easy to intuitively explain the rationale behind the classifications of a decision forest, as each classification aggregates the results of many trees. This issue usually prevents the use of decision forests in domains that require a clear explanation for individual decisions (e.g., medicine, insurance, etc.) [6].
Previous studies that addressed the above-mentioned vulnerabilities of ensemble models can be categorized into two main approaches: ensemble pruning methods and ensemble-derived models. The objective of ensemble pruning methods is to search for a subset of ensemble members that performs at least as well as the source ensemble [7]. These methods were shown to significantly improve ensemble performance in terms of complexity and accuracy. The problem of interpretability nevertheless remains unsolved when using such methods, as the resulting ensemble still cannot be interpreted. The notion of deriving a single intelligible model from a given decision forest has also been examined in a few studies. One approach is to train a simple model using a large set of synthetic or unlabeled data that was classified by a previously trained decision forest [8], [9]. However, this approach depends on unlabeled data, which limits its use to cases where unlabeled data is available or where an unbiased procedure exists for generating a synthetic dataset. Another approach for transforming a decision forest into a single intelligible classifier is to include a post-processing step, in which a decision tree is derived from the structure of the given decision forest [10], [11]. A substantial limitation of existing post-processing methods is their high complexity, which prevents their application to large decision forests. In addition, many hyperparameters must be tuned in order to find a suitable setting for a given case.
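The surrogate-model approach described above [8], [9] can be sketched as follows with scikit-learn. This is an illustrative reconstruction, not the cited authors' exact procedure: the dataset, the uniform feature-range sampling scheme, and the depth limit are all assumptions made for the sake of a runnable example.

```python
# Sketch of the surrogate-model approach: a single decision tree is
# trained on data labeled by a previously trained decision forest.
# Dataset, sampling scheme, and hyperparameters here are illustrative.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Generate synthetic "unlabeled" points by sampling each feature
# independently within its observed range (one simple, possibly biased scheme).
rng = np.random.default_rng(0)
X_syn = rng.uniform(X.min(axis=0), X.max(axis=0), size=(5000, X.shape[1]))

# Label the synthetic data with the forest, then fit a single tree on it.
y_syn = forest.predict(X_syn)
surrogate = DecisionTreeClassifier(max_depth=5, random_state=0).fit(X_syn, y_syn)

# Fidelity: how often the surrogate agrees with the forest on the original data.
fidelity = (surrogate.predict(X) == forest.predict(X)).mean()
```

Note that the quality of such a surrogate hinges entirely on the synthetic sampling scheme, which is precisely the dependence the paper's post-processing approach avoids.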
This paper presents a scalable method for transforming a decision forest into a single decision tree. The resulting decision tree approximates the predictive performance of the original decision forest while providing intelligible and faster predictions. A decision tree was chosen as the output model because it has been shown to be interpretable both in terms of its graphical model structure and its decomposability, i.e., each node and decision path corresponds to a plain textual description [12]. As opposed to similar methods, the proposed method is suitable for forests of any size and does not require complex hyperparameter tuning. The method includes two main stages. In the first stage, we create a conjunction set that represents the original decision forest. In the second stage, we build a decision tree that organizes the conjunction set in a tree structure. The remainder of the paper is structured as follows: In Section 2 we lay out the scientific background and describe related work. In Section 3, we present the developed method. Section 4 presents an experimental evaluation and discusses its results. Section 5 concludes and suggests future research directions.
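To make the first stage concrete, the sketch below reads every root-to-leaf decision path of a scikit-learn forest off as a conjunction of feature/threshold conditions. This is only the starting point of the paper's conjunction-set construction; the merging and filtering of conjunctions across trees, which the method also performs, is omitted here, and the dataset and forest size are illustrative.

```python
# Minimal sketch of stage 1, assuming a scikit-learn forest: each leaf of
# each tree is read off as a conjunction of (feature, op, threshold) conditions.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
forest = RandomForestClassifier(n_estimators=10, random_state=0).fit(X, y)

def leaf_conjunctions(tree):
    """Return every root-to-leaf path as a list of (feature, op, threshold)."""
    t = tree.tree_
    paths = []

    def walk(node, conds):
        if t.children_left[node] == -1:  # -1 marks a leaf in sklearn's Tree
            paths.append(conds)
            return
        f, thr = t.feature[node], t.threshold[node]
        walk(t.children_left[node], conds + [(f, "<=", thr)])
        walk(t.children_right[node], conds + [(f, ">", thr)])

    walk(0, [])
    return paths

# One conjunction per leaf, pooled over all trees in the forest.
conjunctions = [c for est in forest.estimators_ for c in leaf_conjunctions(est)]
```

Pooling the leaves of all trees yields one conjunction per leaf; the second stage would then organize a (merged and filtered) set of such conjunctions into a single tree.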
Section snippets
Background
Ensemble models, and specifically decision forests, are considered the best practice in many supervised machine learning tasks, mainly due to their superior predictive performance compared to other models [1], [2], [13]. Nonetheless, simple models like decision trees might be preferred over decision forests under some circumstances [6], [14], [15]. Building an interpretable decision tree that approximates the predictive performance of a given decision forest is the subject of this work.
Forest based tree (FBT)
This section presents a method that uses a trained decision forest for generating a single decision tree in a post-processing manner. We name the new model forest based tree (FBT). The main contribution of FBT is in expanding the range of models that can be used in cases where there is a trade-off between predictive performance and prediction time or interpretability. In addition, in contrast to existing methods, this method can be applied to large decision forests without requiring complex hyperparameter tuning.
Experimental evaluation
The effectiveness of the proposed forest based tree was evaluated by carrying out an experimental study as described below. The experimental study compared different variations of the forest based tree with several benchmark classifiers by considering two evaluation criteria: predictive performance and classification complexity. The predictive performance was assessed using the multiclass extension of the ROC AUC measure, which aggregates the ROC AUC values over each pair of classes [66].
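The pairwise multiclass ROC AUC described above corresponds to one-vs-one averaging, which is available in scikit-learn via `roc_auc_score(..., multi_class="ovo")`. The dataset and classifier below are illustrative stand-ins, not those used in the paper's experiments.

```python
# One-vs-one multiclass ROC AUC: the AUC is computed for every pair of
# classes and then averaged. Dataset and classifier are illustrative only.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

clf = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)
proba = clf.predict_proba(X_te)

# Pairwise (one-vs-one) macro-averaged AUC over all class pairs.
auc_ovo = roc_auc_score(y_te, proba, multi_class="ovo")
```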
Conclusion and future work
In this paper we presented a novel method for building an intelligible decision tree based on a given decision forest. The resulting tree often approximates the predictive performance obtained by the source forest while significantly reducing its prediction complexity. The new tree also provides a decision path as an explanatory mechanism for its classifications. As opposed to existing methods that aim to achieve the same objective, the proposed method does not require the availability of unlabeled data.
CRediT authorship contribution statement
Omer Sagi: Conceptualization, Methodology, Software, Formal analysis, Investigation, Resources, Visualization, Writing - original draft, Data curation, Validation. Lior Rokach: Conceptualization, Supervision, Writing - review & editing, Validation.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
References (75)
Decision forest: twenty years of research, Inf. Fusion (2016)
Random forest in remote sensing: a review of applications and future directions, ISPRS J. Photogramm. Remote Sens. (2016)
Collective-agreement-based pruning of ensembles, Comput. Stat. Data Anal. (2009)
Knowledge discovery via multiple models, Intell. Data Anal. (1998)
Explaining machine learning models in sales predictions, Expert Syst. Appl. (2017)
Data mining with decision trees and decision rules, Future Gener. Comput. Syst. (1997)
An empirical evaluation of the comprehensibility of decision table, tree and rule based predictive models, Decis. Support Syst. (2011)
Boosted decision trees as an alternative to artificial neural networks for particle identification, Nucl. Instrum. Methods Phys. Res., Sect. A (2005)
An up-to-date comparison of state-of-the-art classification algorithms, Expert Syst. Appl. (2017)
Pindroid: a novel android malware detection system using ensemble learning methods, Comput. Secur. (2017)
Classifier ensembles: select real-world applications, Inf. Fusion
Selective ensemble of decision trees, Rough Sets, Fuzzy Sets, Data Mining, and Granular Computing
Interpretable explanations of black boxes by meaningful perturbation, Proceedings of the IEEE International Conference on Computer Vision
XGBoost: a scalable tree boosting system, Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Focused ensemble selection: a diversity-based method for greedy ensemble selection, ECAI
Comprehensible classification models: a position paper, ACM SIGKDD Explor. Newsl.
Ensemble pruning via semi-definite programming, J. Mach. Learn. Res.
Model compression, Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Seeing the forest through the trees: learning a comprehensible model from an ensemble, European Conference on Machine Learning
GENESIM: genetic extraction of a single, interpretable model, NIPS 2016, the 30th Conference on Neural Information Processing Systems
Ensemble methods in machine learning, International Workshop on Multiple Classifier Systems
Interpretable classifiers using rules and Bayesian analysis: building a better stroke prediction model, Ann. Appl. Stat.
Interpretable machine learning in healthcare, Proceedings of the 2018 ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics
A survey of methods for explaining black box models, ACM Comput. Surv.
"Why should I trust you?": Explaining the predictions of any classifier, Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Machine learning: between accuracy and interpretability, Learning, Networks and Statistics
Explainable artificial intelligence (XAI), Defense Advanced Research Projects Agency (DARPA)
Peeking inside the black-box: a survey on explainable artificial intelligence (XAI), IEEE Access
Incompatible: the GDPR in the age of big data, Seton Hall Law Rev.
European Union regulations on algorithmic decision-making and a "right to explanation", AI Mag.
A unified approach to interpreting model predictions, Advances in Neural Information Processing Systems
Prototype selection for interpretable classification, Ann. Appl. Stat.
Explaining prediction models and individual predictions with feature contributions, Knowl. Inf. Syst.
Falling rule lists, Artificial Intelligence and Statistics
Cited by (112)
Interpretable synthetic signals for explainable one-class time-series classification, Engineering Applications of Artificial Intelligence (2024)
An analysis of ensemble pruning methods under the explanation of Random Forest, Information Systems (2024)
Early detection of students' failure using Machine Learning techniques, Operations Research Perspectives (2023)
Discriminant analysis of volatile compounds in wines obtained from different managements of vineyards obtained by e-nose, Smart Agricultural Technology (2023)
Automatic Feature Engineering for Learning Compact Decision Trees, Expert Systems with Applications (2023)
Predicting adhesion strength of micropatterned surfaces using gradient boosting models and explainable artificial intelligence visualizations, Materials Today Communications (2023)