Elsevier

Information Sciences

Volume 288, 20 December 2014, Pages 153-173

Evolutionary induction of global model trees with specialized operators and memetic extensions

https://doi.org/10.1016/j.ins.2014.07.051

Abstract

Metaheuristics such as evolutionary algorithms (EAs) have been successfully applied to the problem of decision tree induction. Recently, an EA was proposed to evolve model trees, a particular type of decision tree employed to solve regression problems. However, EAs need to be specialized in order to exploit the full potential of evolutionary induction. The main contribution of this paper is a set of solutions and techniques that incorporates knowledge about the global model tree induction problem into the evolutionary search. The objective of this paper is to demonstrate that a specialized EA can find solutions that are more accurate and less complex than those produced by traditional greedy-induced counterparts and by a straightforward application of EAs.

This paper proposes a novel solution for each step of the evolutionary process and presents a new specialized EA for model tree induction, called the Global Model Tree (GMT). An empirical investigation shows that trees induced by the GMT are an order of magnitude less complex than trees induced by popular greedy algorithms, while remaining equivalent in predictive accuracy to output models from straightforward implementations of evolutionary induction and from state-of-the-art methods.

Introduction

The most common predictive tasks in data mining [17] are classification and regression. Decision trees [29], [36] are one of the most popular prediction techniques. The success of tree-based approaches can be explained by their ease of application, speed of operation, and effectiveness. Furthermore, the hierarchical tree structure, where appropriate tests from consecutive nodes are sequentially applied, closely resembles a human method of decision making, which makes decision trees natural and easy to understand even for inexperienced analysts. Regression and model trees [22] are variants of decision trees, and they have been designed to approximate real-valued functions instead of being used for classification tasks. The main difference between a regression tree and a model tree is that, in the latter, a constant value in the terminal node is replaced by a regression plane.
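The leaf-level difference between the two tree variants can be illustrated with a minimal sketch in plain Python over toy data: a regression-tree leaf predicts a single constant (the mean target of the examples reaching it), while a model-tree leaf predicts with a regression line fitted by least squares (here a simple one-attribute line; a full regression plane generalizes this to several attributes).

```python
# Toy data reaching one leaf: one explanatory attribute x and a target y.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.1, 3.9, 6.2, 8.1]
n = len(x)

# Regression-tree leaf: a single constant, the mean of the targets.
constant_leaf = sum(y) / n

# Model-tree leaf: a line y ~ a*x + b fitted by ordinary least squares
# (closed-form simple linear regression).
mx, my = sum(x) / n, sum(y) / n
a = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum((xi - mx) ** 2 for xi in x)
b = my - a * mx

def regression_leaf_predict(_x_new):
    return constant_leaf          # same prediction for every example in the leaf

def model_leaf_predict(x_new):
    return a * x_new + b          # prediction varies with the attribute value
```

On this toy leaf the fitted line tracks the targets closely, whereas the constant leaf must settle for the mean; this is exactly why model trees can approximate real-valued functions with far fewer leaves.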

Inducing an optimal model tree, as with the problem of learning an optimal decision tree, is known to be NP-complete [24]. Consequently, practical decision-tree learning algorithms are based on heuristics such as greedy algorithms, where locally optimal decisions are made at each tree node. Such algorithms cannot guarantee returning the globally optimal decision tree. The purpose of this paper is to illustrate the application of a specialized evolutionary algorithm (EA) [27] to the problem of model tree induction. The objectives are to show that evolutionary induction may find globally optimal solutions that are more accurate and less complex than those of traditional greedy-induced counterparts and of a straightforward application of EAs. This research shows the impact of specialized EAs on the tree structure, the tests in internal nodes, and the models in the leaves. By incorporating knowledge about global model tree induction, the full potential of EAs is exploited. Local optimizations are also incorporated into the EA search, an approach known as a memetic algorithm [28], [7].

Our previous research showed that global inducers are capable of efficiently evolving accurate and compact univariate regression trees [25], called Global Regression Trees (GRT), and model trees with simple linear regression in the leaves [8], [10]. In our previous papers, we proposed model trees with multiple linear regression in the leaves [9] and considered how memetic extensions improve the global induction of regression and model trees [11]. This paper reviews and significantly extends our previous work on model trees in almost every step of evolutionary induction. We introduce new specialized operators and local search components that improve pure evolutionary methods, and we propose a smoothing process to increase the prediction accuracy of the model tree. A new multi-objective optimization strategy (lexicographic analysis) is verified as an alternative fitness function to a weight formula. Additional data sets and new experiments illustrate the advantage of the global search solutions over popular model tree algorithms.
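The contrast between the two fitness strategies mentioned above can be sketched as follows. Candidates are summarized as (prediction error, tree size); the tolerance `TOL` and weight `W` are illustrative values chosen for this sketch, not the settings used in the paper.

```python
# Two hypothetical candidate trees: a slightly more accurate but much
# larger tree (cand_b) versus a smaller, nearly-as-accurate one (cand_a).
cand_a = {"error": 0.101, "size": 10}
cand_b = {"error": 0.100, "size": 40}

W = 1e-5  # assumed complexity weight in the weighted formula

def weighted_fitness(c):
    # Weighted formula: scalarize both objectives into one score
    # (lower is better); the weight fixes the error/size trade-off.
    return c["error"] + W * c["size"]

TOL = 0.005  # assumed tolerance on the primary objective

def lexicographic_better(c1, c2):
    # Lexicographic analysis: compare the primary objective (error) first;
    # fall back to the secondary objective (size) only when the errors
    # are indistinguishable within the tolerance.
    if abs(c1["error"] - c2["error"]) > TOL:
        return c1["error"] < c2["error"]
    return c1["size"] < c2["size"]
```

With these illustrative settings the two strategies disagree: the weighted score prefers the large, marginally more accurate tree, while the lexicographic comparison treats the 0.001 error gap as noise and prefers the compact tree, which is the behavior that favors interpretable models.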

This paper is organized as follows. The following section provides a brief background on model trees, reviews related work, and describes some of the advantages of using EAs for model tree induction. Section 3 describes the approach and demonstrates how each step of the EA can be improved. Section 4 presents a validation of the proposed solutions in three sets of experiments. In the last section, the paper is concluded and possible future work is sketched.


Global vs local induction

Decision trees are often built through a process known as recursive partitioning. The most popular tree-induction method is the top-down approach [35]. It starts from the root node, where the locally optimal split (test) is searched for according to a given optimality measure (e.g., Gini, Twoing, or the entropy rule for classification trees, and the least squared or least absolute deviation error criterion for regression trees). Next, the training data is redirected to newly created
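For regression trees under the least-squares criterion, the locally optimal split search in a single node can be sketched as follows; this is a toy single-attribute version for illustration, not the authors' implementation. Every midpoint between consecutive attribute values is tried, and the threshold minimizing the combined sum of squared errors of the two resulting subsets is kept.

```python
def sse(values):
    # Sum of squared errors around the mean: the least-squares impurity
    # of a candidate subset in a regression tree.
    if not values:
        return 0.0
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)

def best_split(x, y):
    # Greedy, locally optimal split search on one attribute: try each
    # midpoint between consecutive sorted x values and return the
    # (threshold, cost) pair minimizing left-SSE + right-SSE.
    pairs = sorted(zip(x, y))
    best_thr, best_cost = None, float("inf")
    for i in range(1, len(pairs)):
        thr = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [yy for xx, yy in pairs if xx <= thr]
        right = [yy for xx, yy in pairs if xx > thr]
        cost = sse(left) + sse(right)
        if cost < best_cost:
            best_thr, best_cost = thr, cost
    return best_thr, best_cost
```

The key point is that the decision is made node by node: each call to `best_split` sees only the data that reached the current node, which is why such greedy inducers cannot guarantee a globally optimal tree.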

Evolutionary induction of the global model tree

In this section, we propose a solution called the Global Model Tree (GMT), an evolutionary approach for the global induction of model trees. The general structure of the GMT follows a typical framework for an evolutionary algorithm with an unstructured population and generational selection. Each step of the GMT is discussed separately: representation, initialization, fitness function, selection and terminal condition, genetic operators, and smoothing. In each step,
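The generational framework named above can be sketched generically. In this sketch the individuals are toy bit-strings rather than model trees, and the operators are simple placeholders; the GMT's actual representation and specialized tree operators are what the following subsections develop.

```python
import random

random.seed(0)  # deterministic toy run

def fitness(ind):
    # Toy objective: maximize the number of ones in the bit-string.
    # In the GMT, fitness instead balances prediction error and complexity.
    return sum(ind)

def tournament(pop, k=2):
    # Selection: pick the fitter of k randomly sampled individuals.
    return max(random.sample(pop, k), key=fitness)

def crossover(p1, p2):
    # One-point crossover: exchange tails at a random cut point.
    cut = random.randrange(1, len(p1))
    return p1[:cut] + p2[cut:]

def mutate(ind, rate=0.05):
    # Bit-flip mutation applied gene by gene.
    return [1 - g if random.random() < rate else g for g in ind]

# Initialization: an unstructured population of random individuals.
pop = [[random.randint(0, 1) for _ in range(20)] for _ in range(30)]

# Generational loop; a fixed generation count serves as the terminal condition.
for _ in range(50):
    pop = [mutate(crossover(tournament(pop), tournament(pop))) for _ in pop]

best = max(pop, key=fitness)
```

The framework is deliberately modular: swapping the bit-string representation, operators, and fitness for tree-structured counterparts yields the GMT skeleton without changing the loop itself.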

Experimental validation

In this section, three sets of experiments are presented. First, we share some details of the GMT evaluation. Next, we validate the overall performance of the GMT solution with respect to predictive accuracy, build time, and tree and model size. The results are compared with popular greedy counterparts on a number of large datasets. Finally, we compare the GMT with its baseline, denoted bGMT (a straightforward application of EA), and the solution called E-Motion [2], which also

Conclusion

Greedy regression and model tree inducers are fast, white-box solutions that usually have a slightly lower prediction accuracy than complex or ensemble-learning techniques. However, when applied to large datasets, they often lose their important advantage – simplicity – and generate trees with hundreds or even thousands of leaves, each containing regression models with dozens of explanatory attributes. Such large trees are almost impossible to understand and interpret, and

Acknowledgments

The authors thank Bernhard Pfahringer, who provided us with preprocessed datasets. This project was funded by the Polish National Science Center and allocated on the basis of decision 2013/09/N/ST6/04083.

References (44)

  • M. Czajkowski, M. Kretowski, An evolutionary algorithm for global induction of regression trees with multivariate...
  • M. Czajkowski et al., An evolutionary algorithm for global induction of regression and model trees, Int. J. Data Min., Model. Manage. (2013)
  • M. Czajkowski, M. Kretowski, Does memetic approach improve global induction of regression and model trees? in:...
  • J. Demsar, Statistical comparisons of classifiers over multiple data sets, J. Mach. Learn. Res. (2006)
  • A. Dobra, J. Gehrke, SECRET: a scalable linear regression tree algorithm, in: Proceedings of KDD'02,...
  • A.E. Eiben et al., Parameter control in evolutionary algorithms, IEEE Trans. Evolution. Comput. (1999)
  • F. Esposito et al., A comparative analysis of methods for pruning decision trees, IEEE Trans. Patt. Anal. Mach. Intell. (1997)
  • G. Fan et al., Regression tree analysis using TARGET, J. Comput. Graph. Statist. (2005)
  • A. Freitas, A critical review of multi-objective optimization in data mining: a position paper, SIGKDD Explor. Newsl. (2004)
  • J.H. Friedman, Stochastic gradient boosting, Comput. Statist. Data Anal. (1999)
  • P. Gagne et al., Best regression model using information criteria, J. Mod. Appl. Statist. Meth. (2002)