
Knowledge-Based Systems

Volume 37, January 2013, Pages 394-414

A competitive ensemble pruning approach based on cross-validation technique

https://doi.org/10.1016/j.knosys.2012.08.024

Abstract

Ensemble pruning is crucial for both the efficiency and the predictive accuracy of an ensemble system. This paper proposes a new Competitive technique for Ensemble Pruning based on Cross-Validation (CEPCV). The data to be learnt by neural computing models usually drift with time and environment, so a dynamic ensemble pruning method is indispensable for practical applications. The proposed CEPCV method is exactly such a dynamic method: it realizes on-line ensemble pruning and takes full advantage of potentially valuable information. The algorithm naturally inherits the strength of the cross-validation technique, so that the networks that win the selective competitions and enter the pruned ensemble are those with the “strongest” generalization capability. It is essentially based on the strategy of “divide and rule, collect the wisdom”, and may alleviate the local-minima problem of many conventional ensemble pruning approaches at the cost of only slightly greater computation, which is acceptable for most applications of ensemble learning. Comparative experiments on ten benchmark classification tasks among four ensemble pruning algorithms, namely CEPCV, the state-of-the-art Directed Hill Climbing Ensemble Pruning (DHCEP) algorithm, and two baseline methods, i.e. BSM, which chooses the Best Single Model in the initial ensemble based on its performance on the pruning set, and ALL, which retains all network members of the initial ensemble, demonstrate the effectiveness and validity of CEPCV.

Introduction

Ensemble learning is an important topic of interest in the pattern recognition and machine learning research communities for its desirable generalization capability [1], [2]. It refers to training a collection of base predictors for a given classification or regression task and then combining their outputs with a combinational strategy [3]. It is also termed multiple classifier systems [4], expert committee [5], decision forest [6], [7], etc. Remarkable improvements in generalization performance have been observed from ensemble learning in a broad scope of application fields, for example: face recognition [8], optical character recognition [9], scientific image analysis [10], [11], medical diagnosis [12], [13], financial time series prediction [10], military applications [14], intrusion detection [15], etc.

Typically, ensemble learning algorithms consist of two main stages: the generation of multiple predictive models and their fusion [2]. Recently, a so-called ensemble pruning stage has been considered as an additional intermediate stage, which deals with selecting the appropriate ensemble members prior to combination [16], [17], [18], [19], [20], [21], [22], [23], [24]. This stage is also termed selective ensemble, ensemble thinning or ensemble selection.

Ensemble pruning is important and necessary for two reasons: efficiency and predictive accuracy [2]. Firstly, an ensemble system of large size imposes a heavy computational burden. In certain applications, such as stream data mining, it is especially important to minimize running time, and when models are distributed over a network, a large number of constituent models also incurs substantial communication costs [2]. Secondly, predictive accuracy is equally influential. An ensemble may comprise constituent models with either high or low predictive accuracy, and members with low accuracy will degrade the overall predictive performance of the whole ensemble. Pruning these models while still maintaining a rather high diversity among the retained ones is typically considered a proper way to construct an efficient and effective ensemble system [2].

The problem of ensemble pruning has been proven to be NP-complete [25], [26]. An enumerative search for the best subset of classifiers is not feasible for ensembles that contain a large number of constituent models. Greedy algorithms, in contrast, are fast, since they consider only a very small subspace of all possible combinations [16], [17], [18], [21], [27], but this characteristic may result in suboptimal solutions of the ensemble pruning problem [25]. A compact review of related work on ensemble pruning is given in Section 2.2 of this paper.
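
To make the contrast concrete, the following is a minimal sketch of greedy forward-selection pruning, the general family to which directed hill-climbing methods belong; the function name and the majority-vote accuracy criterion on a pruning set are illustrative assumptions, not the exact procedure of any particular cited method.

```python
import numpy as np

def greedy_forward_pruning(member_preds, y_prune, target_size):
    """Greedily grow a subensemble: at each step add the member whose
    inclusion maximizes majority-vote accuracy on the pruning set.

    member_preds: (n_members, n_samples) array of integer class predictions.
    y_prune:      (n_samples,) true labels of the pruning set.
    target_size:  maximum number of members to select.
    """
    def vote_accuracy(indices):
        # Majority vote over the chosen members (assumes non-negative
        # integer class labels); ties are broken towards the lowest label.
        votes = member_preds[indices]
        fused = np.array([np.bincount(col).argmax() for col in votes.T])
        return np.mean(fused == y_prune)

    selected, remaining = [], set(range(member_preds.shape[0]))
    while remaining and len(selected) < target_size:
        # Greedy step: only |remaining| candidate subensembles are examined,
        # a tiny fraction of all 2^n_members possible combinations.
        best = max(remaining, key=lambda i: vote_accuracy(selected + [i]))
        selected.append(best)
        remaining.remove(best)
    return selected
```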

This work, however, studies the problem of ensemble pruning from the perspective of competitive learning. For this purpose, the n-Bits Binary Coding ICBP Ensemble System (nBBC-ICBP-ES) proposed in our previous work [28] is employed as the basic ensemble; a brief introduction to nBBC-ICBP-ES is given in Section 5.2. The choice of nBBC-ICBP-ES as the initial ensemble system is natural and intuitive: it was successfully implemented in our previous work, it is simple yet efficient and effective, and its effectiveness has been verified through experiments on several benchmark classification tasks. Moreover, it is anticipated that the investigation of the CEPCV algorithm will further improve the classification performance and generalization capability of the initial nBBC-ICBP-ES, so that a desirable selective ensemble, an original neural network system resulting entirely from our own research, can be achieved.

After the basic nBBC-ICBP-ES has been generated, the proposed Competitive Ensemble Pruning algorithm based on Cross-Validation (CEPCV) is invoked for ensemble pruning, wherein the final pruned ensemble is constructed dynamically with the help of the cross-validation technique. Explicitly, for the specific test instance t under consideration, we calculate its squared Euclidean distance to every validation sample vi. All validation samples are then sorted by these squared distances, and the first VSn instances in the sorted validation set, i.e. the VSn nearest neighbors of the test sample t in the validation set, are picked out to form the dynamic validation subset associated with t. Each constituent ICBP model in the basic nBBC-ICBP-ES then classifies these VSn dynamic validation instances. Those ICBP components which correctly classify at least τ of them are declared winners of the competition and selected into the dynamically pruned NNE associated with t. Finally, the classification decision for the test sample t is made by the dynamically pruned NNE using majority voting.
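
For illustration, the following is a minimal sketch of this per-instance pruning step, assuming generic scikit-learn-style classifiers with a predict() method in place of the ICBP components; the fallback to the full ensemble when no member passes the threshold is an added assumption, and integer class labels are assumed for the voting step.

```python
import numpy as np

def cepcv_predict(t, ensemble, X_val, y_val, VS_n, tau):
    """Classify one test instance t with a dynamically pruned ensemble.

    t:            (n_features,) test instance.
    ensemble:     list of trained base classifiers (stand-ins for the ICBP nets).
    X_val, y_val: validation set used for the per-instance competition.
    VS_n:         number of nearest validation neighbours in the dynamic subset.
    tau:          minimum number of correctly classified neighbours to win.
    """
    # 1. Squared Euclidean distance from t to every validation sample.
    dists = np.sum((X_val - t) ** 2, axis=1)

    # 2. The VS_n nearest validation samples form the dynamic validation subset.
    nearest = np.argsort(dists)[:VS_n]
    X_dyn, y_dyn = X_val[nearest], y_val[nearest]

    # 3. Competition: members that classify at least tau of them correctly win.
    winners = [clf for clf in ensemble
               if np.sum(clf.predict(X_dyn) == y_dyn) >= tau]
    if not winners:
        # Assumption for illustration: fall back to the full ensemble
        # if no member passes the threshold.
        winners = ensemble

    # 4. Majority vote of the dynamically pruned ensemble on t
    #    (assumes non-negative integer class labels).
    votes = np.array([clf.predict(t.reshape(1, -1))[0] for clf in winners])
    return np.bincount(votes).argmax()
```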

Our motivations for developing the CEPCV algorithm are as follows. First of all, the data to be learnt by neural computing models usually drift and change with time and environment [2]. However, most typical ensemble pruning strategies assume that the component models selected to form the pruned ensemble are fixed once decided; they cannot realize ensemble pruning flexibly, and this defect inevitably neglects valuable heuristic information in the data. In contrast, the proposed CEPCV method performs ensemble pruning dynamically: its pruning decisions are alterable and sensitive to each test sample being processed. This characteristic constitutes a remarkable novelty of the CEPCV algorithm and clearly distinguishes it from other typical ensemble pruning methods, i.e. it carries out the pruning operation during the test procedure of the ensemble system, resulting in a final pruned ensemble with significantly higher accuracy and reliability.

Secondly, the CEPCV algorithm naturally inherits the competence of the cross-validation technique. Cross-validation is a standard tool in statistics which provides an appealing guiding principle for choosing, within a set of candidate model structures, the “best” one according to a certain criterion [29]. The hope in CEPCV is that the networks regarded as winners and selected into the pruned ensemble have the “best” generalization capability.
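
As a reminder of how this guiding principle is typically applied, the following is a minimal sketch of k-fold cross-validation used to choose between candidate model structures; the scikit-learn estimators and hidden-layer sizes are chosen purely for illustration.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier

def select_by_cross_validation(candidates, X, y, k=5):
    """Return the candidate with the highest mean k-fold CV accuracy,
    together with all candidates' scores."""
    scores = [cross_val_score(clf, X, y, cv=k).mean() for clf in candidates]
    return candidates[int(np.argmax(scores))], scores

# Example: choose between two hidden-layer sizes for a small network
# (the candidate structures here are purely illustrative).
candidates = [MLPClassifier(hidden_layer_sizes=(h,), max_iter=500)
              for h in (5, 20)]
```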

Thirdly, the CEPCV algorithm improves the overall predictive performance of the selected models while maintaining high diversity among them. Fundamentally, the thinking behind the CEPCV algorithm is the divide-and-conquer strategy [30], which is an important research strategy in ensemble learning, and the particular strategy of CEPCV itself can be summarized as “rout the enemy forces one by one”. It may alleviate the local-minimum problem of many traditional ensemble pruning approaches at the cost of slightly greater computation, which is generally acceptable for most applications.

The notion of diversity here is used in a rather broad sense: the specific selective subensembles associated with different test instances t differ from one another, so in this sense the selected subensembles maintain a high diversity among each other. Kuncheva and Whitaker have studied several statistics that measure diversity among binary classifier outputs [31]. However, most of these diversity measures are not applicable to dynamically pruned ensembles such as those produced by the CEPCV algorithm. Investigating diversity measures applicable to the scenario of dynamic ensemble pruning is therefore left as future work.
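
As one illustration of the kind of statistic studied in [31], the following is a minimal sketch of the pairwise disagreement measure computed from binary correctness indicators; this particular measure and the function names are chosen only as an example.

```python
import numpy as np

def disagreement(correct_i, correct_j):
    """Disagreement measure for a pair of classifiers: the fraction of samples
    on which exactly one of the two is correct.

    correct_i, correct_j: binary arrays, 1 where the classifier is correct.
    """
    return float(np.mean(correct_i != correct_j))

def mean_pairwise_disagreement(oracle):
    """Average disagreement over all classifier pairs.

    oracle: (n_classifiers, n_samples) binary matrix of correctness indicators.
    """
    n = oracle.shape[0]
    pair_values = [disagreement(oracle[i], oracle[j])
                   for i in range(n) for j in range(i + 1, n)]
    return float(np.mean(pair_values))
```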

The remainder of this paper is organized as follows: Section 2 presents the method of ensemble pruning, including a theoretical analysis and a compact review of related work. Section 3 briefly reviews the cross-validation technique for neural network optimization. Section 4 presents the proposed Competitive neural network Ensemble Pruning algorithm based on Cross-Validation (CEPCV). Section 5 reports the experimental study, and the conclusions drawn from these results are given in Section 6.

Section snippets

Theoretical analysis on ensemble pruning

About 22 years ago, Hansen and Salamon proposed the Neural Network Ensemble (NNE) [29]. They claimed that ensembling a group of neural networks can significantly improve the generalization capability of each individual component network. This technology has recently become a very hot topic in both the neural networks and machine learning communities for its remarkably desirable performance. However, it should be noticed that the law of “the more, the better” is not always true for all occasions…

The technique of cross-validation for neural network optimization

As stated by Hansen and Salamon in their remarkable work [29], choosing a neural network architecture is equivalent to choosing a parameterization of the input–output relation within a data set, while cross-validation is a standard statistical technique for deciding between alternative parameterizations of a data set. The original training process of a neural network involves training on the complete dataset by minimizing the accumulated classification…

Detailed descriptions about CEPCV algorithm

In our proposed Competitive Ensemble Pruning algorithm based on Cross-Validation (CEPCV), the final pruned ensemble is constructed dynamically during the course of testing using the cross-validation technique. Specifically, during the testing phase, i.e. after all component networks of the ensemble have completed their training, when a test sample t is provided to the ensemble, the squared Euclidean distance from each validation sample vi to t is calculated. Then, rank…

Experimental datasets

In order to compare the classification performance of the proposed CEPCV algorithm with that of other state-of-the-art algorithms and baseline methods, and to validate the effectiveness of CEPCV, 10 groups of experiments on benchmark classification datasets are carried out (viz. the Car Evaluation dataset, Page Blocks dataset, Gene Detection dataset, Date Calculation dataset, Pen-Based Recognition of Handwritten Digits dataset, Mushroom Discrimination dataset, Thyroid dataset, Image…

Conclusions

Ensemble pruning is vital for reasons of both efficiency and predictive accuracy, and abundant work has been dedicated to this research field. However, the problem of ensemble pruning has been proven to be NP-complete. This work proposes a Competitive Ensemble Pruning algorithm based on Cross-Validation (CEPCV), wherein the final pruned ensemble is constructed employing the cross-validation technique in an online…

Acknowledgements

This work is supported by the National Natural Science Foundation of China under Grant No. 61100108, by the Ministry of Education NUAA research project under Grant No. NS2011015, and by a project funded by the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD).

References (41)

  • Q. Dai et al., The build of n-bits binary coding ICBP ensemble system, Neurocomputing (2011).
  • Z.-H. Zhou et al., Ensembling neural networks: many could be better than all, Artificial Intelligence (2002).
  • Y. Hong et al., Unsupervised data pruning for clustering of noisy data, Knowledge-Based Systems (2008).
  • T.G. Dietterich, Ensemble methods in machine learning, in: Presented at Proceedings of the 1st International Workshop...
  • I. Partalas et al., An ensemble uncertainty aware measure for directed hill climbing ensemble pruning, Machine Learning (2010).
  • T.K. Ho, The random subspace method for constructing decision forests, IEEE Transactions on Pattern Analysis and Machine Intelligence (1998).
  • Q.H. Hu et al., Constructing rough decision forests, Lecture Notes in Artificial Intelligence (2005).
  • F.J. Huang, Z.-H. Zhou, H.-J. Zhang, T.H. Chen, Pose invariant face recognition, in: Presented at 4th IEEE...
  • L.K. Hansen, L. Liisberg, P. Salamon, Ensemble methods for handwritten digit recognition, in: Presented at IEEE...
  • Y. Zhao, J. Gao, X. Yang, A survey of neural network ensemble, in: Presented at International Conference on Neural...