A competitive ensemble pruning approach based on cross-validation technique
Introduction
Ensemble learning is an important topic of interest in the pattern recognition and machine learning research communities for its desirable generalization capability [1], [2]. It refers to training a collection of base predictors for a given classification or regression task and then combining their outputs with a combination strategy [3]. It is also termed multiple classifier systems [4], expert committee [5], decision forest [6], [7], etc. Remarkable improvements in generalization performance have been observed from ensemble learning in a broad range of application fields, for example: face recognition [8], optical character recognition [9], scientific image analysis [10], [11], medical diagnosis [12], [13], financial time series prediction [10], military applications [14], intrusion detection [15], etc.
Typically, ensemble learning algorithms consist of two main stages: the generation of multiple predictive models and their fusion [2]. Recently, a so-called ensemble pruning stage has been considered as an additional intermediate stage, which deals with the selection of the appropriate ensemble members prior to combination [16], [17], [18], [19], [20], [21], [22], [23], [24]. It is also termed selective ensemble, ensemble thinning or ensemble selection.
Ensemble pruning is important and necessary for two reasons: efficiency and predictive accuracy [2]. Firstly, an ensemble of large size imposes a heavy computational burden. In certain applications, such as stream data mining, it is especially important to minimize running time. Moreover, when models are distributed over a network, a large number of constituent models leads to another serious problem, namely high communication costs [2]. Secondly, predictive accuracy is equally influential. An ensemble may comprise constituent models with either high or low predictive accuracy, and members with low predictive accuracy degrade the overall predictive performance of the whole ensemble. Pruning these models while still maintaining a rather high diversity among the reserved ones is typically considered a proper method for constructing an efficient and effective ensemble system [2].
The problem of ensemble pruning has been proven to be NP-complete [25], [26]. An enumerative search for the best subset of classifiers is impractical for ensembles that contain a large number of constituent models. Greedy algorithms, in contrast, are fast, since they explore only a very small subspace of all possible combinations [16], [17], [18], [21], [27], but this characteristic may yield suboptimal solutions to the ensemble pruning problem [25]. A compact review of related work on ensemble pruning is given in Section 2.2 of this paper.
This work, however, studies the problem of ensemble pruning from the perspective of competitive learning. For this purpose, the n-Bits Binary Coding ICBP Ensemble System (nBBC-ICBP-ES) proposed in our previous work [28] is employed as the basic ensemble; a brief introduction to nBBC-ICBP-ES is given in Section 5.2. The reason why nBBC-ICBP-ES is adopted as the initial ensemble system is natural and intuitive: it was successfully implemented in our previous work, it is simple yet efficient and effective, and its effectiveness has been verified through experiments on several benchmark classification tasks. It is anticipated that the investigation of the CEPCV algorithm will further improve the classification performance and generalization capability of the initial nBBC-ICBP-ES, so that a desirable selective ensemble can be achieved, namely an original neural network system resulting entirely from our own research.
After the basic nBBC-ICBP-ES has been generated, the proposed Competitive Ensemble Pruning algorithm based on Cross-Validation (CEPCV) is started for the purpose of ensemble pruning, wherein the final pruned ensemble is dynamically constructed with the help of the cross-validation technique. Explicitly, for the specific test instance t under consideration, we calculate its squared Euclidean distance from every validation sample vi. All the validation samples are then sorted according to their calculated squared distances from t. The first VSn validation instances in the sorted array, i.e. the VSn nearest neighbors of test sample t in the validation set, are picked out to form the dynamic validation subset associated with t. Each constituent ICBP model in the basic nBBC-ICBP-ES is then employed to classify the selected VSn dynamic validation instances. Those ICBP components which correctly classify at least τ dynamic validation instances are declared the winners of the competition and selected into the dynamically pruned NNE associated with test instance t. Finally, the classification decision for test sample t is made by the dynamically pruned NNE using majority voting.
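The per-instance pruning procedure described above can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: the function name `cepcv_predict`, the parameter names `vs_n` and `tau`, the `model.predict` interface, and the fallback to the full ensemble when no model wins are all assumptions made for the example.

```python
import numpy as np

def cepcv_predict(test_x, models, val_X, val_y, vs_n, tau):
    """Classify one test instance with a dynamically pruned ensemble."""
    # 1. Rank validation samples by squared Euclidean distance to test_x
    #    and keep the vs_n nearest as the dynamic validation subset.
    dists = np.sum((val_X - test_x) ** 2, axis=1)
    nearest = np.argsort(dists)[:vs_n]

    # 2. Competition: a model wins if it correctly classifies at least
    #    tau of the vs_n dynamic validation instances.
    winners = []
    for m in models:
        correct = sum(m.predict(val_X[i]) == val_y[i] for i in nearest)
        if correct >= tau:
            winners.append(m)
    if not winners:  # assumed fallback: keep the full ensemble
        winners = list(models)

    # 3. Majority voting over the dynamically pruned ensemble.
    votes = [m.predict(test_x) for m in winners]
    return max(set(votes), key=votes.count)
```

Because the nearest-neighbor step is repeated per test instance, the pruned ensemble changes from one test sample to the next, which is exactly the dynamic behavior the algorithm relies on.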
Our motivations for developing the CEPCV algorithm are as follows. First of all, the data to be learnt by neural computing models usually drift and change with time and environment [2]. However, most typical ensemble pruning strategies assume that the component models selected to comprise the pruned ensemble are fixed once decided; they are incapable of performing ensemble pruning flexibly. This defect inevitably leads to the neglect of valuable heuristic information in the data. In contrast, the proposed CEPCV method performs ensemble pruning dynamically: its pruning decisions are alterable and sensitive to each different test sample under processing. This characteristic constitutes a remarkable novelty of the CEPCV algorithm and makes it evidently different from other typical ensemble pruning methods, i.e. it carries out the pruning operation simultaneously with the test procedure of the ensemble system, resulting in a final pruned ensemble with significantly higher accuracy and reliability.
Secondly, the CEPCV algorithm naturally inherits the competence of the cross-validation technique. Cross-validation is a standard tool in statistics which provides an appealing guiding principle for choosing, within a set of candidate model structures, the “best” one according to a certain criterion [29]. The hope in CEPCV is that the networks regarded as winners and selected into the pruned ensemble have the “best” generalization capability.
Thirdly, the CEPCV algorithm boosts the overall predictive performance of the selected models while maintaining a high diversity among them. Fundamentally, the idea behind the CEPCV algorithm is the divide-and-conquer strategy [30], a significant research strategy in ensemble learning. The unique strategy of the CEPCV algorithm itself can be described as “rout the enemy forces one by one”. It may alleviate the local minimum problem of many traditional ensemble pruning approaches at a slightly greater computational cost, which is generally acceptable for the requirements of most applications.
The notion of diversity here is a rather broad one: it means that the specific selected subensembles, each associated with a different test instance t, are diversified among each other. In this sense, the selected subensembles can be considered to maintain a high diversity among each other. Kuncheva and Whitaker have studied several statistics which measure diversity among binary classifier outputs [31]. However, most of these diversity measures are not applicable to dynamically pruned ensembles, such as those produced by the CEPCV algorithm. It is therefore left to future work to investigate diversity measures that apply to the scenario of dynamic ensemble pruning.
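For concreteness, two of the pairwise measures surveyed by Kuncheva and Whitaker [31] can be computed from the correctness patterns of a classifier pair. The sketch below is illustrative only: the function name and the boolean-array interface are assumptions, and it covers just the Q-statistic and the disagreement measure, not the full set of statistics in [31].

```python
import numpy as np

def pairwise_diversity(correct_a, correct_b):
    """Q-statistic and disagreement measure for two classifiers, given
    boolean arrays marking which samples each classifies correctly."""
    a = np.asarray(correct_a, dtype=bool)
    b = np.asarray(correct_b, dtype=bool)
    n11 = np.sum(a & b)    # both correct
    n00 = np.sum(~a & ~b)  # both wrong
    n10 = np.sum(a & ~b)   # only the first correct
    n01 = np.sum(~a & b)   # only the second correct
    # Q in [-1, 1]: negative values indicate diverse (negatively
    # correlated) errors; assumes the denominator is nonzero.
    q = (n11 * n00 - n01 * n10) / (n11 * n00 + n01 * n10)
    # Disagreement: fraction of samples on which the pair disagrees.
    disagreement = (n01 + n10) / len(a)
    return float(q), float(disagreement)
```

Such measures presuppose a fixed pair of classifiers evaluated on a common sample set, which is precisely why they do not transfer directly to subensembles whose membership changes per test instance.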
The remainder of this paper is structured as follows. Section 2 presents the method of ensemble pruning, including a theoretical analysis and a compact review of related work. Section 3 briefly reviews the cross-validation technique for neural network optimization. Section 4 presents the proposed Competitive Ensemble Pruning algorithm based on Cross-Validation (CEPCV). Section 5 reports the results of the experimental study, from which the final conclusions are drawn in Section 6.
Section snippets
Theoretical analysis on ensemble pruning
It was about 22 years ago that Hansen and Salamon proposed the Neural Network Ensemble (NNE) [29]. They claimed that ensembling a group of neural networks can significantly improve the generalization capability of each individual component network. This technology has recently become a very hot topic in both the neural networks and machine learning communities for its remarkably desirable performance. However, it should be noticed that the law of “the more, the better” is not always true for all occasions
The technique of cross-validation for neural network optimization
As stated by Hansen and Salamon in their remarkable work [29], choosing a neural network architecture is equivalent to choosing a parameterization for the input–output relation within a data set, while cross-validation is a standard statistical technique for deciding between alternative choices of parameterization for a data set. The original training process of a neural network involves training on the complete dataset by minimizing the accumulated classification
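The general principle of deciding between alternative parameterizations by cross-validation can be illustrated with a small k-fold example. This is a generic sketch under stated assumptions, not the paper's procedure: it uses polynomial degree as a stand-in for "choice of parameterization", and the function name `kfold_cv_mse` and the candidate degrees are invented for illustration.

```python
import numpy as np

def kfold_cv_mse(x, y, degree, k=5):
    """Cross-validated mean squared error of a degree-`degree`
    polynomial fit: average validation error over k held-out folds."""
    idx = np.arange(len(x))
    folds = np.array_split(idx, k)
    errors = []
    for i in range(k):
        val = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        coeffs = np.polyfit(x[train], y[train], degree)  # fit on k-1 folds
        pred = np.polyval(coeffs, x[val])                # score on held-out fold
        errors.append(np.mean((pred - y[val]) ** 2))
    return float(np.mean(errors))

# Pick, among candidate parameterizations, the one with the lowest
# cross-validated error (the "best" by this criterion).
rng = np.random.default_rng(0)
x = np.linspace(-1, 1, 60)
y = 2 * x + 0.1 * rng.standard_normal(60)  # underlying relation is linear
best = min([1, 3, 7], key=lambda d: kfold_cv_mse(x, y, d))
```

CEPCV applies the same held-out-validation idea, but at the level of selecting trained networks into the pruned ensemble rather than selecting a model structure before training.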
Detailed descriptions about CEPCV algorithm
In our proposed Competitive Ensemble Pruning algorithm based on Cross-Validation (CEPCV), the final pruned ensemble is dynamically constructed during the course of testing using the cross-validation technique. Specifically, during the testing phase, namely after all the component networks in an ensemble have completed their training, when a test sample t is provided to the ensemble, we calculate, for each validation sample vi, its squared Euclidean distance from this test sample t. Then, rank
Experimental datasets
In order to compare the classification performance of the proposed CEPCV algorithm with that of some other state-of-the-art algorithms and baseline methods, and validate the effectiveness of CEPCV algorithm, 10 groups of experiments on benchmark classification datasets are carried out (viz. Car Evaluation dataset, Page Blocks dataset, Gene Detection dataset, Date Calculation dataset, Pen-Based Recognition of Handwritten Digits dataset, Mushroom Discrimination dataset, Thyroid dataset, Image
Conclusions
Ensemble pruning is vital for the reasons of both efficiency and predictive accuracy, and abundant work has been dedicated to this research field. However, the problem of ensemble pruning has been proven to be NP-complete. This work proposes a so-called Competitive Ensemble Pruning algorithm based on Cross-Validation (CEPCV) for the purpose of ensemble pruning, wherein the final pruned ensemble is constructed employing the cross-validation technique in an online
Acknowledgements
This work is supported by the National Natural Science Foundation of China under the Grant No. 61100108, and Ministry of Education NUAA research project under the Grant No. NS2011015, and a project funded by the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD).
References (41)
- et al., EROS: ensemble rough subspaces, Pattern Recognition (2007)
- et al., Multiple classifier combination for face-based identity verification, Pattern Recognition (2004)
- et al., Using diversity of errors for selecting members of a committee classifier, Pattern Recognition (2006)
- et al., Stability problems with artificial neural networks and the ensemble solution, Artificial Intelligence in Medicine (2000)
- et al., Lung cancer cell identification based on artificial neural network ensembles, Artificial Intelligence in Medicine (2002)
- et al., An efficient fuzzy weighted average algorithm for the military UAV selecting under group decision-making, Knowledge-Based Systems (2011)
- et al., An ensemble design of intrusion detection system for handling uncertainty using Neutrosophic Logic Classifier, Knowledge-Based Systems (2012)
- et al., Ensemble diversity measures and their application to thinning, Information Fusion (2005)
- et al., Pruning an ensemble of classifiers via reinforcement learning, Neurocomputing (2009)
- et al., Using boosting to prune bagging ensembles, Pattern Recognition Letters (2007)
- The build of n-bits binary coding ICBP ensemble system, Neurocomputing
- Ensembling neural networks: many could be better than all, Artificial Intelligence
- Unsupervised data pruning for clustering of noisy data, Knowledge-Based Systems
- An ensemble uncertainty aware measure for directed hill climbing ensemble pruning, Machine Learning
- The random subspace method for constructing decision forests, IEEE Transactions on Pattern Analysis and Machine Intelligence
- Constructing rough decision forests, Lecture Notes in Artificial Intelligence