Abstract
The Greedy Ensemble Pruning (GEP) algorithm, also known as the Directed Hill Climbing Ensemble Pruning (DHCEP) algorithm, offers good performance at high speed. However, because it explores only a relatively small subspace of the whole solution space, it often produces suboptimal solutions to the ensemble pruning problem. To address this drawback, we propose a novel Randomized GEP (RandomGEP) algorithm, also called the Randomized DHCEP (RandomDHCEP) algorithm, which uses a randomization technique to effectively enlarge the search space of classical DHCEP while maintaining the same time complexity. Randomizing the classical DHCEP algorithm thus achieves a good tradeoff between the effectiveness and the efficiency of ensemble pruning. In addition, the RandomDHCEP algorithm naturally inherits two advantages typical of randomized algorithms. First, in most cases its running time or space requirements are smaller than those of well-behaved deterministic ensemble pruning algorithms. Second, it is easy to understand and implement. Experimental results on three benchmark classification datasets verify the practicality and effectiveness of the RandomGEP algorithm.
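To illustrate the general idea, the sketch below contrasts a classical greedy forward-selection pruner with a randomized variant. This is a minimal illustration, not the paper's algorithm: the selection objective (majority-vote validation accuracy) and the particular randomization (sampling a fraction of the remaining candidates at each hill-climbing step) are assumptions made for the example, and the `candidate_fraction` parameter and helper names are hypothetical.

```python
"""Sketch of greedy ensemble pruning and a randomized variant.

Assumptions (not taken from the paper): forward selection that greedily
adds the classifier maximizing majority-vote accuracy on a validation
set; the randomized variant restricts each greedy step to a random
sample of the remaining candidates.
"""
import random
import numpy as np


def majority_vote_accuracy(predictions, y_val):
    """Accuracy of the majority vote over the rows of `predictions`.

    predictions: (n_members, n_samples) array of integer class labels.
    """
    votes = np.apply_along_axis(
        lambda col: np.bincount(col).argmax(), 0, predictions)
    return float(np.mean(votes == y_val))


def greedy_prune(pred_matrix, y_val, target_size,
                 candidate_fraction=1.0, rng=None):
    """Forward-selection pruning of an ensemble.

    pred_matrix: (n_classifiers, n_samples) validation predictions.
    candidate_fraction=1.0 gives the classical deterministic hill
    climb; a fraction < 1.0 evaluates only a random sample of the
    remaining classifiers at each step, one plausible way to inject
    randomness into the greedy search.
    """
    rng = rng or random.Random(0)
    remaining = list(range(pred_matrix.shape[0]))
    selected = []
    while remaining and len(selected) < target_size:
        k = max(1, int(len(remaining) * candidate_fraction))
        pool = rng.sample(remaining, k)
        # Add the candidate whose inclusion maximizes subensemble accuracy.
        best = max(pool, key=lambda c: majority_vote_accuracy(
            pred_matrix[selected + [c]], y_val))
        selected.append(best)
        remaining.remove(best)
    return selected
```

With `candidate_fraction=1.0` this reduces to the deterministic hill climb; with a smaller fraction, repeating the randomized climb from different seeds and keeping the best subensemble explores more of the solution space at the same asymptotic cost, which is the kind of effectiveness/efficiency tradeoff the abstract describes.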


Acknowledgments
This work is supported by the National Natural Science Foundation of China under Grants no. 61473150, 61100108, and 61375021. It is also supported by the Natural Science Foundation of Jiangsu Province of China under Grant no. SBK201322136, the "Fundamental Research Funds for the Central Universities" no. NZ2013306, and the Qing Lan Project no. YPB13001. We would like to express our appreciation for the valuable comments from the reviewers and editors.
Cite this article
Dai, Q., Li, M. Introducing randomness into greedy ensemble pruning algorithms. Appl Intell 42, 406–429 (2015). https://doi.org/10.1007/s10489-014-0605-2