Abstract
Ensemble pruning reduces the set of base classifiers before they are combined, in order to improve generalization and prediction efficiency. Existing ensemble pruning algorithms, however, require long pruning times. This paper presents a fast pruning approach, pattern mining based ensemble pruning (PMEP). In this algorithm, the prediction results of all base classifiers are organized as a transaction database, and an FP-Tree structure is used to compact the prediction results. A greedy pattern mining method is then used to find the ensemble of size k. Once the ensembles of all possible sizes have been obtained, the one with the best accuracy is output. In experimental comparisons with Bagging, GASEN, and Forward Selection, PMEP achieves the best prediction accuracy and keeps the final ensemble small; more importantly, its pruning time is much shorter than that of the other ensemble pruning algorithms.
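The pipeline described in the abstract can be illustrated with a small sketch. This is not the PMEP algorithm itself: the abstract does not specify the transaction encoding, the FP-Tree construction, or the greedy mining step, so everything below is an assumption made for illustration. Here each transaction is taken to be the set of classifiers that predicted one sample correctly, identical transactions are merged with a plain counter as a stand-in for FP-Tree compaction, the size-k ensemble is grown by a simple greedy forward step, and accuracy is approximated as the fraction of samples on which a strict majority of the ensemble is correct. All function names are hypothetical.

```python
from collections import Counter


def build_transactions(predictions, labels):
    """One transaction per sample: the classifiers that predicted it correctly.

    predictions[c][i] is the label classifier c assigns to sample i (assumed layout).
    """
    return [
        frozenset(c for c, preds in enumerate(predictions) if preds[i] == label)
        for i, label in enumerate(labels)
    ]


def compact(transactions):
    """Merge identical transactions into counts (a stand-in for FP-Tree compaction)."""
    return Counter(transactions)


def majority_accuracy(ensemble, compacted):
    """Fraction of samples on which a strict majority of the ensemble is correct."""
    total = sum(compacted.values())
    correct = sum(
        count
        for items, count in compacted.items()
        if 2 * len(items & ensemble) > len(ensemble)
    )
    return correct / total


def greedy_ensemble(k, n_classifiers, compacted):
    """Grow an ensemble of size k, adding the best-scoring classifier each step."""
    ensemble = set()
    while len(ensemble) < k:
        best = max(
            (c for c in range(n_classifiers) if c not in ensemble),
            key=lambda c: majority_accuracy(ensemble | {c}, compacted),
        )
        ensemble.add(best)
    return frozenset(ensemble)


def prune(predictions, labels):
    """Build one ensemble per size k and return the most accurate one."""
    compacted = compact(build_transactions(predictions, labels))
    n = len(predictions)
    candidates = [greedy_ensemble(k, n, compacted) for k in range(1, n + 1)]
    return max(candidates, key=lambda e: majority_accuracy(e, compacted))


if __name__ == "__main__":
    # Toy data: four classifiers, four samples; classifier 3 is always wrong.
    labels = [1, 0, 1, 1]
    predictions = [
        [1, 0, 1, 0],
        [1, 1, 1, 1],
        [0, 0, 0, 1],
        [0, 1, 0, 0],
    ]
    print(sorted(prune(predictions, labels)))  # prints [0, 1, 2]
```

On this toy data the four samples collapse to three distinct transactions, and the selected ensemble drops the classifier that is never correct. The majority-correct score equals majority-vote accuracy only in the binary case; for multiclass problems it is a lower bound.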
Additional information
Responsible editors: Aleksander Kołcz, Wray Buntine, Marko Grobelnik, Dunja Mladenic, and John Shawe-Taylor.
Cite this article
Zhao, QL., Jiang, YH. & Xu, M. A fast ensemble pruning algorithm based on pattern mining process. Data Min Knowl Disc 19, 277–292 (2009). https://doi.org/10.1007/s10618-009-0138-1
DOI: https://doi.org/10.1007/s10618-009-0138-1