Abstract
Ensembles of classifiers trained on different parts of the input space generally provide good results. AdaBoost, a popular boosting technique, is an iterative, gradient-based, deterministic method for this purpose in which an exponential loss function is minimized. Bagging is a random-search-based ensemble creation technique in which the training set of each classifier is selected arbitrarily. In this paper, a genetic algorithm based ensemble creation approach is proposed in which both the resampled training sets and the classifier prototypes evolve so as to maximize the combined accuracy. The resulting objective-function-driven random search, guided by both ensemble accuracy and diversity, can be considered to share the basic properties of bagging and boosting. Experimental results show that the proposed approach achieves better combined accuracies with fewer classifiers than AdaBoost.
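As a rough illustration of the kind of search described above, the following is a minimal Python sketch of a genetic algorithm that evolves a bootstrap-resampled training set and a classifier prototype for each ensemble member, with a fitness combining majority-vote validation accuracy and a pairwise-disagreement diversity term. The prototype pool, diversity weighting, GA operators, and all parameter values below are illustrative assumptions, not the formulation used in the paper.

```python
# Minimal sketch, not the paper's exact algorithm: genetic search over
# bootstrap-resampled training sets and classifier prototypes.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
# Illustrative classifier prototype pool (an assumption, not the paper's choice).
PROTOTYPES = [DecisionTreeClassifier, GaussianNB, KNeighborsClassifier]

def random_individual(n_train, n_members=5):
    # Each ensemble member is a (bootstrap index vector, prototype id) pair.
    return [(rng.integers(0, n_train, n_train), int(rng.integers(len(PROTOTYPES))))
            for _ in range(n_members)]

def fitness(ind, X_tr, y_tr, X_val, y_val, div_weight=0.2):
    preds = []
    for idx, proto in ind:
        clf = PROTOTYPES[proto]().fit(X_tr[idx], y_tr[idx])
        preds.append(clf.predict(X_val))
    preds = np.array(preds)
    # Majority-vote accuracy of the ensemble on the validation set.
    vote = np.apply_along_axis(lambda c: np.bincount(c).argmax(), 0, preds)
    acc = np.mean(vote == y_val)
    # Diversity term: mean pairwise disagreement between member predictions.
    m = len(preds)
    dis = np.mean([np.mean(preds[i] != preds[j])
                   for i in range(m) for j in range(i + 1, m)])
    return acc + div_weight * dis

def crossover(a, b):
    # One-point crossover exchanging whole ensemble members.
    cut = int(rng.integers(1, len(a)))
    return a[:cut] + b[cut:]

def mutate(ind, n_train, rate=0.2):
    # Mutation redraws a member's bootstrap sample and prototype.
    out = []
    for idx, proto in ind:
        if rng.random() < rate:
            idx = rng.integers(0, n_train, n_train)
            proto = int(rng.integers(len(PROTOTYPES)))
        out.append((idx, proto))
    return out

X, y = load_iris(return_X_y=True)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.3, random_state=0)

pop = [random_individual(len(X_tr)) for _ in range(20)]
for gen in range(15):
    pop.sort(key=lambda i: fitness(i, X_tr, y_tr, X_val, y_val), reverse=True)
    elite = pop[:10]                      # keep the fitter half
    children = []
    for _ in range(10):
        i, j = rng.choice(len(elite), size=2, replace=False)
        children.append(mutate(crossover(elite[i], elite[j]), len(X_tr)))
    pop = elite + children
best = max(pop, key=lambda i: fitness(i, X_tr, y_tr, X_val, y_val))
```

In this toy setup the elite individuals survive unchanged each generation while the remainder are produced by crossover and mutation; the evolved training sets play a role analogous to bagging's resampling, while the accuracy-plus-diversity fitness steers the search much as boosting's loss does.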
About the Author
Hakan ALTINÇAY was born in Cyprus in 1972. He received his B.S. (with High Honors), M.S., and Ph.D. degrees, all in Electrical and Electronics Engineering, from Middle East Technical University, Ankara, Turkey. He worked as a research assistant in the Speech Processing Laboratory at Middle East Technical University from February 1996 to February 2000. Since August 2000, he has been an Assistant Professor in the Computer Engineering Department of Eastern Mediterranean University, Northern Cyprus. His main areas of interest include data fusion, pattern recognition, and speaker recognition over low-bit-rate channels.
Cite this article
Altınçay, H. Optimal resampling and classifier prototype selection in classifier ensembles using genetic algorithms. Pattern Anal Applic 7, 285–295 (2004). https://doi.org/10.1007/s10044-004-0225-2