
Optimal resampling and classifier prototype selection in classifier ensembles using genetic algorithms

  • Theoretical Advances
  • Pattern Analysis and Applications

Abstract

Ensembles of classifiers trained on different parts of the input space generally provide good results. AdaBoost, a popular boosting technique, is an iterative, deterministic, gradient-based method for this purpose in which an exponential loss function is minimized. Bagging is a random-search-based ensemble creation technique in which the training set of each classifier is selected by random resampling. In this paper, a genetic algorithm based ensemble creation approach is proposed in which both the resampled training sets and the classifier prototypes evolve so as to maximize the combined accuracy. The resulting objective-function-driven random search, guided by both ensemble accuracy and diversity, can be considered to share the basic properties of bagging and boosting. Experimental results show that the proposed approach achieves better combined accuracies with fewer classifiers than AdaBoost.
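The abstract only outlines the method, but the idea lends itself to a compact illustration. The Python sketch below is a minimal, hypothetical reading of it, not the paper's actual algorithm: each chromosome encodes, for every ensemble member, a binary resampling mask over the training set together with an index into a small pool of classifier prototypes, and the GA fitness adds a pairwise-disagreement diversity term to the majority-vote validation accuracy. The classifier pool, the disagreement measure, the weight ALPHA, and all GA operators and hyper-parameters are illustrative assumptions.

```python
# Minimal sketch of a GA that co-evolves resampled training sets and
# classifier prototypes, in the spirit of the abstract. All design choices
# below (encoding, operators, pool, ALPHA) are illustrative assumptions.
import numpy as np
from sklearn.base import clone
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Hypothetical pool of classifier prototypes (the paper's pool is not given).
PROTOTYPES = [
    KNeighborsClassifier(n_neighbors=3),
    DecisionTreeClassifier(max_depth=3, random_state=0),
    GaussianNB(),
]
ENSEMBLE_SIZE, POP_SIZE, GENERATIONS = 5, 20, 15
ALPHA = 0.5  # assumed weight of the diversity term in the fitness

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.3, random_state=0)
n = len(y_tr)

def random_member():
    """One ensemble member: (resampling mask over training set, prototype index)."""
    return rng.random(n) < 0.63, int(rng.integers(len(PROTOTYPES)))

def fitness(chromosome):
    """Majority-vote validation accuracy plus mean pairwise disagreement."""
    preds = []
    for mask, proto in chromosome:
        clf = clone(PROTOTYPES[proto]).fit(X_tr[mask], y_tr[mask])
        preds.append(clf.predict(X_val))
    preds = np.array(preds)
    vote = (preds.mean(axis=0) > 0.5).astype(int)  # binary majority vote
    accuracy = (vote == y_val).mean()
    disagreement = np.mean([(preds[i] != preds[j]).mean()
                            for i in range(len(preds))
                            for j in range(i + 1, len(preds))])
    return accuracy + ALPHA * disagreement

population = [[random_member() for _ in range(ENSEMBLE_SIZE)]
              for _ in range(POP_SIZE)]
for _ in range(GENERATIONS):
    # Elitist selection: keep the better half, breed the rest from it.
    elite = sorted(population, key=fitness, reverse=True)[:POP_SIZE // 2]
    children = []
    while len(elite) + len(children) < POP_SIZE:
        a, b = rng.choice(len(elite), size=2, replace=False)
        # Uniform crossover at the member level ...
        child = [elite[a][k] if rng.random() < 0.5 else elite[b][k]
                 for k in range(ENSEMBLE_SIZE)]
        # ... then mutate one member: flip a few mask bits, maybe swap prototype.
        k = int(rng.integers(ENSEMBLE_SIZE))
        mask, proto = child[k]
        mask = np.where(rng.random(n) < 0.02, ~mask, mask)
        if rng.random() < 0.1:
            proto = int(rng.integers(len(PROTOTYPES)))
        child[k] = (mask, proto)
        children.append(child)
    population = elite + children

best = max(population, key=fitness)
print(f"best fitness on validation set: {fitness(best):.3f}")
```

Because the fitness rewards both combined accuracy and member disagreement, the search resamples training data like bagging while being steered by an accuracy-driven objective as in boosting, which is the combination the abstract describes.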




Author information


Corresponding author

Correspondence to Hakan Altınçay.

Additional information

About the Author

Hakan Altınçay was born in Cyprus in 1972. He received his B.S. (with High Honors), M.S., and Ph.D. degrees, all in Electrical and Electronics Engineering, from Middle East Technical University, Ankara, Turkey. He worked as a research assistant in the Speech Processing Laboratory at Middle East Technical University from February 1996 to February 2000. Since August 2000, he has been an Assistant Professor in the Computer Engineering Department of Eastern Mediterranean University, Northern Cyprus. His main areas of interest include data fusion, pattern recognition, and speaker recognition over low bit rate channels.


About this article

Cite this article

Altınçay, H. Optimal resampling and classifier prototype selection in classifier ensembles using genetic algorithms. Pattern Anal Applic 7, 285–295 (2004). https://doi.org/10.1007/s10044-004-0225-2

