
A Diversity Measure for Tree-Based Classifier Ensembles

  • Chapter in Data Analysis and Decision Support

Abstract

Combining multiple classifiers into an ensemble has proved very successful over the past decade. The key to this success is the diversity of the component classifiers: many experiments have shown that ensembles whose members make uncorrelated errors achieve high accuracy.

In this paper we propose a new pairwise measure of diversity for classifier ensembles based on Hamann’s similarity coefficient.
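
Hamann's coefficient compares two classifiers by their agreement on a common test set. For a pair of classifiers evaluated on the same N examples, let a be the number of examples both classify correctly, d the number both misclassify, and b and c the numbers on which exactly one of the two is correct; Hamann's similarity is H = ((a + d) - (b + c)) / N, which ranges from -1 (constant disagreement) to +1 (constant agreement), so lower values indicate greater diversity. The Python sketch below computes the coefficient and its average over all pairs of an ensemble; it illustrates the coefficient under this standard definition, not necessarily the exact measure proposed in the chapter.

    import numpy as np

    def hamann_coefficient(correct_i, correct_j):
        # Hamann similarity between two classifiers, computed from boolean
        # vectors flagging which test examples each classifier gets right.
        # Returns a value in [-1, 1]; lower means more diverse behaviour.
        correct_i = np.asarray(correct_i, dtype=bool)
        correct_j = np.asarray(correct_j, dtype=bool)
        n = correct_i.size
        agree = np.sum(correct_i == correct_j)  # a + d: both right or both wrong
        disagree = n - agree                    # b + c: exactly one right
        return (agree - disagree) / n

    def mean_pairwise_hamann(correctness):
        # Average Hamann coefficient over all pairs in an (L, N) boolean
        # matrix with one row per ensemble member.
        L = correctness.shape[0]
        pairs = [hamann_coefficient(correctness[i], correctness[j])
                 for i in range(L) for j in range(i + 1, L)]
        return float(np.mean(pairs))

    # Example: three classifiers scored on five test cases.
    correctness = np.array([[1, 1, 0, 1, 0],
                            [1, 0, 1, 1, 0],
                            [0, 1, 1, 0, 1]], dtype=bool)
    print(mean_pairwise_hamann(correctness))  # approx -0.33: disagreement dominates

Averaging H over all pairs yields a single similarity score for the ensemble; turning it into a diversity measure is then a matter of orientation, e.g. reporting 1 - H or simply preferring ensembles with a lower average H.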


References

  • BLAKE, C.L., KEOGH, E. and MERZ, C.J. (1998): UCI Repository of Machine Learning Databases. Department of Information and Computer Science, University of California, Irvine.
  • BREIMAN, L. (1996): Bagging predictors. Machine Learning, 24, 123–140.
  • BREIMAN, L. (1998): Arcing classifiers. Annals of Statistics, 26, 801–849.
  • BREIMAN, L. (2001): Random forests. Machine Learning, 45, 5–32.
  • CUNNINGHAM, P. and CARNEY, J. (2000): Diversity versus quality in classification ensembles based on feature selection. In: Proceedings of the European Conference on Machine Learning, LNCS, vol. 1810, Springer, Berlin, 109–116.
  • DIETTERICH, T. and BAKIRI, G. (1995): Solving multiclass learning problems via error-correcting output codes. Journal of Artificial Intelligence Research, 2, 263–286.
  • FLEISS, J.L. (1981): Statistical Methods for Rates and Proportions. John Wiley and Sons, New York.
  • FREUND, Y. and SCHAPIRE, R.E. (1997): A decision-theoretic generalization of on-line learning and an application to boosting. Journal of Computer and System Sciences, 55, 119–139.
  • GIACINTO, G. and ROLI, F. (2001): Design of effective neural network ensembles for image classification processes. Image and Vision Computing, 19, 699–707.
  • HAMANN, U. (1961): Merkmalsbestand und Verwandtschaftsbeziehungen der Farinosae. Ein Beitrag zum System der Monokotyledonen. Willdenowia, 2, 639–768.
  • HANSEN, L.K. and SALAMON, P. (1990): Neural network ensembles. IEEE Transactions on Pattern Analysis and Machine Intelligence, 12, 993–1001.
  • HASHEM, S. (1999): Treating harmful collinearity in neural network ensembles. In: A.J. Sharkey (Ed.): Combining Artificial Neural Nets, Springer-Verlag, London, 101–125.
  • HO, T.K. (1998): The random subspace method for constructing decision forests. IEEE Transactions on Pattern Analysis and Machine Intelligence, 20, 832–844.
  • KUNCHEVA, L., WHITAKER, C., SHIPP, D. and DUIN, R. (2000): Is independence good for combining classifiers? In: Proceedings of the 15th International Conference on Pattern Recognition, Barcelona, Spain, 168–171.
  • KUNCHEVA, L. and WHITAKER, C. (2003): Measures of diversity in classifier ensembles and their relationship with the ensemble accuracy. Machine Learning, 51, 181–207.
  • MARGINEANTU, D.D. and DIETTERICH, T.G. (1997): Pruning adaptive boosting. In: Proceedings of the 14th International Conference on Machine Learning, Morgan Kaufmann, San Mateo, 211–218.
  • MELVILLE, P. and MOONEY, R.J. (2003): Constructing diverse classifier ensembles using artificial training examples. In: Proceedings of the 18th International Joint Conference on Artificial Intelligence, Morgan Kaufmann, San Mateo, 505–510.
  • OZA, N.C. and TUMER, K. (1999): Dimensionality reduction through classifier ensembles. Technical Report NASA-ARC-IS1999-126, NASA Ames Labs.
  • PARTRIDGE, D. and KRZANOWSKI, W.J. (1997): Software diversity: practical statistics for its measurement and exploitation. Information and Software Technology, 39, 707–717.
  • PARTRIDGE, D. and YATES, W.B. (1996): Engineering multiversion neural-net systems. Neural Computation, 8, 869–893.
  • ROSEN, B.E. (1996): Ensemble learning using decorrelated neural networks. Connection Science, 8, 373–383.
  • SHARKEY, A. and SHARKEY, N. (1997): Diversity, selection, and ensembles of artificial neural nets. In: Neural Networks and Their Applications, NEURAP-97, 205–212.
  • SKALAK, D.B. (1996): The sources of increased accuracy for two proposed boosting algorithms. In: Proceedings of the American Association for Artificial Intelligence, AAAI-96, Morgan Kaufmann, San Mateo.
  • THERNEAU, T.M. and ATKINSON, E.J. (1997): An Introduction to Recursive Partitioning Using the RPART Routines. Mayo Foundation, Rochester.
  • TUMER, K. and GHOSH, J. (1996): Analysis of decision boundaries in linearly combined neural classifiers. Pattern Recognition, 29, 341–348.
  • WOLPERT, D. (1992): Stacked generalization. Neural Networks, 5, 241–259.
  • ZENOBI, G. and CUNNINGHAM, P. (2001): Using diversity in preparing ensembles of classifiers based on different feature subsets to minimize generalization error. Lecture Notes in Computer Science, 2167, 576–587.



Copyright information

© 2005 Springer-Verlag Berlin · Heidelberg

About this chapter

Cite this chapter

Gatnar, E. (2005). A Diversity Measure for Tree-Based Classifier Ensembles. In: Baier, D., Decker, R., Schmidt-Thieme, L. (eds) Data Analysis and Decision Support. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-28397-8_4
