A Diversity Measure for Tree-Based Classifier Ensembles

Gatnar, Eugeniusz

doi:10.1007/3-540-28397-8_4

Part of the book series: Studies in Classification, Data Analysis, and Knowledge Organization (STUDIES CLASS)

Abstract

Combining multiple classifiers into an ensemble has proved very successful over the past decade. The key to this success is the diversity of the component classifiers: many experiments have shown that ensembles of unrelated members achieve high accuracy.

In this paper we propose a new pairwise measure of diversity for classifier ensembles based on Hamann’s similarity coefficient.
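
The preview does not give the formula for the proposed measure, so the following is only a minimal sketch of the general idea, under stated assumptions. It assumes the standard Hamann coefficient, H = ((a + d) - (b + c)) / (a + b + c + d), applied to the "oracle" outputs of two classifiers, where a + d counts the cases on which both classifiers agree (both correct or both wrong) and b + c counts the cases on which they disagree. The function names `hamann_similarity` and `mean_pairwise_hamann` are hypothetical, and averaging over all pairs is an assumption about how a pairwise measure might be aggregated for a whole ensemble, not necessarily the chapter's definition.

```python
from itertools import combinations


def hamann_similarity(correct_a, correct_b):
    """Hamann coefficient between two classifiers' oracle outputs.

    correct_a, correct_b: equal-length sequences of truthy/falsy values,
    where a truthy value means the classifier labelled that case correctly.
    Returns a value in [-1, 1]: 1 for identical outputs, -1 for opposite ones.
    """
    if len(correct_a) != len(correct_b) or len(correct_a) == 0:
        raise ValueError("oracle outputs must be non-empty and of equal length")
    # a + d: cases where both classifiers are correct or both are wrong
    agree = sum(1 for a, b in zip(correct_a, correct_b) if bool(a) == bool(b))
    # b + c: cases where exactly one of the two classifiers is correct
    disagree = len(correct_a) - agree
    return (agree - disagree) / len(correct_a)


def mean_pairwise_hamann(oracle_outputs):
    """Average Hamann similarity over all pairs of ensemble members.

    Assumption: lower average similarity is read as higher diversity.
    """
    pairs = list(combinations(oracle_outputs, 2))
    if not pairs:
        raise ValueError("need at least two classifiers")
    return sum(hamann_similarity(a, b) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    # Hypothetical oracle outputs of three trees on four test cases
    tree_1 = [1, 1, 0, 0]
    tree_2 = [1, 1, 0, 0]
    tree_3 = [0, 0, 1, 1]
    print(hamann_similarity(tree_1, tree_2))
    print(mean_pairwise_hamann([tree_1, tree_2, tree_3]))
```

Under this reading, two trees that err on exactly the same cases get a similarity of 1, while trees that err on complementary cases get -1, so an ensemble with a lower average value would be considered more diverse.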

References

BLAKE, C., KEOGH, E. and MERZ, C.J. (1998): UCI Repository of Machine Learning Databases. Department of Information and Computer Science, University of California, Irvine.
BREIMAN, L. (1996): Bagging predictors. Machine Learning, 24, 123–140.
BREIMAN, L. (1998): Arcing classifiers. Annals of Statistics, 26, 801–849.
BREIMAN, L. (2001): Random Forests. Machine Learning, 45, 5–32.
CUNNINGHAM, P. and CARNEY, J. (2000): Diversity versus quality in classification ensembles based on feature selection. In: Proceedings of European Conference on Machine Learning, LNCS, vol. 1810, Springer, Berlin, 109–116.
DIETTERICH, T. and BAKIRI, G. (1995): Solving multiclass learning problems via error-correcting output codes. Journal of Artificial Intelligence Research, 2, 263–286.
FLEISS, J.L. (1981): Statistical Methods for Rates and Proportions. John Wiley and Sons, New York.
FREUND, Y. and SCHAPIRE, R.E. (1997): A decision-theoretic generalization of on-line learning and an application to boosting. Journal of Computer and System Sciences, 55, 119–139.
GIACINTO, G. and ROLI, F. (2001): Design of effective neural network ensembles for image classification processes. Image Vision and Computing Journal, 19, 699–707.
HAMANN, U. (1961): Merkmalsbestand und Verwandtschaftsbeziehungen der Farinosae. Ein Beitrag zum System der Monokotyledonen. Willdenowia, 2, 639–768.
HANSEN, L.K. and SALAMON, P. (1990): Neural network ensembles. IEEE Transactions on Pattern Analysis and Machine Intelligence, 12, 993–1001.
HASHEM, S. (1999): Treating harmful collinearity in neural network ensembles. In: A.J. Sharkey (Ed.): Combining Artificial Neural Nets, Springer-Verlag, London, 101–125.
HO, T.K. (1998): The random subspace method for constructing decision forests. IEEE Transactions on Pattern Analysis and Machine Intelligence, 20, 832–844.
KUNCHEVA, L., WHITAKER, C., SHIPP, D., and DUIN, R. (2000): Is independence good for combining classifiers? In: Proceedings of the 15th International Conference on Pattern Recognition, Barcelona, Spain, 168–171.
KUNCHEVA, L. and WHITAKER, C. (2003): Measures of diversity in classifier ensembles and their relationship with the ensemble accuracy. Machine Learning, 51, 181–207.
MARGINEANTU, M.M. and DIETTERICH, T.G. (1997): Pruning adaptive boosting. In: Proceedings of the 14th International Conference on Machine Learning, Morgan Kaufmann, San Mateo, 211–218.
MELVILLE and MOONEY (2003): Constructing diverse classifier ensembles using artificial training examples. In: Proceedings of the 18th International Joint Conference on Artificial Intelligence, Morgan Kaufmann, San Mateo, 505–510.
OZA, N.C. and TUMER, K. (1999): Dimensionality reduction through classifier ensembles. Technical Report NASA-ARC-IS1999-126, NASA Ames Labs.
PARTRIDGE, D. and KRZANOWSKI, W.J. (1997): Software diversity: practical statistics for its measurement and exploitation. Information and Software Technology, 39, 707–717.
PARTRIDGE, D. and YATES, W.B. (1996): Engineering multiversion neural-net systems. Neural Computation, 8, 869–893.
ROSEN, B.E. (1996): Ensemble learning using decorrelated neural networks. Connection Science, 8, 373–383.
SHARKEY, A. and SHARKEY, N. (1997): Diversity, selection, and ensembles of artificial neural nets. In: Neural Networks and their Applications, NEURAP-97, 205–212.
SKALAK, D.B. (1996): The sources of increased accuracy for two proposed boosting algorithms. In: Proceedings of the American Association for Artificial Intelligence AAAI-96, Morgan Kaufmann, San Mateo.
THERNEAU, T.M. and ATKINSON, E.J. (1997): An introduction to recursive partitioning using the RPART routines. Mayo Foundation, Rochester.
TUMER, K. and GHOSH, J. (1996): Analysis of decision boundaries in linearly combined neural classifiers. Pattern Recognition, 29, 341–348.
WOLPERT, D. (1992): Stacked generalization. Neural Networks, 5, 241–259.
ZENOBI, G. and CUNNINGHAM, P. (2001): Using diversity in preparing ensembles of classifiers based on different feature subsets to minimize generalization error. Lecture Notes in Computer Science, 2167, 576–587.

Author information

Authors and Affiliations

Institute of Statistics, Katowice University of Economics, ul. Bogucicka 14, 40-226, Katowice, Poland
Eugeniusz Gatnar

Editor information

Editors and Affiliations

Institute of Business Administration and Economics, Brandenburg University of Technology Cottbus, Konrad-Wachsmann-Allee 1, 03046, Cottbus, Germany
Daniel Baier (Chair of Marketing and Innovation Management)
Department of Business Administration and Economics, Bielefeld University, Universitätsstr. 25, 33615, Bielefeld, Germany
Reinhold Decker (Chair of Marketing)
Computer Based New Media Group (CGNM), Institute for Computer Science, University of Freiburg, Georges-Köhler-Allee 51, 79110, Freiburg, Germany
Lars Schmidt-Thieme

About this chapter

Cite this chapter

Gatnar, E. (2005). A Diversity Measure for Tree-Based Classifier Ensembles. In: Baier, D., Decker, R., Schmidt-Thieme, L. (eds) Data Analysis and Decision Support. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-28397-8_4

DOI: https://doi.org/10.1007/3-540-28397-8_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-26007-3
Online ISBN: 978-3-540-28397-3
eBook Packages: Mathematics and Statistics (R0)
