Abstract
Combining multiple classifiers into an ensemble has proved very successful over the past decade. The key to this success is the diversity of the component classifiers: many experiments have shown that ensembles of unrelated members achieve high accuracy.
In this paper we propose a new pairwise measure of diversity for classifier ensembles based on Hamann’s similarity coefficient.
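The abstract does not state the formula, so the following is only a minimal sketch of what a pairwise measure built on Hamann's coefficient might look like. It assumes the standard form of the coefficient, H = (a + d - b - c) / N, applied to "oracle" outputs (whether each classifier labels each example correctly), where a and d count the examples on which both classifiers are correct or both are wrong, and b and c count the examples on which exactly one is correct. The function names, the oracle-output encoding, and the averaging over all pairs are illustrative assumptions, not the chapter's definitions.

```python
from itertools import combinations
from typing import Sequence


def hamann_similarity(correct_a: Sequence[bool], correct_b: Sequence[bool]) -> float:
    """Hamann coefficient between two classifiers' oracle outputs.

    Each sequence holds True where the classifier labelled the example
    correctly and False otherwise. The result lies in [-1, 1]:
    1 means the classifiers always agree, -1 means they never agree.
    """
    n = len(correct_a)
    if n == 0 or n != len(correct_b):
        raise ValueError("inputs must be non-empty and of equal length")
    # Agreements are a + d (both correct or both wrong); the rest are b + c.
    agree = sum(1 for x, y in zip(correct_a, correct_b) if bool(x) == bool(y))
    disagree = n - agree
    return (agree - disagree) / n


def mean_pairwise_hamann(oracle_outputs: Sequence[Sequence[bool]]) -> float:
    """Average the coefficient over all pairs of ensemble members.

    Averaging over pairs is an assumption made for this sketch; lower
    values indicate a more diverse ensemble.
    """
    pairs = list(combinations(oracle_outputs, 2))
    if not pairs:
        raise ValueError("need at least two classifiers")
    return sum(hamann_similarity(a, b) for a, b in pairs) / len(pairs)
```

Under these assumptions, two classifiers that are right and wrong on exactly the same examples score 1, and an ensemble whose mean pairwise value is closer to -1 would be considered more diverse.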
Copyright information
© 2005 Springer-Verlag Berlin · Heidelberg
About this chapter
Cite this chapter
Gatnar, E. (2005). A Diversity Measure for Tree-Based Classifier Ensembles. In: Baier, D., Decker, R., Schmidt-Thieme, L. (eds) Data Analysis and Decision Support. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-28397-8_4
DOI: https://doi.org/10.1007/3-540-28397-8_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-26007-3
Online ISBN: 978-3-540-28397-3
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)