Abstract
Ensembles of learnt models constitute one of the main current directions in machine learning and data mining. It has been shown both experimentally and theoretically that, for an ensemble to be effective, its constituent classifiers should be diverse in their predictions. A number of ways to quantify diversity in ensembles are known, but little research has examined their appropriateness. In this paper, we compare eight measures of ensemble diversity with regard to their correlation with the accuracy improvement due to ensembles. We conduct experiments on 21 data sets from the UCI machine learning repository, comparing the correlations for random subspacing ensembles of different sizes and with six different ensemble integration methods. Our experiments show that, on average, the accuracy improvement correlates most strongly with the disagreement, entropy, and ambiguity diversity measures and, surprisingly, least with the Q and double fault measures. Typically, the correlation decreases roughly linearly as the ensemble size increases. Considerably higher correlation values are observed with the dynamic integration methods, which are shown to make better use of ensemble diversity than their static analogues.
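To make the measures concrete, the sketch below shows how two of them, the pairwise disagreement and the Q statistic, are commonly computed from the "oracle" outputs of the base classifiers (a boolean matrix recording whether each classifier predicted the true label on each instance), averaging over all classifier pairs. This is a minimal illustration using the standard pairwise definitions of these two measures; the function names and array layout are assumptions of this sketch and are not taken from the paper.

```python
import numpy as np
from itertools import combinations

def pairwise_counts(correct_i, correct_j):
    """Joint correct/incorrect counts for two classifiers.

    correct_i, correct_j: boolean arrays, True where the classifier
    predicted the true label of that instance.
    """
    n11 = np.sum(correct_i & correct_j)    # both correct
    n00 = np.sum(~correct_i & ~correct_j)  # both wrong
    n10 = np.sum(correct_i & ~correct_j)   # only the first correct
    n01 = np.sum(~correct_i & correct_j)   # only the second correct
    return n11, n00, n10, n01

def disagreement(correct):
    """Average pairwise disagreement: fraction of instances on which
    exactly one classifier of a pair is correct, averaged over pairs.

    correct: (n_classifiers, n_instances) boolean matrix.
    Higher values indicate a more diverse ensemble.
    """
    n_instances = correct.shape[1]
    vals = []
    for i, j in combinations(range(correct.shape[0]), 2):
        _, _, n10, n01 = pairwise_counts(correct[i], correct[j])
        vals.append((n10 + n01) / n_instances)
    return float(np.mean(vals))

def q_statistic(correct):
    """Average pairwise Q statistic.

    Values near +1 indicate classifiers that err on the same instances;
    values near 0 or below indicate more diverse behaviour.
    """
    vals = []
    for i, j in combinations(range(correct.shape[0]), 2):
        n11, n00, n10, n01 = pairwise_counts(correct[i], correct[j])
        denom = n11 * n00 + n01 * n10
        vals.append((n11 * n00 - n01 * n10) / denom if denom else 0.0)
    return float(np.mean(vals))
```

In a study such as this one, such measures would be computed for each generated ensemble and then correlated, across ensembles, with the accuracy gain of the combined classifier over its base members.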
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Tsymbal, A., Pechenizkiy, M., Cunningham, P. (2004). Diversity in Random Subspacing Ensembles. In: Kambayashi, Y., Mohania, M., Wöß, W. (eds) Data Warehousing and Knowledge Discovery. DaWaK 2004. Lecture Notes in Computer Science, vol 3181. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30076-2_31
DOI: https://doi.org/10.1007/978-3-540-30076-2_31
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-22937-7
Online ISBN: 978-3-540-30076-2