
Dynamics of Variance Reduction in Bagging and Other Techniques Based on Randomisation

  • Conference paper
Multiple Classifier Systems (MCS 2005)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 3541)

Abstract

In this paper, the performance of bagging in classification problems is analysed theoretically, using a framework developed in works by Tumer and Ghosh and extended by the authors. A bias-variance decomposition is derived which relates the expected misclassification probability attained by linearly combining classifiers trained on N bootstrap replicates of a fixed training set to the probability attained by a single classifier trained on one bootstrap replicate of the same training set. The theoretical results show that the expected misclassification probability of bagging has the same bias component as a single bootstrap replicate, while its variance component is reduced by a factor of N. Experimental results show that the performance of bagging, as a function of the number of bootstrap replicates, follows this theoretical prediction quite well. Finally, it is shown that the theoretical results derived for bagging also apply to other methods for constructing multiple classifiers based on randomisation, such as the random subspace method and tree randomisation.
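To make the claimed decomposition concrete: in the Tumer-Ghosh framework the expected error of an N-replicate bagged ensemble behaves, roughly, as err(N) ~ err_bias + err_var / N, so the bias term is unchanged while the variance term shrinks linearly in N. The Python sketch below is not the authors' code; the Gaussian data, the nearest-mean base classifier, and discriminant averaging are illustrative assumptions. It estimates the expected test error of bagged ensembles of increasing size on one fixed training set, so the printed errors should decay towards a plateau (the bias term) at a rate of roughly 1/N.

    # Minimal simulation sketch (not the authors' code): estimate the expected
    # misclassification probability of bagging as a function of the ensemble
    # size N, on one fixed training set, to check the predicted
    # err(N) ~ err_bias + err_var / N behaviour.
    import numpy as np

    rng = np.random.default_rng(0)

    def make_data(n_per_class, d=2):
        # Two Gaussian classes with identity covariance and means at -mu, +mu.
        mu = np.full(d, 0.8)
        X0 = rng.normal(-mu, 1.0, size=(n_per_class, d))
        X1 = rng.normal(+mu, 1.0, size=(n_per_class, d))
        return np.vstack([X0, X1]), np.r_[np.zeros(n_per_class), np.ones(n_per_class)]

    def nearest_mean_score(X_tr, y_tr, X_te):
        # Signed discriminant of a nearest-mean classifier (> 0 means class 1).
        m0 = X_tr[y_tr == 0].mean(axis=0)
        m1 = X_tr[y_tr == 1].mean(axis=0)
        return ((X_te - m0) ** 2).sum(axis=1) - ((X_te - m1) ** 2).sum(axis=1)

    # One fixed training set, from which bootstrap replicates are drawn, and a
    # large test set for estimating the misclassification probability.
    X_train, y_train = make_data(25)
    X_test, y_test = make_data(5000)

    def expected_bagging_error(N, trials=200):
        # Expected test error of an ensemble averaging the discriminants of N
        # classifiers, each trained on an independent bootstrap replicate.
        n = len(y_train)
        errors = []
        for _ in range(trials):
            score = np.zeros(len(y_test))
            for _ in range(N):
                idx = rng.integers(0, n, size=n)  # one bootstrap replicate
                score += nearest_mean_score(X_train[idx], y_train[idx], X_test)
            errors.append(np.mean((score / N > 0) != y_test))
        return float(np.mean(errors))

    for N in (1, 2, 5, 10, 25):
        print(f"N = {N:2d}: expected error ~ {expected_bagging_error(N):.4f}")

Averaging the signed discriminants stands in here for the linear combination of classifier outputs analysed in the paper; any base classifier that is unstable under bootstrap resampling should show the same qualitative 1/N decay of the variance term.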

References

  1. Bauer, E., Kohavi, R.: An empirical comparison of voting classification algorithms: bagging, boosting, and variants. Machine Learning 36, 105–139 (1999)

  2. Breiman, L.: Bias, variance, and arcing classifiers. Tech. Rep., Dept. of Statistics, Univ. of California (1995)

  3. Breiman, L.: Bagging predictors. Machine Learning 24, 123–140 (1996)

  4. Bühlmann, P., Yu, B.: Analyzing bagging. Ann. Statist. 30, 927–961 (2002)

  5. Dietterich, T.G., Kong, E.B.: Machine learning bias, statistical bias, and statistical variance of decision tree algorithms. Tech. Rep., Dept. of Computer Science, Oregon State Univ. (1995)

  6. Dietterich, T.G.: An experimental comparison of three methods for constructing ensembles of decision trees: bagging, boosting and randomization. Machine Learning 40(2), 139–157 (2000)

  7. Friedman, J.H.: On bias, variance, 0/1-loss, and the curse-of-dimensionality. Data Mining and Knowledge Discovery 1, 55–77 (1997)

  8. Friedman, J.H., Hall, P.: On bagging and nonlinear estimation. Tech. Rep., Stanford University, Stanford, CA (2000)

  9. Fumera, G., Roli, F.: A theoretical and experimental analysis of linear combiners for multiple classifier systems. IEEE Trans. Pattern Analysis Machine Intelligence (in press)

  10. Grandvalet, Y.: Bagging can stabilize without reducing variance. In: Proc. Int. Conf. Artificial Neural Networks. LNCS, vol. 2130, pp. 49–56. Springer, Heidelberg (2001)

  11. Banfield, R.E., Hall, L.O., Bowyer, K.W., Kegelmeyer, W.P.: A new ensemble diversity measure applied to thinning ensembles. In: Windeatt, T., Roli, F. (eds.) MCS 2003. LNCS, vol. 2709, pp. 306–316. Springer, Heidelberg (2003)

  12. Ho, T.K.: The random subspace method for constructing decision forests. IEEE Trans. Pattern Analysis Machine Intelligence 20, 832–844 (1998)

  13. Kuncheva, L.I.: Combining Pattern Classifiers: Methods and Algorithms. Wiley, Hoboken (2004)

  14. Kohavi, R., Wolpert, D.H.: Bias plus variance decomposition for zero-one loss functions. In: Saitta, L. (ed.) Proc. Int. Conf. Machine Learning, pp. 275–283. Morgan Kaufmann, San Francisco (1996)

  15. Latinne, P., Debeir, O., Decaestecker, C.: Limiting the number of trees in random forests. In: Kittler, J., Roli, F. (eds.) MCS 2001. LNCS, vol. 2096, pp. 178–187. Springer, Heidelberg (2001)

  16. Skurichina, M., Duin, R.P.W.: Bagging for linear classifiers. Pattern Recognition 31, 909–930 (1998)

  17. Tibshirani, R.: Bias, variance and prediction error for classification rules. Tech. Rep., Dept. of Statistics, University of Toronto (1996)

  18. Tumer, K.: Linear and order statistics combiners for reliable pattern classification. PhD dissertation, The University of Texas, Austin (1996)

  19. Tumer, K., Ghosh, J.: Analysis of decision boundaries in linearly combined neural classifiers. Pattern Recognition 29, 341–348 (1996)

  20. Tumer, K., Ghosh, J.: Linear and order statistics combiners for pattern classification. In: Sharkey, A.J.C. (ed.) Combining Artificial Neural Nets, pp. 127–155. Springer, Heidelberg (1999)

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Fumera, G., Roli, F., Serrau, A. (2005). Dynamics of Variance Reduction in Bagging and Other Techniques Based on Randomisation. In: Oza, N.C., Polikar, R., Kittler, J., Roli, F. (eds) Multiple Classifier Systems. MCS 2005. Lecture Notes in Computer Science, vol 3541. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11494683_32

  • DOI: https://doi.org/10.1007/11494683_32

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-26306-7

  • Online ISBN: 978-3-540-31578-0
