On Cesáro Averages for Weighted Trees in the Random Forest

Journal of Classification

Abstract

The random forest is a popular and effective classification method. It combines bootstrap resampling and subspace sampling to construct an ensemble of decision trees whose predictions are then averaged to give a final prediction. In this paper, we propose a potential improvement on the random forest that can be thought of as applying a weight to each tree before averaging. The new method is motivated by the potential instability of averaging the predictions of trees that may be of highly variable quality; to address this, we replace the regular average with a Cesáro average. We provide both a theoretical analysis that gives exact conditions under which the new approach outperforms the traditional random forest and a numerical analysis that shows the new approach is competitive when training a classification model on numerous realistic data sets.
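The abstract does not spell out the weighting, so the following is only a minimal sketch of how a Cesáro average can act as a per-tree weight. It assumes that the trees have already been put in some order (for example by a quality score, which is an assumption here and not stated in the abstract) and that the Cesáro average is taken as the mean of the running averages of the per-tree predictions. Under that reading, tree i of n receives weight (H_n - H_{i-1})/n, where H_k is the k-th harmonic number, so earlier trees count more. The function names and the sample predictions are hypothetical.

```python
def cesaro_weights(n):
    """Per-tree weights implied by averaging the running means of n predictions.

    Assumption: weight of tree i (1-indexed) is (H_n - H_{i-1}) / n,
    where H_k is the k-th harmonic number. The weights sum to 1.
    """
    weights = [0.0] * n
    tail = 0.0  # accumulates 1/n + 1/(n-1) + ... + 1/(i+1) for 0-indexed i
    for i in range(n - 1, -1, -1):
        tail += 1.0 / (i + 1)
        weights[i] = tail / n
    return weights


def cesaro_average(preds):
    """Mean of the running averages of preds (order matters)."""
    running_sum = 0.0
    total = 0.0
    for k, p in enumerate(preds, start=1):
        running_sum += p
        total += running_sum / k  # running average of the first k predictions
    return total / len(preds)


# Hypothetical class-1 probabilities from four trees, assumed ordered best-first.
tree_preds = [0.9, 0.8, 0.4, 0.2]

plain = sum(tree_preds) / len(tree_preds)
cesaro = cesaro_average(tree_preds)
weighted = sum(w * p for w, p in zip(cesaro_weights(len(tree_preds)), tree_preds))

print("plain average: ", plain)
print("Cesaro average:", cesaro)
print("weighted form: ", weighted)  # agrees with the Cesaro average up to rounding
```

Because the weights decrease with position, the result depends on how the trees are ordered; with all predictions equal, the Cesáro average and the plain average coincide.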

Author information

Corresponding author

Correspondence to Hieu Pham.

About this article

Cite this article

Pham, H., Olafsson, S. On Cesáro Averages for Weighted Trees in the Random Forest. J Classif 37, 223–236 (2020). https://doi.org/10.1007/s00357-019-09322-8

  • DOI: https://doi.org/10.1007/s00357-019-09322-8
