A Comparative Study on the Performance of Several Ensemble Methods with Low Subsampling Ratio

  • Conference paper
Intelligent Information and Database Systems (ACIIDS 2010)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5991)

Abstract

In ensemble methods, each base learner is typically trained on a resampled version of the original training sample of the same size. In this paper we instead use resampling without replacement (subsampling) to train the base classifiers with a low subsampling ratio, i.e., each subsample is smaller than the original training sample. The main objective of this paper is to examine whether several well-known ensemble methods remain competitive with a low subsampling ratio and to compare them with their original counterparts. We selected three ensemble methods: Bagging, AdaBoost, and Bundling, each using a fully grown decision tree as the base classifier. We applied the subsampled versions of these ensembles to several well-known benchmark datasets to measure the error rate, and we also measured the time complexity of each method at low subsampling ratios. The experiments indicate that for Bagging and AdaBoost the error rate is, in most cases, inversely related to the subsample size, while for Bundling the relationship is the opposite. Overall, Bundling was superior in accuracy at low subsampling ratios on almost all datasets, while Bagging was superior in reducing time complexity.
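
As a brief, hedged illustration of the setup the abstract describes, the subsampled Bagging variant (often called subagging) can be sketched with scikit-learn's BaggingClassifier by drawing each training subsample without replacement (bootstrap=False) at a low ratio (max_samples=0.2 here). The dataset, ratio, and ensemble size below are illustrative placeholders, not the paper's experimental settings.

    # A minimal sketch of subagging: Bagging with sampling without
    # replacement at a low subsampling ratio. Assumes scikit-learn >= 1.2
    # (older versions name the first argument base_estimator, not estimator).
    from sklearn.datasets import load_breast_cancer
    from sklearn.ensemble import BaggingClassifier
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier

    X, y = load_breast_cancer(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    subagging = BaggingClassifier(
        estimator=DecisionTreeClassifier(),  # fully grown tree, as in the paper
        n_estimators=100,    # illustrative ensemble size
        max_samples=0.2,     # low subsampling ratio: 20% of the training set
        bootstrap=False,     # sample without replacement (subsampling)
        random_state=0,
    )
    subagging.fit(X_train, y_train)
    print("test error rate:", 1.0 - subagging.score(X_test, y_test))

Sweeping max_samples over a grid of low ratios would reproduce the error-versus-subsample-size comparison the abstract reports; subsampled AdaBoost or Bundling would need custom code, since neither variant ships with scikit-learn.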

References

  1. Blake, C.L., Merz, C.J.: UCI Repository of Machine Learning Databases, http://www.ics.uci.edu/mlearn/MLRepository.html

  2. Breiman, L.: Bagging predictors. Machine Learning 24(2), 123–140 (1996a)

  3. Breiman, L.: Out-of-bag estimation. Technical Report, Statistics Department, University of California, Berkeley, CA 94708 (1996b)

  4. Breiman, L.: Heuristics of instability and stabilization in model selection. Annals of Statistics 24(6), 2350–2383 (1996c)

  5. Bühlmann, P.: Bagging, subagging and bragging for improving some prediction algorithms. In: Akritas, M.G., Politis, D.N. (eds.) Recent Advances and Trends in Nonparametric Statistics, pp. 9–34. Elsevier, Amsterdam (2003)

  6. Demšar, J.: Statistical comparisons of classifiers over multiple data sets. J. Mach. Learn. Research 7, 1–30 (2006)

  7. Freund, Y., Schapire, R.: Experiments with a new boosting algorithm. In: Machine Learning: Proceedings of the Thirteenth International Conference, pp. 148–156. Morgan Kaufmann, San Francisco (1996)

  8. Freund, Y., Schapire, R.: A decision-theoretic generalization of on-line learning and an application to boosting. J. Comput. System Sci. 55, 119–139 (1997)

  9. Friedman, J.: Stochastic gradient boosting. Comput. Statist. Data Anal. 38, 367–378 (2002)

  10. Friedman, J., Hall, P.: On Bagging and Non-linear Estimation. J. Statist. Planning and Infer. 137(3), 669–683 (2007)

  11. Hastie, T., Tibshirani, R., Friedman, J.: The elements of statistical learning: data mining, inference and prediction. Springer, New York (2001)

  12. Hothorn, T., Lausen, B.: Double-bagging: combining classifiers by bootstrap aggregation. Pattern Recognition 36(6), 1303–1309 (2003)

  13. Hothorn, T., Lausen, B.: Bundling classifiers by bagging trees. Comput. Statist. Data Anal. 49, 1068–1078 (2005)

  14. Kuncheva, L.I.: Combining Pattern Classifiers. Methods and Algorithms. John Wiley and Sons, Chichester (2004)

  15. Rodríguez, J., Kuncheva, L., Alonso, C.: Rotation forest: A new classifier ensemble method. IEEE Trans. Patt. Analys. Mach. Intell. 28(10), 1619–1630 (2006)

  16. Zaman, F., Hirose, H.: Double SVMbagging: A subsampling approach to SVM ensemble. To appear in Intelligent Automation and Computer Engineering. Springer, Heidelberg (2009)

  17. Zaman, F., Hirose, H.: Effect of Subsampling Rate on Subbagging and Related Ensembles of Stable Classifiers. In: Chaudhury, S., Mitra, S., Murthy, C.A., Sastry, P.S., Pal, S.K. (eds.) PReMI 2009. LNCS, vol. 5909, pp. 44–49. Springer, Heidelberg (2009)

  18. Zhang, C.X., Zhang, J.S., Zhang, G.Y.: An efficient modified boosting method for solving classification problems. J. Comput. Applied Mathemat. 214, 381–392 (2008)

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Faisal, Z., Hirose, H. (2010). A Comparative Study on the Performance of Several Ensemble Methods with Low Subsampling Ratio. In: Nguyen, N.T., Le, M.T., Świątek, J. (eds) Intelligent Information and Database Systems. ACIIDS 2010. Lecture Notes in Computer Science (LNAI), vol 5991. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12101-2_33

Download citation

  • DOI: https://doi.org/10.1007/978-3-642-12101-2_33

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-12100-5

  • Online ISBN: 978-3-642-12101-2

  • eBook Packages: Computer Science, Computer Science (R0)
