An Empirical Comparison of Pruning Methods for Ensemble Classifiers

  • Conference paper

In: Advances in Intelligent Data Analysis (IDA 2001)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2189)

Abstract

Many researchers have shown that ensemble methods such as Boosting and Bagging improve the accuracy of classification. Boosting and Bagging perform well with unstable learning algorithms such as neural networks or decision trees. Pruning decision tree classifiers is intended to make trees simpler and more comprehensible, and to avoid over-fitting. However, it is known that pruning the individual classifiers of an ensemble does not necessarily lead to improved generalisation. Examples of individual tree pruning methods are Minimum Error Pruning (MEP), Error-based Pruning (EBP), Reduced-Error Pruning (REP), Critical Value Pruning (CVP) and Cost-Complexity Pruning (CCP). In this paper, we report the results of applying Boosting and Bagging with these five pruning methods to eleven datasets.
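
As a concrete illustration of the kind of comparison reported here, the sketch below pits Bagging and Boosting ensembles of unpruned trees against ensembles of pruned trees. Of the five pruning methods, scikit-learn exposes only Cost-Complexity Pruning directly (via the ccp_alpha parameter of DecisionTreeClassifier); the dataset, the alpha value and the ensemble size are illustrative placeholders, not the paper's experimental setup.

# Minimal sketch, not the authors' setup: compare Bagging and Boosting built
# on unpruned trees versus cost-complexity-pruned (CCP) trees.
# Assumes scikit-learn >= 1.2 for the `estimator` keyword.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Placeholder dataset standing in for the paper's eleven benchmark datasets.
X, y = load_breast_cancer(return_X_y=True)

base_trees = {
    "unpruned":   DecisionTreeClassifier(random_state=0),
    # ccp_alpha > 0 turns on cost-complexity pruning; 0.01 is an arbitrary
    # illustrative value, not a tuned one.
    "CCP-pruned": DecisionTreeClassifier(ccp_alpha=0.01, random_state=0),
}

for tree_name, tree in base_trees.items():
    for Ensemble in (BaggingClassifier, AdaBoostClassifier):
        clf = Ensemble(estimator=tree, n_estimators=50, random_state=0)
        acc = cross_val_score(clf, X, y, cv=10).mean()  # 10-fold CV accuracy
        print(f"{Ensemble.__name__:>18} + {tree_name:<10}: {acc:.3f}")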


References

  1. E. Bauer and R. Kohavi. An empirical comparison of voting classification algorithms: Bagging, boosting, and variants. Machine Learning, 36(1):105–142, 1999.

  2. Leo Breiman. Bagging predictors. Machine Learning, 24(2):123–140, 1996.

  3. Leo Breiman. Some infinity theory for predictor ensembles. Technical Report 577, Department of Statistics, University of California, Berkeley, 2000.

  4. Leo Breiman, J. H. Friedman, R. A. Olshen, and C. J. Stone. Classification and Regression Trees. Wadsworth International Group, 1984. ftp://ftp.stat.berkeley.edu/pub/users/breiman.

  5. L. A. Breslow and D. W. Aha. Simplifying decision trees: A survey. Knowledge Engineering Review, pages 1–40, 1997.

  6. B. Cestnik and I. Bratko. On estimating probabilities in tree pruning. In Y. Kodratoff, editor, Machine Learning: EWSL-91, European Working Session on Learning, Proceedings, pages 138–150. Springer-Verlag, 1991.

  7. D. Margineantu and T. G. Dietterich. Pruning adaptive boosting. In International Conference on Machine Learning, pages 211–218. Morgan Kaufmann, 1997.

  8. Thomas G. Dietterich. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation, 10:1895–1923, 1998.

  9. T. G. Dietterich. An experimental comparison of three methods for constructing ensembles of decision trees: Bagging, boosting, and randomization. Machine Learning, 40(2):139–158, 2000.

  10. F. Esposito, D. Malerba, and G. Semeraro. A comparative analysis of methods for pruning decision trees. IEEE Transactions on Pattern Analysis and Machine Intelligence, 19(5):476–491, May 1997.

  11. Y. Freund and R. Schapire. Experiments with a new boosting algorithm. In Machine Learning: Proceedings of the Thirteenth International Conference, pages 148–156, 1996.

  12. G. James. Majority Vote Classifiers: Theory and Applications. PhD thesis, Dept. of Statistics, Stanford University, May 1998. http://www-stat.stanford.edu/~gareth/.

  13. J. Mingers. Expert systems - rule induction with statistical data. Journal of the Operational Research Society, 38:39–47, 1987.

  14. T. Niblett and I. Bratko. Learning decision rules in noisy domains. In Expert Systems 86, Cambridge. Cambridge University Press, 1986.

  15. J. Ross Quinlan. Personal communication.

  16. J. Ross Quinlan. Bagging, boosting, and C4.5. In Proceedings of the Thirteenth National Conference on Artificial Intelligence, 1996.

  17. R. Quinlan. Induction of decision trees. Machine Learning, 1:81–106, 1986.

  18. R. Quinlan. C4.5: Programs for Machine Learning. Morgan Kaufmann, San Mateo, California, 1993.

  19. J. R. Quinlan. Simplifying decision trees. International Journal of Man-Machine Studies, 27:221–234, 1987.

  20. http://www.cse.unsw.edu.au/~quinlan.

  21. These datasets can be found at: http://www.ics.uci.edu/~mlearn/MLSummary.html.

  22. P. E. Utgoff and C. E. Brodley. Linear machine decision trees. Technical report, Department of Computer Science, University of Massachusetts, Amherst, 1991.

  23. T. Windeatt and G. Ardeshir. Boosting unpruned and pruned decision trees. In Applied Informatics, Proceedings of the IASTED International Symposia, pages 66–71, 2001.

  24. T. Windeatt and R. Ghaderi. Binary labelling and decision level fusion. Information Fusion, 2(2):103–112, 2001.

Copyright information

© 2001 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Windeatt, T., Ardeshir, G. (2001). An Empirical Comparison of Pruning Methods for Ensemble Classifiers. In: Hoffmann, F., Hand, D.J., Adams, N., Fisher, D., Guimaraes, G. (eds) Advances in Intelligent Data Analysis. IDA 2001. Lecture Notes in Computer Science, vol 2189. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44816-0_21

  • DOI: https://doi.org/10.1007/3-540-44816-0_21

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-42581-6

  • Online ISBN: 978-3-540-44816-7

  • eBook Packages: Springer Book Archive
