An Empirical Comparison of Pruning Methods for Ensemble Classifiers

  • Conference paper

In: Advances in Intelligent Data Analysis (IDA 2001)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2189)

Abstract

Many researchers have shown that ensemble methods such as Boosting and Bagging improve the accuracy of classification. Boosting and Bagging perform well with unstable learning algorithms such as neural networks or decision trees. Pruning decision tree classifiers is intended to make trees simpler and more comprehensible, and to avoid over-fitting. However, it is known that pruning the individual classifiers of an ensemble does not necessarily lead to improved generalisation. Examples of individual tree pruning methods are Minimum Error Pruning (MEP), Error-based Pruning (EBP), Reduced-Error Pruning (REP), Critical Value Pruning (CVP) and Cost-Complexity Pruning (CCP). In this paper, we report the results of applying Boosting and Bagging with these five pruning methods to eleven datasets.
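
As a concrete illustration of the kind of comparison reported here, the sketch below pits Bagging and Boosting ensembles of unpruned trees against ensembles of pruned trees. Of the five pruning methods, scikit-learn exposes only Cost-Complexity Pruning directly (via the ccp_alpha parameter of DecisionTreeClassifier); the dataset, the alpha value and the ensemble size are illustrative placeholders, not the paper's experimental setup.

# Minimal sketch, not the authors' setup: compare Bagging and Boosting built
# on unpruned trees versus cost-complexity-pruned (CCP) trees.
# Assumes scikit-learn >= 1.2 for the `estimator` keyword.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Placeholder dataset standing in for the paper's eleven benchmark datasets.
X, y = load_breast_cancer(return_X_y=True)

base_trees = {
    "unpruned":   DecisionTreeClassifier(random_state=0),
    # ccp_alpha > 0 turns on cost-complexity pruning; 0.01 is an arbitrary
    # illustrative value, not a tuned one.
    "CCP-pruned": DecisionTreeClassifier(ccp_alpha=0.01, random_state=0),
}

for tree_name, tree in base_trees.items():
    for Ensemble in (BaggingClassifier, AdaBoostClassifier):
        clf = Ensemble(estimator=tree, n_estimators=50, random_state=0)
        acc = cross_val_score(clf, X, y, cv=10).mean()  # 10-fold CV accuracy
        print(f"{Ensemble.__name__:>18} + {tree_name:<10}: {acc:.3f}")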


References

  1. E. Bauer and R. Kohavi. An empirical comparison of voting classification algorithms: Bagging, boosting, and variants. Machine Learning, 36(1):105–142, 1999.

  2. Leo Breiman. Bagging predictors. Machine Learning, 24(2):123–140, 1996.

  3. Leo Breiman. Some infinity theory for predictor ensembles. Technical Report 577, Department of Statistics, University of California, Berkeley, 2000.

  4. Leo Breiman, J. H. Friedman, R. A. Olshen, and C. J. Stone. Classification and Regression Trees. Wadsworth International Group, 1984. ftp://ftp.stat.berkeley.edu/pub/users/breiman.

  5. L. A. Breslow and D. W. Aha. Simplifying decision trees: A survey. Knowledge Engineering Review, pages 1–40, 1997.

  6. B. Cestnik and I. Bratko. On estimating probabilities in tree pruning. In Y. Kodratoff, editor, Machine Learning: EWSL-91, European Working Session on Learning, Proceedings, pages 138–150. Springer-Verlag, 1991.

  7. D. Margineantu and T. G. Dietterich. Pruning adaptive boosting. In International Conference on Machine Learning, pages 211–218. Morgan Kaufmann, 1997.

  8. Thomas G. Dietterich. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation, 10:1895–1923, 1998.

  9. T. G. Dietterich. An experimental comparison of three methods for constructing ensembles of decision trees: Bagging, boosting, and randomization. Machine Learning, 40(2):139–158, 2000.

  10. F. Esposito, D. Malerba, and G. Semeraro. A comparative analysis of methods for pruning decision trees. IEEE Transactions on Pattern Analysis and Machine Intelligence, 19(5):476–491, May 1997.

  11. Y. Freund and R. Schapire. Experiments with a new boosting algorithm. In Machine Learning: Proceedings of the Thirteenth International Conference, pages 148–156, 1996.

  12. G. James. Majority Vote Classifiers: Theory and Applications. PhD thesis, Dept. of Statistics, Stanford University, May 1998. http://www-stat.stanford.edu/~gareth/.

  13. J. Mingers. Expert systems - rule induction with statistical data. Journal of the Operational Research Society, 38:39–47, 1987.

  14. T. Niblett and I. Bratko. Learning decision rules in noisy domains. In Expert Systems 86, Cambridge. Cambridge University Press, 1986.

  15. J. Ross Quinlan. Personal communication.

  16. J. Ross Quinlan. Bagging, boosting, and C4.5. In Proceedings of the Thirteenth National Conference on Artificial Intelligence, 1996.

  17. R. Quinlan. Induction of decision trees. Machine Learning, 1:81–106, 1986.

  18. R. Quinlan. C4.5: Programs for Machine Learning. Morgan Kaufmann, San Mateo, California, 1993.

  19. J. R. Quinlan. Simplifying decision trees. International Journal of Man-Machine Studies, 27:221–234, 1987.

  20. http://www.cse.unsw.edu.au/~quinlan.

  21. These datasets can be found at: http://www.ics.uci.edu/~mlearn/MLSummary.html.

  22. P. E. Utgoff and C. E. Brodley. Linear machine decision trees. Technical report, Department of Computer Science, University of Massachusetts, Amherst, 1991.

  23. T. Windeatt and G. Ardeshir. Boosting unpruned and pruned decision trees. In Applied Informatics, Proceedings of the IASTED International Symposia, pages 66–71, 2001.

  24. T. Windeatt and R. Ghaderi. Binary labelling and decision level fusion. Information Fusion, 2(2):103–112, 2001.

Copyright information

© 2001 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Windeatt, T., Ardeshir, G. (2001). An Empirical Comparison of Pruning Methods for Ensemble Classifiers. In: Hoffmann, F., Hand, D.J., Adams, N., Fisher, D., Guimaraes, G. (eds) Advances in Intelligent Data Analysis. IDA 2001. Lecture Notes in Computer Science, vol 2189. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44816-0_21

  • DOI: https://doi.org/10.1007/3-540-44816-0_21

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-42581-6

  • Online ISBN: 978-3-540-44816-7

  • eBook Packages: Springer Book Archive
