Abstract
Many researchers have shown that ensemble methods such as Boosting and Bagging improve classification accuracy. Boosting and Bagging perform well with unstable learning algorithms such as neural networks and decision trees. Pruning decision tree classifiers is intended to make trees simpler and more comprehensible and to avoid over-fitting. However, it is known that pruning the individual classifiers of an ensemble does not necessarily lead to improved generalisation. Examples of individual tree pruning methods are Minimum Error Pruning (MEP), Error-based Pruning (EBP), Reduced-Error Pruning (REP), Critical Value Pruning (CVP) and Cost-Complexity Pruning (CCP). In this paper, we report the results of applying Boosting and Bagging with these five pruning methods to eleven datasets.
Copyright information
© 2001 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Windeatt, T., Ardeshir, G. (2001). An Empirical Comparison of Pruning Methods for Ensemble Classifiers. In: Hoffmann, F., Hand, D.J., Adams, N., Fisher, D., Guimaraes, G. (eds) Advances in Intelligent Data Analysis. IDA 2001. Lecture Notes in Computer Science, vol 2189. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44816-0_21
DOI: https://doi.org/10.1007/3-540-44816-0_21
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42581-6
Online ISBN: 978-3-540-44816-7
eBook Packages: Springer Book Archive