Abstract
This paper presents an empirical investigation of eight well-known simplification methods for decision trees induced from training data. Twelve data sets are used to compare both the accuracy and the complexity of the simplified trees. Optimally pruned trees are computed in order to define precisely each method's bias towards overpruning or underpruning. The results indicate that the simplification strategies that exploit an independent pruning set do not perform better than the others. Furthermore, some methods show a marked bias towards either overpruning or underpruning.
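As background for the comparison, the sketch below illustrates one of the best-known methods in this family, reduced error pruning, in Python. Applied with a held-out pruning set it performs the usual bottom-up simplification; applied with the test set itself, the same procedure returns the smallest subtree with minimal error on that set, i.e. an optimally pruned tree in the sense used above. The `Node` class and helper names are illustrative assumptions, not the implementation evaluated in the paper.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

@dataclass
class Node:
    """A decision-tree node; a leaf has feature=None."""
    label: str                                 # majority class at this node
    feature: Optional[str] = None              # split attribute, None for leaves
    children: Dict[str, "Node"] = field(default_factory=dict)

def classify(node: Node, x: dict) -> str:
    # Descend until a leaf or an unseen attribute value, then answer
    # with the majority class stored at the current node.
    while node.feature is not None and x.get(node.feature) in node.children:
        node = node.children[x.get(node.feature)]
    return node.label

def errors(node: Node, data) -> int:
    return sum(1 for x, y in data if classify(node, x) != y)

def reduced_error_prune(node: Node, prune_set) -> Node:
    """Bottom-up pruning: replace a subtree by a leaf whenever the leaf
    makes no more mistakes on the pruning set than the subtree does."""
    if node.feature is None:
        return node
    # Prune the children first, routing each example to its branch.
    for value, child in node.children.items():
        subset = [(x, y) for x, y in prune_set if x.get(node.feature) == value]
        node.children[value] = reduced_error_prune(child, subset)
    leaf = Node(label=node.label)
    if errors(leaf, prune_set) <= errors(node, prune_set):
        return leaf    # the collapsed leaf is at least as accurate: prune
    return node

# Toy usage: a single split that does not help on the pruning set.
tree = Node(label="yes", feature="outlook", children={
    "sunny": Node(label="no"),
    "rain":  Node(label="yes"),
})
prune_set = [({"outlook": "sunny"}, "yes"), ({"outlook": "rain"}, "yes")]
pruned = reduced_error_prune(tree, prune_set)
print(pruned.feature)  # None: the subtree was collapsed to a leaf
```

The `<=` test breaks ties in favour of the smaller tree; replacing it with `<` would bias the procedure towards underpruning, which is exactly the kind of behaviour the comparison above sets out to measure.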
© 1996 Springer-Verlag New York, Inc.
Cite this chapter
Malerba, D., Esposito, F., Semeraro, G. (1996). A Further Comparison of Simplification Methods for Decision-Tree Induction. In: Fisher, D., Lenz, H.-J. (eds) Learning from Data. Lecture Notes in Statistics, vol 112. Springer, New York, NY. https://doi.org/10.1007/978-1-4612-2404-4_35