Abstract
This paper presents a study of one particular problem of decision tree induction, namely (post-)pruning, with the aim of finding a common framework for the plethora of pruning methods that have appeared in the literature. Given a tree Tmax to prune, a state space is defined as the set of all subtrees of Tmax, to which a single operator, called the any-depth branch pruning operator, can be applied in several ways in order to move from one state to another. By introducing an evaluation function f defined on the set of subtrees, the problem of tree pruning can be cast as an optimization problem, and each post-pruning method can be classified according to both its search strategy and the kind of information exploited by f. Indeed, while some methods use only the training set to evaluate the accuracy of a decision tree, others exploit an additional pruning set, which allows them to obtain less biased estimates of the predictive accuracy of a pruned tree. The introduction of the state space shows that the post-pruning methods considered use very simple search strategies. Finally, some empirical results allow theoretical observations on the strengths and weaknesses of pruning methods to be better understood.
Copyright information
© 1993 Springer-Verlag Berlin Heidelberg
Cite this paper
Esposito, F., Malerba, D., Semeraro, G. (1993). Decision tree pruning as a search in the state space. In: Brazdil, P.B. (eds) Machine Learning: ECML-93. ECML 1993. Lecture Notes in Computer Science, vol 667. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-56602-3_135
Print ISBN: 978-3-540-56602-1
Online ISBN: 978-3-540-47597-2