Abstract
Quinlan and Rivest have suggested a decision-tree inference method using the Minimum Description Length idea. We show that there is an error in their derivation of message lengths, which fortunately has no effect on the final inference. We further suggest two improvements to their coding techniques, one removing an inefficiency in the description of non-binary trees, and one improving the coding of leaves. We argue that these improvements are superior to similarly motivated proposals in the original paper.
Empirical tests confirm the good results reported by Quinlan and Rivest, and show our coding proposals to lead to useful improvements in the performance of the method.
References
Barron, A.R., & Cover, T.M. (1991). Minimum complexity density estimation. IEEE Transactions on Information Theory, 37 (4), 1034–1054.
Georgeff, M.P., & Wallace, C.S. (1984). A general criterion for inductive inference. Proceedings of the 6th European Conference on Artificial Intelligence, Tim O'Shea (Ed.). Amsterdam: Elsevier.
Hamming, R.W. (1980). Coding and information theory. Englewood Cliffs, NJ: Prentice Hall.
Quinlan, J.R., & Rivest, R.L. (1989). Inferring decision trees using the minimum description length principle. Information and Computation, 80, 227–248.
Quinlan, J.R. (1986). Induction of decision trees. Machine Learning, 1 (1), 81–106.
Rissanen, J. (1983). A universal prior for integers and estimation by minimum description length. Annals of Statistics, 11, 416–431.
Rissanen, J., & Langdon, G.G. (1981). Universal modeling and coding. IEEE Transactions on Information Theory, IT-27, 12–23.
Shannon, C.E., & Weaver, W. (1949). The mathematical theory of communication. Urbana: University of Illinois Press.
Wallace, C.S., & Boulton, D.M. (1968). An information measure for classification. Computer Journal, 11, 185–195.
Wallace, C.S., & Freeman, P.R. (1987). Estimation and inference by compact coding. Journal of the Royal Statistical Society (B), 49, 240–265.
Cite this article
Wallace, C., Patrick, J. Coding Decision Trees. Machine Learning 11, 7–22 (1993). https://doi.org/10.1023/A:1022646101185
DOI: https://doi.org/10.1023/A:1022646101185