Families of splitting criteria for classification trees

Shih, Y.-S.

doi:10.1023/A:1008920224518

Published: November 1999

Volume 9, pages 309–315, (1999)

Statistics and Computing

Abstract

Several splitting criteria for binary classification trees are shown to be expressible as weighted sums of two values of divergence measures. This weighted-sum approach is then used to form two families of splitting criteria. One family contains the chi-squared and entropy criteria; the other contains the mean posterior improvement criterion. Members of both families are shown to have the property of exclusive preference. Furthermore, the optimal splits based on the proposed families are studied. We find that the best splits depend on the parameters in the families. The results reveal interesting differences among various criteria. Examples are given to demonstrate the usefulness of both families.
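
The weighted-sum idea can be illustrated with a small sketch. The abstract does not give the families' definitions, so the code below assumes, purely for illustration, the power-divergence family of Cressie and Read, which is known to contain the Pearson chi-squared divergence (parameter 1) and twice the Kullback-Leibler divergence (parameter tending to 0). The names `power_divergence`, `split_criterion`, and `lam` are hypothetical and are not taken from the paper. The criterion is the sum, over the two child nodes, of the proportion of cases sent to that child times the divergence of the child's class distribution from the parent's.

```python
import math


def power_divergence(p, q, lam):
    """Cressie-Read power divergence of distribution p from distribution q.

    Assumes q[i] > 0 wherever p[i] > 0. This family is an illustrative
    assumption, not necessarily the parameterization used in the paper.
    """
    if abs(lam) < 1e-12:
        # Limit lam -> 0: twice the Kullback-Leibler divergence.
        return 2.0 * sum(
            pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0
        )
    return (2.0 / (lam * (lam + 1.0))) * sum(
        pi * ((pi / qi) ** lam - 1.0) for pi, qi in zip(p, q) if pi > 0
    )


def split_criterion(left_counts, right_counts, lam):
    """Weighted sum of two divergences, one per child node of a binary split.

    left_counts and right_counts hold per-class case counts in each child;
    both children must be non-empty.
    """
    n_left = sum(left_counts)
    n_right = sum(right_counts)
    n = n_left + n_right
    # Class distributions of the parent node and of each child node.
    parent = [(l + r) / n for l, r in zip(left_counts, right_counts)]
    left = [l / n_left for l in left_counts]
    right = [r / n_right for r in right_counts]
    return (n_left / n) * power_divergence(left, parent, lam) + (
        n_right / n
    ) * power_divergence(right, parent, lam)


if __name__ == "__main__":
    # Hypothetical two-class split: 40 cases go left, 60 go right.
    left_counts, right_counts = [30, 10], [10, 50]
    for lam in (0.0, 0.5, 1.0):
        print(lam, split_criterion(left_counts, right_counts, lam))
```

Under this assumption, `lam = 1.0` reproduces the Pearson chi-squared statistic of the split's contingency table divided by the number of cases, and `lam = 0.0` reproduces twice the entropy reduction (information gain) measured in nats. Varying `lam` between these values changes how candidate splits are ranked, which is the kind of parameter dependence the abstract describes.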

About this article

Shih, Y.-S. Families of splitting criteria for classification trees. Statistics and Computing 9, 309–315 (1999). https://doi.org/10.1023/A:1008920224518
