A Metric Approach to Building Decision Trees Based on Goodman-Kruskal Association Index

  • Conference paper
Advances in Knowledge Discovery and Data Mining (PAKDD 2004)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3056)

Abstract

We introduce a numerical measure on sets of partitions of finite sets that is linked to the Goodman-Kruskal association index commonly used in statistics. This measure allows us to define a metric on such partitions, which we use for constructing decision trees. Experimental results suggest that replacing the usual splitting criterion of C4.5 with a metric criterion based on the Goodman-Kruskal coefficient yields, in most cases, smaller decision trees without sacrificing accuracy.
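The abstract does not state the measure itself, so the following is only a rough sketch of the general idea, under stated assumptions. It treats a partition as a list of block labels (one per item), computes the Goodman-Kruskal style prediction error of one partition from another (the fraction of items misclassified when each block of the first partition predicts its most frequent block of the second), and sums the two directions to obtain a symmetric quantity. The function names `gk_error` and `gk_distance`, and the choice of symmetrization, are illustrative assumptions and may differ from the paper's definitions.

```python
from collections import Counter, defaultdict
from fractions import Fraction


def gk_error(x_labels, y_labels):
    """Prediction error of partition y from partition x (hypothetical helper).

    Each item i lies in block x_labels[i] of one partition and block
    y_labels[i] of the other. Within every x-block we guess the most
    frequent y-block; the result is the fraction of items guessed wrong.
    """
    if len(x_labels) != len(y_labels):
        raise ValueError("both partitions must label the same items")
    n = len(x_labels)
    if n == 0:
        raise ValueError("partitions of an empty set are not supported")

    # Contingency counts: for each x-block, how often each y-block occurs.
    counts = defaultdict(Counter)
    for x, y in zip(x_labels, y_labels):
        counts[x][y] += 1

    correct = sum(max(c.values()) for c in counts.values())
    return 1 - Fraction(correct, n)


def gk_distance(p_labels, q_labels):
    """Symmetrized error (assumed form, not confirmed by the abstract)."""
    return gk_error(p_labels, q_labels) + gk_error(q_labels, p_labels)


if __name__ == "__main__":
    # Toy data: one attribute partition and one class partition of 6 items.
    attribute = ["a", "a", "b", "b", "c", "c"]
    target = [0, 0, 1, 1, 1, 0]
    print(gk_error(attribute, target))
    print(gk_distance(attribute, target))
```

In a tree-growing loop, a criterion of this kind would be evaluated for each candidate attribute against the class partition, and the attribute with the smallest value would be chosen for the split; that selection rule is likewise an assumption about how a metric criterion would be used.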

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Simovici, D.A., Jaroszewicz, S. (2004). A Metric Approach to Building Decision Trees Based on Goodman-Kruskal Association Index. In: Dai, H., Srikant, R., Zhang, C. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2004. Lecture Notes in Computer Science, vol 3056. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24775-3_23

  • DOI: https://doi.org/10.1007/978-3-540-24775-3_23

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-22064-0

  • Online ISBN: 978-3-540-24775-3

  • eBook Packages: Springer Book Archive
