A theoretical framework for decision trees in uncertain domains: Application to medical data sets

  • Decision-Support Theories
  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 1211)

Abstract

Experimental evidence shows that many attribute selection criteria used in the induction of decision trees perform comparably. We set up a theoretical framework that explains this empirical law. The framework furthermore provides an infinite set of criteria (the C.M. criteria) which contains the most commonly used ones. We also define C.M. pruning, which is suited to uncertain domains. In such domains, medicine for example, some sub-trees that do not lessen the error rate can still be relevant, either to point out populations of specific interest or to give a representation of a large data file. C.M. pruning makes it possible to keep such sub-trees even when they do not improve classification accuracy. We thus obtain a consistent framework for both building and pruning decision trees in uncertain domains. We give typical examples from medicine, highlighting the routine use of induction in this domain even though, for many cases, the targeted diagnosis cannot be reached from the findings under investigation.
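The C.M. family itself is defined in the paper and is not reproduced on this page. As a minimal sketch of the kind of attribute selection criteria the abstract refers to, the Python snippet below computes two of the most commonly used ones, information gain (based on Shannon entropy) and Gini gain, for a candidate split; the function names and toy data are illustrative assumptions, not the authors' code.

    import math
    from collections import Counter

    def entropy(labels):
        """Shannon entropy of a sample of class labels."""
        n = len(labels)
        return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

    def gini(labels):
        """Gini impurity of a sample of class labels."""
        n = len(labels)
        return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

    def split_gain(parent, children, impurity):
        """Impurity decrease of a split: i(parent) minus the weighted mean impurity of the children."""
        n = len(parent)
        weighted = sum(len(child) / n * impurity(child) for child in children)
        return impurity(parent) - weighted

    # Toy example: a binary split of a node holding 6 "sick" and 4 "healthy" cases.
    parent = ["sick"] * 6 + ["healthy"] * 4
    children = [["sick"] * 5 + ["healthy"], ["sick"] + ["healthy"] * 3]
    print(split_gain(parent, children, entropy))  # information gain
    print(split_gain(parent, children, gini))     # Gini gain

Both entropy and Gini impurity are concave impurity measures, which is the kind of shared structural property behind the comparable behaviour of such criteria that the paper's framework explains.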

Editor information

Elpida Keravnou, Catherine Garbay, Robert Baud, Jeremy Wyatt


Copyright information

© 1997 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Crémilleux, B., Robert, C. (1997). A theoretical framework for decision trees in uncertain domains: Application to medical data sets. In: Keravnou, E., Garbay, C., Baud, R., Wyatt, J. (eds) Artificial Intelligence in Medicine. AIME 1997. Lecture Notes in Computer Science, vol 1211. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0029447


  • DOI: https://doi.org/10.1007/BFb0029447


  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-62709-8

  • Online ISBN: 978-3-540-68448-0

  • eBook Packages: Springer Book Archive
