Abstract
Experimental evidence shows that many of the attribute selection criteria used to induce decision trees perform comparably. We set up a theoretical framework that explains this empirical law and that furthermore provides an infinite family of criteria (the C.M. criteria) containing the most commonly used ones. We also define C.M. pruning, which is suited to uncertain domains. In uncertain domains such as medicine, some sub-trees that do not lower the error rate can still be relevant: they may point out populations of specific interest or give a legible representation of a large data file. C.M. pruning allows such sub-trees to be kept even when keeping them does not improve classification efficiency. We thus obtain a consistent framework for both building and pruning decision trees in uncertain domains. We give typical examples from medicine, highlighting the routine use of induction in this domain even when, for many cases, the targeted diagnosis cannot be reached from the findings under investigation.
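The abstract states that the C.M. family contains the most commonly used attribute selection criteria; two standard members of that common set are Shannon entropy (used by ID3/C4.5) and the Gini index (used by CART). As an illustrative sketch only (not the paper's own code, and not the general C.M. definition), the following shows how either criterion scores a candidate split by the impurity decrease it achieves:

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a class-label sequence (criterion of ID3/C4.5)."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def gini(labels):
    """Gini impurity of a class-label sequence (criterion of CART)."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def split_gain(labels, groups, impurity):
    """Impurity decrease obtained by splitting `labels` into `groups`."""
    n = len(labels)
    return impurity(labels) - sum(len(g) / n * impurity(g) for g in groups)

# A binary attribute splitting 8 cases of two classes into two groups:
parent = ["a"] * 4 + ["b"] * 4
left, right = ["a", "a", "a", "b"], ["a", "b", "b", "b"]
print(round(split_gain(parent, [left, right], entropy), 3))  # entropy gain
print(round(split_gain(parent, [left, right], gini), 3))     # Gini gain
```

Both criteria rank this split as an improvement over the unsplit node; the paper's observation is that, across many data sets, such criteria tend to rank candidate attributes similarly.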
© 1997 Springer-Verlag Berlin Heidelberg
Crémilleux, B., Robert, C. (1997). A theoretical framework for decision trees in uncertain domains: Application to medical data sets. In: Keravnou, E., Garbay, C., Baud, R., Wyatt, J. (eds) Artificial Intelligence in Medicine. AIME 1997. Lecture Notes in Computer Science, vol 1211. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0029447
Print ISBN: 978-3-540-62709-8
Online ISBN: 978-3-540-68448-0