Decision trees for probabilistic data

Aboa, Jean-Pascal; Emilion, Richard

doi:10.1007/3-540-44466-1_39

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1874)

Included in the following conference series:

International Conference on Data Warehousing and Knowledge Discovery

Abstract

We propose an algorithm for building decision trees when the observed data are probability distributions. This is of interest when one deals with massive databases or with probabilistic models. We illustrate our method on a dataset describing districts of Great Britain. The resulting decision tree yields rules that explain the unemployment rate.

In our setting, the decision tree is built by replacing the test X > α, which splits the nodes in the usual case of real-valued data, with the test P(X > α) < β, where α and β are determined by an algorithm based on probabilistic split evaluation criteria.
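
As a rough illustration of how a test of the form P(X > α) < β might be evaluated, the sketch below makes several assumptions that are not taken from the paper: each observation is available as a finite sample from its distribution, P(X > α) is estimated by an empirical frequency, candidate pairs (α, β) come from a small user-supplied grid, and weighted Gini impurity stands in for the split evaluation criteria, which the abstract does not specify. All function names and the toy data are hypothetical.

```python
from collections import Counter
from itertools import product


def tail_prob(sample, alpha):
    """Empirical estimate of P(X > alpha) for one observation given as a sample."""
    return sum(1 for x in sample if x > alpha) / len(sample)


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for an empty or pure list)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def split_score(samples, labels, alpha, beta):
    """Weighted Gini impurity after splitting on the test P(X > alpha) < beta."""
    goes_left = [tail_prob(s, alpha) < beta for s in samples]
    left = [y for y, flag in zip(labels, goes_left) if flag]
    right = [y for y, flag in zip(labels, goes_left) if not flag]
    return (len(left) * gini(left) + len(right) * gini(right)) / len(labels)


def best_split(samples, labels, alphas, betas):
    """Exhaustive search over the candidate grid; returns (score, alpha, beta).

    Ties on the score are broken by the smallest alpha, then the smallest beta.
    """
    return min(
        (split_score(samples, labels, a, b), a, b)
        for a, b in product(alphas, betas)
    )


# Toy data (assumed): four observations, each a small sample of one variable.
samples = [
    [1, 2, 3, 4],
    [2, 3, 4, 5],
    [6, 7, 8, 9],
    [5, 7, 9, 11],
]
labels = ["low", "low", "high", "high"]

score, alpha, beta = best_split(
    samples, labels, alphas=[2, 4, 6], betas=[0.25, 0.5, 0.75]
)
```

On this toy data the search selects α = 4 and β = 0.5 with a weighted impurity of 0.0, meaning the test P(X > 4) < 0.5 separates the two classes exactly. A full tree would apply the same search recursively to each child node.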

Author information

Authors and Affiliations

Ceremade, Univ. Paris IX-Dauphine, 75775, Paris cedex 16
Jean-Pascal Aboa
UFR SEGMI modalx, Univ. Paris X, France
Richard Emilion

Editor information

Editors and Affiliations

Graduate School of Informatics, Kyoto University, Yoshida-Honmachi, Sakyo-ku, Kyoto, 606-8501, Japan
Yahiko Kambayashi
Computer Science Department, Western Michigan University, Kalamazoo, MI, 49008, USA
Mukesh Mohania
Vienna University of Technology, IFS, Favoritenstr. 9-11/188, 1040, Vienna, Austria
A. Min Tjoa

About this paper

Cite this paper

Aboa, JP., Emilion, R. (2000). Decision trees for probabilistic data. In: Kambayashi, Y., Mohania, M., Tjoa, A.M. (eds) Data Warehousing and Knowledge Discovery. DaWaK 2000. Lecture Notes in Computer Science, vol 1874. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44466-1_39

DOI: https://doi.org/10.1007/3-540-44466-1_39
Published: 06 July 2001
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-67980-6
Online ISBN: 978-3-540-44466-4
eBook Packages: Springer Book Archive
