Decision trees for probabilistic data

  • Conference paper
Data Warehousing and Knowledge Discovery (DaWaK 2000)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1874)

Abstract

We propose an algorithm to build decision trees when the observed data are probability distributions. This is of interest when one deals with massive databases or with probabilistic models. We illustrate our method with a dataset describing districts of Great Britain. Our decision tree yields rules that explain the unemployment rate.

In our setting, the decision tree is built by replacing the test X > α, which is used to split nodes in the usual case of real-valued data, with the test P(X > α) < β, where α and β are determined by an algorithm based on probabilistic split evaluation criteria.
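The split test P(X > α) < β can be illustrated with a minimal Python sketch. This is not the authors' algorithm. It assumes that each observation is given as a list of sampled values, so that P(X > α) is estimated empirically, and it uses weighted Gini impurity and an exhaustive grid search as stand-ins for the probabilistic split evaluation criteria, which the abstract does not specify. All function names here are hypothetical.

```python
from itertools import product


def tail_prob(samples, alpha):
    """Empirical estimate of P(X > alpha) for one observation (a list of draws)."""
    return sum(1 for x in samples if x > alpha) / len(samples)


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for an empty or pure list)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def split_impurity(dists, labels, alpha, beta):
    """Weighted Gini impurity after splitting on the test P(X > alpha) < beta."""
    left, right = [], []
    for samples, label in zip(dists, labels):
        # Observations satisfying the test go left, the others go right.
        if tail_prob(samples, alpha) < beta:
            left.append(label)
        else:
            right.append(label)
    n = len(labels)
    return (len(left) * gini(left) + len(right) * gini(right)) / n


def best_split(dists, labels, alphas, betas):
    """Grid search over candidate (alpha, beta) pairs; assumed search strategy."""
    return min(
        product(alphas, betas),
        key=lambda ab: split_impurity(dists, labels, ab[0], ab[1]),
    )
```

Calling `best_split(dists, labels, alphas, betas)` returns the candidate pair with the lowest weighted impurity. A full tree would apply the same search recursively to each child node.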

Copyright information

© 2000 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Aboa, JP., Emilion, R. (2000). Decision trees for probabilistic data. In: Kambayashi, Y., Mohania, M., Tjoa, A.M. (eds) Data Warehousing and Knowledge Discovery. DaWaK 2000. Lecture Notes in Computer Science, vol 1874. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44466-1_39

  • DOI: https://doi.org/10.1007/3-540-44466-1_39

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-67980-6

  • Online ISBN: 978-3-540-44466-4

  • eBook Packages: Springer Book Archive
