Discretization

  • Reference work entry
Encyclopedia of Machine Learning and Data Mining

Synonyms

Binning

Definition

Discretization is the process of transforming a numeric attribute into a categorical attribute. A new categorical attribute X′ is formed from an existing numeric attribute X and replaces it. Each value x′ of X′ corresponds to an interval (a,b] of X, and any original numeric value x of X that falls in (a,b] is replaced by x′. The boundary values of the resulting intervals are often called “cut points.”
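
The mapping from numeric values to intervals can be illustrated with a minimal Python sketch. It assumes the cut points are already given (how they are chosen is not covered here), and the function names `interval_labels` and `discretize` and the example values are hypothetical, introduced only for illustration. Values below the first cut point or above the last are assigned to open-ended intervals, which is one possible convention rather than part of the definition.

```python
from bisect import bisect_left


def interval_labels(cut_points):
    """Build one label per interval (a, b] induced by the sorted cut points."""
    cuts = sorted(cut_points)
    edges = [float("-inf")] + cuts + [float("inf")]
    labels = []
    for a, b in zip(edges, edges[1:]):
        # The last interval has no finite upper bound, so it is left open.
        closing = ")" if b == float("inf") else "]"
        labels.append(f"({a}, {b}{closing}")
    return labels


def discretize(values, cut_points):
    """Replace each numeric value x by the label x' of its interval (a, b]."""
    cuts = sorted(cut_points)
    labels = interval_labels(cuts)
    # bisect_left counts the cut points strictly below x, so a value equal
    # to a cut point b is placed in the interval that is closed at b.
    return [labels[bisect_left(cuts, x)] for x in values]


if __name__ == "__main__":
    # Hypothetical attribute values with two assumed cut points.
    print(discretize([10, 18, 18.5, 65, 70], [18, 65]))
```

With the assumed cut points 18 and 65, the values 18 and 65 fall into the intervals closed on the right at those cut points, matching the (a,b] convention in the definition.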

Motivation and Background

Many learning systems require categorical data, whereas much data is numeric. Discretization transforms numeric data into a categorical form that such systems can process. In some cases, effective discretization can also improve either computational or predictive performance relative to learning from the original numeric data.

Taxonomy

The following taxonomy identifies many key dimensions along which alternative discretization techniques can be distinguished.

Supervised vs. Unsupervised (Dougherty et al. 1995...

Recommended Reading

  • Bay SD (2000) Multivariate discretization of continuous variables for set mining. In: Proceedings of the sixth ACM SIGKDD international conference on knowledge discovery and data mining, pp 315–319

  • Dougherty J, Kohavi R, Sahami M (1995) Supervised and unsupervised discretization of continuous features. In: Proceedings of the twelfth international conference on machine learning, pp 194–202

  • Fayyad UM, Irani KB (1993) Multi-interval discretization of continuous-valued attributes for classification learning. In: Proceedings of the thirteenth international joint conference on artificial intelligence, pp 1022–1027

  • Hsu CN, Huang HJ, Wong TT (2000) Why discretization works for naïve Bayesian classifiers. In: Proceedings of the seventeenth international conference on machine learning, pp 399–406

  • Kerber R (1992) ChiMerge: discretization for numeric attributes. In: AAAI national conference on artificial intelligence, pp 123–128

  • Kononenko I (1992) Naive Bayesian classifier and continuous attributes. Informatica 16(1):1–8

  • Quinlan JR (1993) C4.5: Programs for machine learning. Morgan Kaufmann Publishers, San Francisco

  • Yang Y, Webb G (2001) Proportional k-interval discretization for naive-Bayes classifiers. In: Proceedings of the twelfth European conference on machine learning, pp 564–575

  • Yang Y, Webb G (2002) Non-disjoint discretization for naive-Bayes classifiers. In: Proceedings of the nineteenth international conference on machine learning, pp 666–673

Copyright information

© 2017 Springer Science+Business Media New York

Cite this entry

Yang, Y. (2017). Discretization. In: Sammut, C., Webb, G.I. (eds) Encyclopedia of Machine Learning and Data Mining. Springer, Boston, MA. https://doi.org/10.1007/978-1-4899-7687-1_221
