Synonyms
Definition
Discretization is a process that transforms a numeric attribute into a categorical attribute. Under discretization, a new categorical attribute X′ is formed from and replaces an existing numeric attribute X. Each value x′ of X′ corresponds to an interval (a,b] of X. Any original numeric value x of X that belongs to (a,b] is replaced by x′. The boundary values of formed intervals are often called “cut points.”
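The mapping described above can be illustrated with a minimal Python sketch. It assumes the cut points are already known and sorted; the function names `interval_labels` and `discretize`, the sample values, and the cut points are hypothetical and chosen only for illustration. The sketch also assumes that the outermost intervals extend to negative and positive infinity, which the definition above does not specify.

```python
from bisect import bisect_left
from math import inf


def interval_labels(cut_points):
    """Return one label per interval (a, b] induced by sorted cut points."""
    # Assumption: the outermost intervals are unbounded.
    bounds = [-inf, *cut_points, inf]
    return [f"({a}, {b}]" for a, b in zip(bounds, bounds[1:])]


def discretize(values, cut_points):
    """Replace each numeric value x with the label of the interval (a, b] containing it."""
    labels = interval_labels(cut_points)
    # bisect_left counts the cut points strictly less than x, so a value equal
    # to a cut point falls in the interval that the cut point closes.
    return [labels[bisect_left(cut_points, x)] for x in values]


ages = [3, 18, 18.5, 40, 65, 90]
cuts = [18, 65]  # hypothetical cut points, chosen only for illustration
print(discretize(ages, cuts))
```

Here the values 18 and 65 coincide with cut points and are placed in the intervals whose upper bound they form, which matches the half-open (a,b] convention used in the definition.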
Motivation and Background
Many learning systems require categorical data, while many data are numeric. Discretization allows numeric data to be transformed into categorical form suited to processing by such systems. Further, in some cases effective discretization can improve either computational or prediction performance relative to learning from the original numeric data.
Taxonomy
The following taxonomy identifies many key dimensions along which alternative discretization techniques can be distinguished.
Supervised vs. Unsupervised (Dougherty et al. 1995...
Recommended Reading
Bay SD (2000) Multivariate discretization of continuous variables for set mining. In: Proceedings of the sixth ACM SIGKDD international conference on knowledge discovery and data mining, pp 315–319
Dougherty J, Kohavi R, Sahami M (1995) Supervised and unsupervised discretization of continuous features. In: Proceedings of the twelfth international conference on machine learning, pp 194–202
Fayyad UM, Irani KB (1993) Multi-interval discretization of continuous-valued attributes for classification learning. In: Proceedings of the thirteenth international joint conference on artificial intelligence, pp 1022–1027
Hsu CN, Huang HJ, Wong TT (2000) Why discretization works for naïve Bayesian classifiers. In: Proceedings of the seventeenth international conference on machine learning, pp 399–406
Kerber R (1992) ChiMerge: discretization for numeric attributes. In: AAAI national conference on artificial intelligence, pp 123–128
Kononenko I (1992) Naive Bayesian classifier and continuous attributes. Informatica 16(1):1–8
Quinlan JR (1993) C4.5: Programs for machine learning. Morgan Kaufmann Publishers, San Francisco
Yang Y, Webb G (2001) Proportional k-interval discretization for naive-Bayes classifiers. In: Proceedings of the twelfth European conference on machine learning, pp 564–575
Yang Y, Webb G (2002) Non-disjoint discretization for naive-Bayes classifiers. In: Proceedings of the nineteenth international conference on machine learning, pp 666–673
© 2017 Springer Science+Business Media New York
Yang, Y. (2017). Discretization. In: Sammut, C., Webb, G.I. (eds) Encyclopedia of Machine Learning and Data Mining. Springer, Boston, MA. https://doi.org/10.1007/978-1-4899-7687-1_221
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4899-7685-7
Online ISBN: 978-1-4899-7687-1