Abstract
Discretization of continuous attributes is an important task for certain types of machine learning algorithms. Bayesian approaches, for instance, require assumptions about data distributions. Decision trees, on the other hand, require sorting operations to handle continuous attributes, which greatly increase learning time. This paper presents a new discretization method whose main characteristic is that it takes interdependencies between attributes into account. Detecting interdependencies can be seen as discovering redundant attributes, which means that our method performs attribute selection as a side effect of the discretization. Empirical evaluation on five benchmark datasets from the UCI repository, using C4.5 and a naive Bayes classifier, shows a consistent reduction in the number of features without loss of generalization accuracy.
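The abstract does not detail the authors' dynamic, interdependency-aware procedure. As background only, the following is a minimal sketch of the kind of supervised, entropy-based cut-point selection that class-driven discretization methods (in the spirit of Fayyad and Irani) build on; the function names are illustrative, not the paper's implementation.

```python
import math

def entropy(labels):
    """Shannon entropy of a sequence of class labels."""
    n = len(labels)
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def best_split(values, labels):
    """Return the cut point on one continuous attribute that minimises
    the weighted class entropy of the two resulting intervals."""
    pairs = sorted(zip(values, labels))
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    n = len(ys)
    best_cut, best_e = None, float("inf")
    for i in range(1, n):
        if xs[i] == xs[i - 1]:
            continue  # no cut between equal attribute values
        e = (i / n) * entropy(ys[:i]) + ((n - i) / n) * entropy(ys[i:])
        if e < best_e:
            best_cut, best_e = (xs[i - 1] + xs[i]) / 2, e
    return best_cut, best_e
```

Applying the split recursively to each interval (with a stopping criterion such as MDL) yields a multi-interval discretization; the paper's contribution is to drive such choices dynamically, using information shared across attributes rather than treating each attribute in isolation.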
Copyright information
© 1998 Springer-Verlag Berlin Heidelberg
Cite this paper
Gama, J., Torgo, L., Soares, C. (1998). Dynamic Discretization of Continuous Attributes. In: Coelho, H. (eds) Progress in Artificial Intelligence — IBERAMIA 98. IBERAMIA 1998. Lecture Notes in Computer Science, vol 1484. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-49795-1_14
Print ISBN: 978-3-540-64992-2
Online ISBN: 978-3-540-49795-0