Abstract
We show that finding an optimal discretization of a decision table with two real-valued attributes and binary decisions is computationally hard. To do so, we recast the task as a family of problems about partitioning points in the plane into regions, subject to certain minimality restrictions, and prove these problems to be NP-hard. We also propose a new method for finding optimal discretizations.
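To make the setting concrete, the following minimal sketch shows one plausible reading of the problem. It is an illustration under stated assumptions, not the method proposed in the paper. The abstract does not fix the details, so the sketch assumes that cuts are axis-parallel lines, that a discretization is consistent when no grid cell contains objects with different decisions, and that "optimal" means the fewest cuts in total. The names `is_consistent`, `candidate_cuts`, and `fewest_cuts` are hypothetical.

```python
from bisect import bisect_right
from itertools import combinations


def is_consistent(rows, x_cuts, y_cuts):
    """True if no grid cell contains objects with different decisions.

    rows is a list of (x, y, decision) triples. Axis-parallel cuts are an
    assumption of this sketch.
    """
    x_cuts, y_cuts = sorted(x_cuts), sorted(y_cuts)
    seen = {}
    for x, y, decision in rows:
        # Index of the grid cell (column, row) that holds this point.
        cell = (bisect_right(x_cuts, x), bisect_right(y_cuts, y))
        if seen.setdefault(cell, decision) != decision:
            return False
    return True


def candidate_cuts(values):
    """Midpoints between consecutive distinct attribute values."""
    vs = sorted(set(values))
    return [(a + b) / 2 for a, b in zip(vs, vs[1:])]


def fewest_cuts(rows):
    """Exhaustive search for a consistent cut set of minimum total size.

    Assumed objective: minimise the total number of cuts. The search tries
    every subset of candidate cuts, so its running time is exponential.
    Returns None when two objects share both attribute values but have
    different decisions, since no set of cuts can separate them.
    """
    pool = [("x", c) for c in candidate_cuts(r[0] for r in rows)]
    pool += [("y", c) for c in candidate_cuts(r[1] for r in rows)]
    for k in range(len(pool) + 1):
        for chosen in combinations(pool, k):
            x_cuts = [c for axis, c in chosen if axis == "x"]
            y_cuts = [c for axis, c in chosen if axis == "y"]
            if is_consistent(rows, x_cuts, y_cuts):
                return x_cuts, y_cuts
    return None


# Four objects in an XOR pattern: one cut per attribute is needed.
table = [(1.0, 1.0, 0), (2.0, 1.0, 1), (1.0, 2.0, 1), (2.0, 2.0, 0)]
print(fewest_cuts(table))
```

The exhaustive search is practical only for very small tables, which is in keeping with the hardness result stated above.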
This research was partly supported by the Polish State Committee for Scientific Research, grants No. 8T11C03614 and No. 8T11C01011, and by the Research Program of the European Union, ESPRIT-CRIT2 No. 20288.
Copyright information
© 1998 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Chlebus, B.S., Nguyen, S.H. (1998). On Finding Optimal Discretizations for Two Attributes. In: Polkowski, L., Skowron, A. (eds) Rough Sets and Current Trends in Computing. RSCTC 1998. Lecture Notes in Computer Science, vol 1424. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-69115-4_74
DOI: https://doi.org/10.1007/3-540-69115-4_74
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-64655-6
Online ISBN: 978-3-540-69115-0
eBook Packages: Springer Book Archive