Abstract
Rough set theory has become an important mathematical tool for dealing with imprecise, incomplete, and inconsistent data. Rough set methods generally work better on discretized or binarized data, yet most real-life data sets contain continuous attributes as well as discrete ones. In this paper, we propose SMD, a supervised and multivariate discretization algorithm for rough sets. SMD uses both class information and the relations between attributes to determine the discretization scheme. To evaluate SMD, we ran it on real-life data sets obtained from the UCI Machine Learning Repository. The experimental results show that the algorithm is effective, and its time complexity is relatively low compared with current multivariate discretization algorithms.
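As an illustration of what "supervised and multivariate" can mean in a rough set setting, the sketch below checks whether a set of cuts keeps a decision table consistent, that is, whether no two objects end up with identical interval codes on all attributes while carrying different class labels. This is not the SMD algorithm, whose details are not given in the abstract; it is a minimal sketch of the general consistency criterion, and the function names, the `cuts` layout (one sorted list of cut points per attribute), and the toy data are all assumptions made for the example.

```python
from bisect import bisect_right


def discretize(row, cuts):
    """Map each continuous value to the index of the interval it falls in.

    cuts[j] is a sorted list of cut points for attribute j (hypothetical layout).
    """
    return tuple(bisect_right(cuts[j], value) for j, value in enumerate(row))


def is_consistent(rows, labels, cuts):
    """Return True if objects with identical interval codes share one class.

    The check is supervised (it uses the class labels) and multivariate
    (it looks at all attributes jointly rather than one at a time).
    """
    seen = {}
    for row, label in zip(rows, labels):
        key = discretize(row, cuts)
        # setdefault stores the first label seen for this code and returns it.
        if seen.setdefault(key, label) != label:
            return False
    return True


# Toy decision table (illustrative data, not from the paper).
rows = [(1.0, 5.0), (2.0, 1.0), (3.0, 5.0), (4.0, 1.0)]
labels = ["a", "a", "b", "b"]

# One cut on the first attribute separates the two classes.
print(is_consistent(rows, labels, [[2.5], []]))
# One cut on the second attribute alone merges objects of different classes.
print(is_consistent(rows, labels, [[], [3.0]]))
```

A discretization method can use such a check to discard cuts that are not needed: any cut whose removal leaves the table consistent is redundant with respect to the class information.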
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Jiang, F., Zhao, Z., Ge, Y. (2010). A Supervised and Multivariate Discretization Algorithm for Rough Sets. In: Yu, J., Greco, S., Lingras, P., Wang, G., Skowron, A. (eds) Rough Set and Knowledge Technology. RSKT 2010. Lecture Notes in Computer Science, vol 6401. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-16248-0_81
DOI: https://doi.org/10.1007/978-3-642-16248-0_81
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-16247-3
Online ISBN: 978-3-642-16248-0
eBook Packages: Computer Science, Computer Science (R0)