Abstract
This paper presents results of experiments on incomplete data sets obtained by random replacement of attribute values with symbols of missing attribute values. Rule sets were induced from such data using two different types of lower and upper approximation: local and global, and two different interpretations of missing attribute values: lost values and ”do not care” conditions. Additionally, we used a probabilistic option, one of the most successful traditional methods to handle missing attribute values. In our experiments we recorded the total error rate, a result of ten-fold cross validation. Using the Wicoxon matched-pairs signed ranks test (5% level of significance for two-tailed test) we observed that for missing attribute values interpreted as ”do not care” conditions, the global type of approximations is worse than the local type and that the probabilistic option is worse than the local type.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
Similar content being viewed by others
References
Grzymala-Busse, J.W., Wang, A.Y.: Modified algorithms LEM1 and LEM2 for rule induction from data with missing attribute values. In: Proceedings of the Fifth International Workshop on Rough Sets and Soft Computing (RSSC 1997) at the Third Joint Conference on Information Sciences (JCIS 1997), pp. 69–72 (1997)
Stefanowski, J., Tsoukias, A.: On the extension of rough sets under incomplete information. In: Zhong, N., Skowron, A., Ohsuga, S. (eds.) RSFDGrC 1999. LNCS (LNAI), vol. 1711, pp. 73–82. Springer, Heidelberg (1999)
Stefanowski, J., Tsoukias, A.: Incomplete information tables and rough classification. Computational Intelligence 17(3), 545–566 (2001)
Grzymala-Busse, J.W.: On the unknown attribute values in learning from examples. In: Proceedings of the ISMIS 1991, 6th International Symposium on Methodologies for Intelligent Systems, pp. 368–377 (1991)
Kryszkiewicz, M.: Rough set approach to incomplete information systems. In: Proceedings of the Second Annual Joint Conference on Information Sciences, pp. 194–197 (1995)
Kryszkiewicz, M.: Rules in incomplete information systems. Information Sciences 113(3-4), 271–292 (1999)
Grzymala-Busse, J.W., Rzasa, W.: Local and global approximations for incomplete data. In: Greco, S., Hata, Y., Hirano, S., Inuiguchi, M., Miyamoto, S., Nguyen, H.S., Słowiński, R. (eds.) RSCTC 2006. LNCS (LNAI), vol. 4259, pp. 244–253. Springer, Heidelberg (2006)
Grzymala-Busse, J.W., Rzasa, W.: A local version of the mlem2 algorithm for rule induction. Fundamenta Informaticae 100, 99–116 (2010)
Grzymala-Busse, J.W., Rzasa, W.: Local and global approximations for incomplete data. Transactions on Rough Sets 8, 21–34 (2008)
Grzymala-Busse, J.W.: Three approaches to missing attribute values—a rough set perspective. In: Proceedings of the Workshop on Foundation of Data Mining, in Conjunction with the Fourth IEEE International Conference on Data Mining, pp. 55–62 (2004)
Grzymala-Busse, J.W.: MLEM2: A new algorithm for rule induction from imperfect data. In: Proceedings of the 9th International Conference on Information Processing and Management of Uncertainty in Knowledge-Based Systems, pp. 243–250 (2002)
Pawlak, Z.: Rough Sets. Theoretical Aspects of Reasoning about Data. Kluwer Academic Publishers, Dordrecht (1991)
Grzymala-Busse, J.W., Hu, M.: A comparison of several approaches to missing attribute values in data mining. In: Proceedings of the Second International Conference on Rough Sets and Current Trends in Computing, pp. 340–347 (2000)
Chmielewski, M.R., Grzymala-Busse, J.W.: Global discretization of continuous attributes as preprocessing for machine learning. International Journal of Approximate Reasoning 15(4), 319–331 (1996)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Grzymala-Busse, J.W. (2011). A Comparison of Some Rough Set Approaches to Mining Symbolic Data with Missing Attribute Values. In: Kryszkiewicz, M., Rybinski, H., Skowron, A., Raś, Z.W. (eds) Foundations of Intelligent Systems. ISMIS 2011. Lecture Notes in Computer Science(), vol 6804. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-21916-0_6
Download citation
DOI: https://doi.org/10.1007/978-3-642-21916-0_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-21915-3
Online ISBN: 978-3-642-21916-0
eBook Packages: Computer ScienceComputer Science (R0)