Abstract
In this paper we discuss four kinds of missing attribute values: lost values (values that were recorded but are currently unavailable), "do not care" conditions (the original values were irrelevant), restricted "do not care" conditions (similar to ordinary "do not care" conditions but interpreted differently; these missing attribute values may occur when the same data set contains both lost values and "do not care" conditions), and attribute-concept values (these missing attribute values may be replaced by any attribute value limited to the same concept). Throughout the paper the same calculus, based on computing blocks of attribute-value pairs, is used. Incomplete data are characterized by characteristic relations, which in general are neither symmetric nor transitive. Lower and upper approximations are generalized for data with missing attribute values. Finally, some experiments on different interpretations of missing attribute values and different approximation definitions are cited.
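The calculus named in the abstract can be illustrated with a small sketch. It is not taken from the paper: the toy data set, the attribute names, the markers `?` (lost value) and `*` ("do not care" condition), and all function names are assumptions made for illustration. It also assumes the conventions commonly used in this line of work: a case with a lost value is excluded from every block of that attribute, a case with a "do not care" condition is included in every block of that attribute, and concept lower and upper approximations are unions of characteristic sets. Restricted "do not care" conditions and attribute-concept values are not covered.

```python
LOST, DONT_CARE = "?", "*"  # assumed markers for the two kinds of missing values

def blocks(cases, attrs):
    """Map each (attribute, value) pair to the set of case ids in its block."""
    result = {}
    for a in attrs:
        known = {row[a] for row in cases.values()} - {LOST, DONT_CARE}
        for v in known:
            # "*" joins every block of the attribute; "?" joins none.
            result[(a, v)] = {x for x, row in cases.items()
                              if row[a] == v or row[a] == DONT_CARE}
    return result

def characteristic_set(x, cases, attrs, blk):
    """Intersect the blocks of the specified values of case x."""
    k = set(cases)  # start from the whole universe
    for a in attrs:
        v = cases[x][a]
        if v not in (LOST, DONT_CARE):  # a missing value imposes no restriction
            k &= blk[(a, v)]
    return k

def concept_approximations(concept, cases, attrs):
    """Concept lower and upper approximations as unions of characteristic sets."""
    blk = blocks(cases, attrs)
    ks = [characteristic_set(x, cases, attrs, blk) for x in concept]
    lower = set().union(*(k for k in ks if k <= concept))
    upper = set().union(*ks)
    return lower, upper

# Hypothetical data set: four cases, two attributes.
cases = {
    1: {"Temperature": "high", "Headache": "?"},
    2: {"Temperature": "high", "Headache": "yes"},
    3: {"Temperature": "*", "Headache": "no"},
    4: {"Temperature": "normal", "Headache": "no"},
}
attrs = ["Temperature", "Headache"]

blk = blocks(cases, attrs)
K = {x: characteristic_set(x, cases, attrs, blk) for x in cases}
lower, upper = concept_approximations({1, 2}, cases, attrs)
```

In this toy example the characteristic set of case 1 is {1, 2, 3} while that of case 3 is {3, 4}, so case 3 is related to case 1 but not the other way round. This is a small instance of the non-symmetry the abstract mentions. For the concept {1, 2} the sketch yields the lower approximation {2} and the upper approximation {1, 2, 3}.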
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Grzymala-Busse, J.W. (2006). A Rough Set Approach to Data with Missing Attribute Values. In: Wang, GY., Peters, J.F., Skowron, A., Yao, Y. (eds) Rough Sets and Knowledge Technology. RSKT 2006. Lecture Notes in Computer Science, vol 4062. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11795131_10
DOI: https://doi.org/10.1007/11795131_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-36297-5
Online ISBN: 978-3-540-36299-9
eBook Packages: Computer Science, Computer Science (R0)