Abstract
This paper presents experimental results on twelve data sets with many missing attribute values, interpreted either as lost values or as attribute-concept values. Data mining was performed using three kinds of probabilistic approximations: singleton, subset and concept. For each of six pairs of data sets, one with lost values and one with attribute-concept values, where missing attribute values were located in the same places, we compared the best results obtained over all three kinds of probabilistic approximations. For five pairs, the error rate, evaluated by ten-fold cross validation, was significantly smaller for lost values than for attribute-concept values (5% significance level). For the remaining pair, the two interpretations of missing attribute values did not differ significantly.
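The abstract names the three kinds of probabilistic approximations without defining them. The sketch below illustrates one common formulation from the rough-set literature, assuming each case x already has a characteristic set K(x) (the set of cases indistinguishable from x under the chosen interpretation of missing values) and a threshold alpha: the singleton approximation collects cases x with Pr(X | K(x)) >= alpha, the subset approximation takes the union of all qualifying K(x) over every case, and the concept approximation takes that union only over cases in the concept X. The function names, the toy characteristic sets and the concept are hypothetical and are not taken from the paper.

```python
from fractions import Fraction


def conditional_probability(concept, k_set):
    """Pr(X | K) = |X ∩ K| / |K| for a non-empty characteristic set K."""
    return Fraction(len(concept & k_set), len(k_set))


def singleton_approximation(char_sets, concept, alpha):
    """Cases x whose own characteristic set satisfies Pr(X | K(x)) >= alpha."""
    return {x for x, k in char_sets.items()
            if conditional_probability(concept, k) >= alpha}


def subset_approximation(char_sets, concept, alpha):
    """Union of qualifying characteristic sets K(x), taken over all cases x."""
    result = set()
    for k in char_sets.values():
        if conditional_probability(concept, k) >= alpha:
            result |= k
    return result


def concept_approximation(char_sets, concept, alpha):
    """Union of qualifying characteristic sets K(x), taken over x in the concept."""
    result = set()
    for x in concept:
        k = char_sets[x]
        if conditional_probability(concept, k) >= alpha:
            result |= k
    return result


# Hypothetical toy data: five cases with assumed characteristic sets.
char_sets = {
    1: {1, 2},
    2: {2},
    3: {3, 4, 5},
    4: {4},
    5: {3, 5},
}
concept = {1, 2, 3}          # hypothetical concept X
alpha = Fraction(1, 2)       # probability threshold

print(singleton_approximation(char_sets, concept, alpha))  # {1, 2, 5}
print(subset_approximation(char_sets, concept, alpha))     # {1, 2, 3, 5}
print(concept_approximation(char_sets, concept, alpha))    # {1, 2}
```

In this toy example the three approximations differ for alpha = 1/2 and coincide for alpha = 1, which is the kind of variation the abstract's comparison over all three approximations explores. Exact fractions are used so that threshold comparisons are not affected by floating-point rounding.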
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Clark, P.G., Grzymala-Busse, J.W. (2015). Mining Incomplete Data with Many Lost and Attribute-Concept Values. In: Ciucci, D., Wang, G., Mitra, S., Wu, WZ. (eds) Rough Sets and Knowledge Technology. RSKT 2015. Lecture Notes in Computer Science, vol 9436. Springer, Cham. https://doi.org/10.1007/978-3-319-25754-9_9
DOI: https://doi.org/10.1007/978-3-319-25754-9_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-25753-2
Online ISBN: 978-3-319-25754-9
eBook Packages: Computer Science (R0)