Mining Incomplete Data with Many Lost and Attribute-Concept Values

  • Conference paper
Rough Sets and Knowledge Technology (RSKT 2015)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9436)

Abstract

This paper presents experimental results on twelve data sets with many missing attribute values, interpreted either as lost values or as attribute-concept values. Data mining was performed using three kinds of probabilistic approximations: singleton, subset, and concept. For six data sets with lost values and six data sets with attribute-concept values, with missing attribute values located in the same places, we compared the best results obtained over all three kinds of probabilistic approximations. For five pairs of data sets, the error rate, evaluated by ten-fold cross validation, was significantly smaller for lost values than for attribute-concept values (5% significance level). For the remaining pair, the two interpretations of missing attribute values did not differ significantly.
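
The abstract names the three probabilistic approximations without defining them. The sketch below is one possible reading, assuming the characteristic-set definitions commonly given in the rough-set literature rather than anything stated on this page. A lost value ("?") leaves a case out of every attribute-value block, while an attribute-concept value ("-") puts the case into the block of every value observed among cases of the same concept. All function names, the symbols chosen for missing values, and the toy data are hypothetical and are not the authors' data or implementation.

```python
from fractions import Fraction

LOST = "?"          # lost value (assumed symbol)
ATTR_CONCEPT = "-"  # attribute-concept value (assumed symbol)
MISSING = {LOST, ATTR_CONCEPT}


def attribute_blocks(cases, decisions, attr):
    """Map each known value of `attr` to the set of case indices in its block."""
    n = len(cases)
    blocks = {}
    for i in range(n):
        v = cases[i][attr]
        if v not in MISSING:
            blocks.setdefault(v, set()).add(i)
    # An attribute-concept case joins every block whose value occurs
    # among cases with the same decision (the same concept).
    for i in range(n):
        if cases[i][attr] == ATTR_CONCEPT:
            for j in range(n):
                v = cases[j][attr]
                if decisions[j] == decisions[i] and v not in MISSING:
                    blocks[v].add(i)
    return blocks


def characteristic_set(cases, decisions, x, attrs):
    """Intersect, over all attributes, the sets of cases compatible with case x."""
    universe = set(range(len(cases)))
    result = set(universe)
    for a in attrs:
        v = cases[x][a]
        if v == LOST:
            continue  # a lost value places no restriction on this attribute
        blocks = attribute_blocks(cases, decisions, a)
        if v == ATTR_CONCEPT:
            vals = {cases[j][a] for j in universe
                    if decisions[j] == decisions[x]} - MISSING
            k = set().union(*(blocks[w] for w in vals)) if vals else universe
        else:
            k = blocks[v]
        result &= k
    return result


def probabilistic_approximations(cases, decisions, concept, alpha):
    """Return the (singleton, subset, concept) approximations of one concept."""
    n = len(cases)
    attrs = range(len(cases[0]))
    X = {i for i in range(n) if decisions[i] == concept}
    K = [characteristic_set(cases, decisions, i, attrs) for i in range(n)]
    # Pr(X | K(x)) = |X & K(x)| / |K(x)|; K(x) always contains x itself.
    ok = [Fraction(len(X & K[i]), len(K[i])) >= alpha for i in range(n)]
    singleton = {i for i in range(n) if ok[i]}
    subset = set().union(*(K[i] for i in range(n) if ok[i]))
    concept_appr = set().union(*(K[i] for i in X if ok[i]))
    return singleton, subset, concept_appr


# Invented toy table: (Temperature, Headache) -> Flu
cases = [("high", "yes"), ("?", "yes"), ("high", "no"),
         ("normal", "-"), ("normal", "yes")]
decisions = ["yes", "yes", "no", "no", "no"]

for name, appr in zip(("singleton", "subset", "concept"),
                      probabilistic_approximations(cases, decisions, "no",
                                                   Fraction(1, 2))):
    print(name, sorted(appr))
```

In this sketch the singleton approximation collects individual cases, the subset approximation takes the union of qualifying characteristic sets over all cases, and the concept approximation restricts that union to cases belonging to the concept itself. The threshold `alpha` is passed as a `Fraction` so that the comparison against the conditional probability is exact.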

Author information

Corresponding author

Correspondence to Jerzy W. Grzymala-Busse.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Clark, P.G., Grzymala-Busse, J.W. (2015). Mining Incomplete Data with Many Lost and Attribute-Concept Values. In: Ciucci, D., Wang, G., Mitra, S., Wu, WZ. (eds) Rough Sets and Knowledge Technology. RSKT 2015. Lecture Notes in Computer Science, vol 9436. Springer, Cham. https://doi.org/10.1007/978-3-319-25754-9_9

  • DOI: https://doi.org/10.1007/978-3-319-25754-9_9

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-25753-2

  • Online ISBN: 978-3-319-25754-9

  • eBook Packages: Computer Science, Computer Science (R0)
