An Experimental Comparison of Three Interpretations of Missing Attribute Values Using Probabilistic Approximations

  • Conference paper
Rough Sets, Fuzzy Sets, Data Mining, and Granular Computing (RSFDGrC 2013)

Abstract

This paper presents the results of experiments on 24 data sets with three different interpretations of missing attribute values: lost values, attribute-concept values, and “do not care” conditions. Lost values are values that were erased or were never entered. Attribute-concept values may be any value from the attribute domain restricted to the respective concept. “Do not care” conditions may be any value from the attribute domain, without restriction. For our experiments we used concept probabilistic approximations, a generalization of standard approximations. Our main objective was to determine which interpretation of missing attribute values is best in terms of the error rate. The experimental results indicate that the lost value interpretation is the best. Our secondary objective was to test how useful proper concept probabilistic approximations (i.e., those different from the standard concept lower and upper approximations) are for mining data with missing attribute values. Proper concept probabilistic approximations were better than standard concept approximations for 12 of the 24 data sets and worse for five.
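The abstract does not spell out how the three interpretations enter the approximations, so the following Python sketch is only an illustration. It assumes the definitions commonly used in the rough-set literature on incomplete data: attribute-value blocks, characteristic sets K(x), and a concept probabilistic approximation formed as the union of those K(x), for x in the concept X, whose conditional probability |X ∩ K(x)| / |K(x)| is at least a threshold α. The symbols `?`, `-`, and `*` for the three kinds of missing values, all function names, and the toy table are assumptions made for this example and are not taken from the paper.

```python
from fractions import Fraction

# Assumed symbols: "?" lost value, "-" attribute-concept value, "*" "do not care".
LOST, ATTR_CONCEPT, DO_NOT_CARE = "?", "-", "*"
MISSING = {LOST, ATTR_CONCEPT, DO_NOT_CARE}


def concept_values(cases, decisions, attr, i):
    """Specified values of attr among cases that share case i's decision."""
    return {c[attr] for c, d in zip(cases, decisions)
            if d == decisions[i] and c[attr] not in MISSING}


def blocks(cases, decisions, attr):
    """Map each specified value v of attr to its block of case indices."""
    known = {c[attr] for c in cases if c[attr] not in MISSING}
    result = {v: set() for v in known}
    for i, case in enumerate(cases):
        value = case[attr]
        if value == LOST:
            targets = set()      # lost: the case joins no block
        elif value == DO_NOT_CARE:
            targets = known      # "do not care": the case joins every block
        elif value == ATTR_CONCEPT:
            targets = concept_values(cases, decisions, attr, i)
        else:
            targets = {value}
        for v in targets:
            result[v].add(i)
    return result


def characteristic_sets(cases, decisions):
    """K(x) for every case x: intersect the blocks selected by x's values."""
    universe = set(range(len(cases)))
    attr_blocks = {a: blocks(cases, decisions, a) for a in cases[0]}
    result = []
    for i, case in enumerate(cases):
        k = set(universe)
        for a, value in case.items():
            if value in (LOST, DO_NOT_CARE):
                continue         # this attribute imposes no restriction
            if value == ATTR_CONCEPT:
                vals = concept_values(cases, decisions, a, i)
                if vals:         # no specified value in the concept: no restriction
                    k &= set().union(*(attr_blocks[a][v] for v in vals))
            else:
                k &= attr_blocks[a][value]
        result.append(k)
    return result


def concept_probabilistic_approximation(concept, char_sets, alpha):
    """Union of K(x), x in concept, with Pr(concept | K(x)) >= alpha."""
    approx = set()
    for x in concept:
        k = char_sets[x]
        if k and Fraction(len(k & concept), len(k)) >= alpha:
            approx |= k
    return approx


# Hypothetical toy table with one missing value of each kind.
cases = [
    {"temp": "high",   "headache": "yes"},
    {"temp": "?",      "headache": "yes"},
    {"temp": "normal", "headache": "*"},
    {"temp": "high",   "headache": "no"},
    {"temp": "-",      "headache": "no"},
]
decisions = ["yes", "yes", "no", "no", "yes"]

ksets = characteristic_sets(cases, decisions)
flu = {i for i, d in enumerate(decisions) if d == "yes"}
for alpha in (1, 0.6, 0.001):
    print(alpha, sorted(concept_probabilistic_approximation(flu, ksets, alpha)))
```

On this toy table, α = 1 reproduces the concept lower approximation {0}, a very small α reproduces the concept upper approximation {0, 1, 2, 3, 4}, and α = 0.6 gives {0, 1, 2}, which differs from both and is therefore a proper probabilistic approximation in the sense the abstract describes.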



Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Clark, P.G., Grzymała-Busse, J.W. (2013). An Experimental Comparison of Three Interpretations of Missing Attribute Values Using Probabilistic Approximations. In: Ciucci, D., Inuiguchi, M., Yao, Y., Ślęzak, D., Wang, G. (eds) Rough Sets, Fuzzy Sets, Data Mining, and Granular Computing. RSFDGrC 2013. Lecture Notes in Computer Science, vol 8170. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-41218-9_9

  • DOI: https://doi.org/10.1007/978-3-642-41218-9_9

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-41217-2

  • Online ISBN: 978-3-642-41218-9

  • eBook Packages: Computer Science, Computer Science (R0)
