An Experimental Comparison of Three Interpretations of Missing Attribute Values Using Probabilistic Approximations

Clark, Patrick G.; Grzymała-Busse, Jerzy W.

doi:10.1007/978-3-642-41218-9_9

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8170)

Included in the following conference series:

International Workshop on Rough Sets, Fuzzy Sets, Data Mining, and Granular-Soft Computing

Abstract

This paper presents results of experiments on 24 data sets with three different interpretations of missing attribute values: lost values, attribute-concept values, and “do not care” conditions. Lost values are values that were erased or never entered. An attribute-concept value may be any value from the attribute domain, restricted to the respective concept. A “do not care” condition may be any value from the attribute domain, without restriction. For our experiments we used concept probabilistic approximations, a generalization of standard approximations. Our main objective was to determine which interpretation of missing attribute values is the best in terms of the error rate. The experimental results indicate that the lost value interpretation is the best. Our secondary objective was to test how useful proper concept probabilistic approximations (i.e., those different from the standard concept lower and upper approximations) are for mining data with missing attribute values. Proper concept probabilistic approximations were better than standard concept approximations for 12 of the 24 data sets and worse for five.
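The three interpretations and the concept probabilistic approximation can be illustrated with a small sketch. The abstract does not give formal definitions, so the code below assumes the characteristic-set formulation commonly used in the rough-set literature: a lost value (`?`) contributes no restriction and the case joins no attribute-value block, a "do not care" condition (`*`) places the case in every block of that attribute, and an attribute-concept value (`-`) places it only in blocks for values observed in the case's own concept. All names, symbols, and the toy data are hypothetical and are not taken from the paper.

```python
from fractions import Fraction

# Assumed symbols for the three interpretations (not specified in the abstract).
LOST, ATTR_CONCEPT, DO_NOT_CARE = "?", "-", "*"
MISSING = {LOST, ATTR_CONCEPT, DO_NOT_CARE}


def concept_values(cases, decisions, i, a):
    """Specified values of attribute a among cases sharing case i's decision."""
    return {c[a] for j, c in enumerate(cases)
            if decisions[j] == decisions[i] and c[a] not in MISSING}


def characteristic_sets(cases, decisions):
    """Return one characteristic set (a set of case indices) per case."""
    universe = set(range(len(cases)))
    attributes = list(cases[0].keys())
    blocks = {}
    for a in attributes:
        specified = {c[a] for c in cases if c[a] not in MISSING}
        # Blocks of cases whose value for a is actually specified.
        for v in specified:
            blocks[(a, v)] = {i for i, c in enumerate(cases) if c[a] == v}
        # Add cases with "*" or "-" to the blocks they are compatible with;
        # cases with "?" are added to no block.
        for i, c in enumerate(cases):
            if c[a] == DO_NOT_CARE:
                targets = specified
            elif c[a] == ATTR_CONCEPT:
                targets = concept_values(cases, decisions, i, a)
            else:
                continue
            for v in targets:
                blocks[(a, v)].add(i)

    result = []
    for i, c in enumerate(cases):
        k = set(universe)
        for a in attributes:
            v = c[a]
            if v in (LOST, DO_NOT_CARE):
                continue  # no restriction from this attribute
            if v == ATTR_CONCEPT:
                vals = concept_values(cases, decisions, i, a)
                if vals:
                    k &= set().union(*(blocks[(a, w)] for w in vals))
            else:
                k &= blocks[(a, v)]
        result.append(k)
    return result


def concept_probabilistic_approximation(concept, char_sets, alpha):
    """Union of K(x) for x in the concept with Pr(concept | K(x)) >= alpha."""
    approx = set()
    for x in concept:
        k = char_sets[x]
        if Fraction(len(k & concept), len(k)) >= alpha:
            approx |= k
    return approx


# Hypothetical toy data: five cases, two attributes, one decision.
cases = [
    {"temperature": "high", "headache": "yes"},
    {"temperature": LOST, "headache": "yes"},
    {"temperature": "high", "headache": "no"},
    {"temperature": "normal", "headache": DO_NOT_CARE},
    {"temperature": ATTR_CONCEPT, "headache": "no"},
]
decisions = ["flu", "flu", "no", "no", "no"]

char_sets = characteristic_sets(cases, decisions)
flu = {i for i, d in enumerate(decisions) if d == "flu"}
lower = concept_probabilistic_approximation(flu, char_sets, 1)
middle = concept_probabilistic_approximation(flu, char_sets, 0.5)
```

In this toy example `lower` is `{0}` and `middle` is `{0, 1, 3}`. Under these assumptions, a threshold of 1 gives the standard concept lower approximation and a threshold just above 0 gives the upper one; thresholds in between give approximations that are "proper" only when they differ from both.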

Author information

Authors and Affiliations

Department of Electrical Engineering and Computer Science, University of Kansas, Lawrence, KS, 66045, USA
Patrick G. Clark & Jerzy W. Grzymała-Busse
Institute of Computer Science, Polish Academy of Sciences, 01–237, Warsaw, Poland
Jerzy W. Grzymała-Busse

Editor information

Editors and Affiliations

University of Milano-Bicocca, viale Sarca 336/14, 20126, Milano, Italy
Davide Ciucci
Osaka University, 560-8531, Toyonaka, Osaka, Japan
Masahiro Inuiguchi
University of Regina, S4S 0A2, Regina, SK, Canada
Yiyu Yao
University of Warsaw, ul. Banacha, 2, 02-097, Warsaw, Poland
Dominik Ślęzak
Chongqing Institute of Green and Intelligent Technology, CAS, 401122, Chongqing, China
Guoyin Wang

Cite this paper

Clark, P.G., Grzymała-Busse, J.W. (2013). An Experimental Comparison of Three Interpretations of Missing Attribute Values Using Probabilistic Approximations. In: Ciucci, D., Inuiguchi, M., Yao, Y., Ślęzak, D., Wang, G. (eds) Rough Sets, Fuzzy Sets, Data Mining, and Granular Computing. RSFDGrC 2013. Lecture Notes in Computer Science, vol 8170. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-41218-9_9

DOI: https://doi.org/10.1007/978-3-642-41218-9_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-41217-2
Online ISBN: 978-3-642-41218-9
eBook Packages: Computer Science, Computer Science (R0)
