Global and Saturated Probabilistic Approximations Based on Generalized Maximal Consistent Blocks

Clark, Patrick G.; Grzymala-Busse, Jerzy W.; Hippe, Zdzislaw S.; Mroczek, Teresa; Niemiec, Rafal

doi:10.1007/978-3-030-61705-9_32

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12344)

Included in the following conference series:

International Conference on Hybrid Artificial Intelligence Systems

Abstract

In this paper, missing attribute values in incomplete data sets are given two interpretations: lost values and “do not care” conditions. Additionally, the process of data mining is based on two types of probabilistic approximations, global and saturated. We present results of experiments on mining incomplete data sets using four approaches, each combining one of the two interpretations of missing attribute values with one of the two types of probabilistic approximations. We compare the four approaches using the error rate computed by ten-fold cross validation as the criterion of quality. We show that for some data sets the error rate is significantly smaller (at the 5% level of significance) for lost values than for “do not care” conditions, while for other data sets the error rate is smaller for “do not care” conditions. For “do not care” conditions, the error rate is significantly smaller for saturated probabilistic approximations than for global probabilistic approximations for two data sets; for another data set it is the other way around; and for the remaining five data sets the difference is insignificant. Thus, for an incomplete data set, the best approach to data mining should be chosen by trying all four approaches.

Author information

Authors and Affiliations

Department of Electrical Engineering and Computer Science, University of Kansas, Lawrence, KS, 66045, USA
Patrick G. Clark & Jerzy W. Grzymala-Busse
Department of Artificial Intelligence, University of Information Technology and Management, 35–225, Rzeszow, Poland
Jerzy W. Grzymala-Busse, Zdzislaw S. Hippe, Teresa Mroczek & Rafal Niemiec

Corresponding author

Correspondence to Jerzy W. Grzymala-Busse.

Editor information

Editors and Affiliations

University of Oviedo, Oviedo, Spain
Enrique Antonio de la Cal
University of Oviedo, Oviedo, Spain
José Ramón Villar Flecha
University of A Coruña, Ferrol, Spain
Héctor Quintián
University of Salamanca, Salamanca, Spain
Emilio Corchado

About this paper

Cite this paper

Clark, P.G., Grzymala-Busse, J.W., Hippe, Z.S., Mroczek, T., Niemiec, R. (2020). Global and Saturated Probabilistic Approximations Based on Generalized Maximal Consistent Blocks. In: de la Cal, E.A., Villar Flecha, J.R., Quintián, H., Corchado, E. (eds) Hybrid Artificial Intelligent Systems. HAIS 2020. Lecture Notes in Computer Science, vol 12344. Springer, Cham. https://doi.org/10.1007/978-3-030-61705-9_32

DOI: https://doi.org/10.1007/978-3-030-61705-9_32
Published: 04 November 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-61704-2
Online ISBN: 978-3-030-61705-9
eBook Packages: Computer Science, Computer Science (R0)
