Abstract
This paper presents a rule-based chase algorithm, called Chase 2, which provides a strategy for predicting the values that should replace null values in a relational database. When information about an object is partially incomplete (a set of weighted values of the same attribute can be treated as an allowed attribute value), Chase 2 decreases that incompleteness. In other words, when several weighted values of the same attribute are assigned to an object, Chase 2 will increase their standard deviation. To keep the presentation clear and simple, we take an incomplete information system S of type λ as the model of data. To begin the Chase 2 process, each attribute in S that has unknown or incomplete values for some objects in S is set, one by one, as the decision attribute, and all other attributes in S are treated as condition attributes. Assuming that d is the decision attribute, we form a subsystem S1 of S by selecting from S the objects x such that d(x) ≠ NULL. The subsystem S1 is then used to extract rules describing the values of attribute d. In the next step, each incomplete slot in S in the column corresponding to attribute d is chased by the rules describing d that were extracted from S1. All other incomplete attributes in the database are processed in the same way. This concludes the first loop of Chase 2. The whole process is repeated recursively until Chase 2 can predict no more new values. At that point, we say that a fixed point in value prediction has been reached.
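The loop described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how rules are extracted, so the hypothetical helper `extract_rules` assumes the simplest possible scheme (single-condition rules that hold without exception in the subsystem where d is known). Weighted attribute values are left out, so a slot is either known or `NULL`. The names `chase`, `extract_rules`, and the toy table `S` are illustrative assumptions.

```python
from collections import defaultdict

NULL = None  # marker for an unknown attribute value


def extract_rules(rows, d):
    """Extract rules (a = v) -> (d = w) from the rows where d is known.

    Assumption: a rule is kept only if every row with a = v agrees on w.
    This stands in for a real rule-discovery method.
    """
    seen = defaultdict(set)
    for row in rows:
        if row[d] is NULL:
            continue  # row is not part of the subsystem S1
        for a, v in row.items():
            if a != d and v is not NULL:
                seen[(a, v)].add(row[d])
    return {cond: next(iter(vals)) for cond, vals in seen.items() if len(vals) == 1}


def chase(rows):
    """Repeat rule extraction and null replacement until nothing changes."""
    rows = [dict(r) for r in rows]  # work on a copy of the table
    attributes = list(rows[0])
    changed = True
    while changed:
        changed = False
        for d in attributes:  # each attribute takes a turn as decision attribute
            rules = extract_rules(rows, d)
            for row in rows:
                if row[d] is not NULL:
                    continue
                predictions = {
                    rules[(a, v)]
                    for a, v in row.items()
                    if a != d and v is not NULL and (a, v) in rules
                }
                if len(predictions) == 1:  # fill only if matching rules agree
                    row[d] = predictions.pop()
                    changed = True
    return rows


# Toy incomplete information system (made-up data)
S = [
    {"a": 1, "b": "x", "d": "hi"},
    {"a": 1, "b": "y", "d": "hi"},
    {"a": 2, "b": "x", "d": "lo"},
    {"a": 1, "b": NULL, "d": NULL},
    {"a": NULL, "b": "y", "d": "hi"},
]
result = chase(S)
```

In this toy run the first pass fills `a` in the last row and `d` in the fourth row. The second pass changes nothing, so the loop stops at a fixed point with one slot (`b` in the fourth row) still unknown, because no exception-free rule predicts it.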
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Dardzińska, A., Raś, Z.W. (2005). CHASE 2 – Rule Based Chase Algorithm for Information Systems of Type λ. In: Tsumoto, S., Yamaguchi, T., Numao, M., Motoda, H. (eds) Active Mining. Lecture Notes in Computer Science, vol 3430. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11423270_14
DOI: https://doi.org/10.1007/11423270_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-26157-5
Online ISBN: 978-3-540-31933-7
eBook Packages: Computer Science, Computer Science (R0)