Two-Phase Rule Induction from Incomplete Data

Li, Huaxiong; Yao, Yiyu; Zhou, Xianzhong; Huang, Bing

doi:10.1007/978-3-540-79721-0_12

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5009)

Included in the following conference series:

International Conference on Rough Sets and Knowledge Technology

Abstract

A framework for learning a new form of rules from incomplete data is introduced, so that a user can easily identify which attributes in a rule have missing values and which do not. Two levels of measurement are assigned to a rule. An algorithm for two-phase rule induction is presented. Instead of filling in missing attribute values before or during rule induction, we divide rule induction into two phases. In the first phase, rules and partial rules are induced from non-missing values. In the second phase, partial rules are modified and refined by filling in some missing values. Such rules truthfully reflect the knowledge embedded in the incomplete data. The study not only presents a new view of rule induction from incomplete data, but also provides a practical solution.
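
The two phases described above can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract gives no definitions, so everything here is an assumption made for illustration, including the function names `phase_one` and `phase_two`, the `MISSING` marker, the one-candidate-per-row rule format, the "fill only when unambiguous" refinement criterion, and the toy data. The two levels of measurement mentioned in the abstract are not modeled.

```python
MISSING = None  # assumed marker for a missing attribute value


def phase_one(rows, attrs, target):
    """Phase 1 (assumed form): build one candidate per row from its
    non-missing values only. Rows with no missing values give rules;
    rows with missing values give partial rules."""
    rules, partial = [], []
    for row in rows:
        cond = {a: row[a] for a in attrs if row[a] is not MISSING}
        item = {
            "if": cond,
            "then": row[target],
            "missing": [a for a in attrs if row[a] is MISSING],
        }
        (partial if item["missing"] else rules).append(item)
    return rules, partial


def phase_two(rules, partial):
    """Phase 2 (assumed form): refine each partial rule by filling in a
    missing value only when the complete rules leave a single candidate."""
    refined = []
    for p in partial:
        cond, still_missing = dict(p["if"]), []
        for a in p["missing"]:
            # Values of `a` in complete rules that share the decision and
            # agree with the partial rule on all of its known conditions.
            candidates = {
                r["if"][a]
                for r in rules
                if r["then"] == p["then"]
                and all(r["if"][k] == v for k, v in p["if"].items())
            }
            if len(candidates) == 1:
                cond[a] = candidates.pop()
            else:
                still_missing.append(a)  # leave the gap visible in the rule
        refined.append({"if": cond, "then": p["then"], "missing": still_missing})
    return refined


# Hypothetical toy table; None marks a missing value.
rows = [
    {"temp": "high", "cough": "yes", "flu": "yes"},
    {"temp": "high", "cough": None, "flu": "yes"},
    {"temp": "normal", "cough": None, "flu": "no"},
    {"temp": "normal", "cough": "yes", "flu": "no"},
    {"temp": "normal", "cough": "no", "flu": "no"},
]
rules, partial = phase_one(rows, ["temp", "cough"], "flu")
refined = phase_two(rules, partial)
```

In this toy run the second row's missing `cough` value is filled because only one complete rule matches it, while the third row stays partial because the matching complete rules disagree. Keeping the `missing` list on every rule is one way to let a reader see which attributes of a rule had missing values.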

Author information

Authors and Affiliations

School of Management and Engineering, Nanjing University, Nanjing, Jiangsu, 210093, P.R. China
Huaxiong Li, Xianzhong Zhou & Bing Huang
Department of Computer Science, University of Regina, Regina, Saskatchewan, S4S 0A2, Canada
Huaxiong Li & Yiyu Yao

Editor information

Guoyin Wang, Tianrui Li, Jerzy W. Grzymala-Busse, Duoqian Miao, Andrzej Skowron, Yiyu Yao

About this paper

Cite this paper

Li, H., Yao, Y., Zhou, X., Huang, B. (2008). Two-Phase Rule Induction from Incomplete Data. In: Wang, G., Li, T., Grzymala-Busse, J.W., Miao, D., Skowron, A., Yao, Y. (eds) Rough Sets and Knowledge Technology. RSKT 2008. Lecture Notes in Computer Science, vol 5009. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-79721-0_12

DOI: https://doi.org/10.1007/978-3-540-79721-0_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-79720-3
Online ISBN: 978-3-540-79721-0
eBook Packages: Computer Science, Computer Science (R0)
