Abstract
One of the most important problems in rule induction is that the measures used for rule search are influenced by missing values. This paper introduces a new approach to missing values, called rough estimation of conditional probabilities. The technique uses three estimation strategies: the ground mean, lower, and upper methods. Attributes that have missing values are estimated by these methods and then checked against the constraints for probabilistic rules. The proposed method was evaluated on medical databases, and the experimental results show that the induced rules correctly represent experts’ knowledge.
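The abstract does not define the lower and upper methods, so the following is only an illustrative sketch of one plausible reading: treat each missing attribute value pessimistically or optimistically to bracket a conditional probability P(class | attribute = value). The function name `accuracy_bounds`, the use of `None` to mark a missing value, and the toy data are all assumptions made for this example, not the paper's definitions. The ground mean method is not covered here.

```python
from typing import Optional, Sequence, Tuple


def accuracy_bounds(
    values: Sequence[Optional[str]],
    labels: Sequence[str],
    target_value: str,
    target_class: str,
) -> Tuple[float, float]:
    """Bracket P(target_class | attribute == target_value) when some
    attribute values are missing (marked as None).

    Hypothetical illustration only; not the estimator from the paper.
    """
    # Cases whose attribute value is known and equals the target value.
    matched = [c for v, c in zip(values, labels) if v == target_value]
    # Cases whose attribute value is missing.
    missing = [c for v, c in zip(values, labels) if v is None]

    pos_known = sum(c == target_class for c in matched)
    neg_known = len(matched) - pos_known
    pos_missing = sum(c == target_class for c in missing)
    neg_missing = len(missing) - pos_missing

    # Pessimistic case: only missing cases of other classes match the value.
    lower_den = pos_known + neg_known + neg_missing
    # Optimistic case: only missing cases of the target class match the value.
    upper_den = pos_known + neg_known + pos_missing

    # With no evidence at all, fall back to the trivial bounds 0 and 1.
    lower = pos_known / lower_den if lower_den else 0.0
    upper = (pos_known + pos_missing) / upper_den if upper_den else 1.0
    return lower, upper


# Toy data: two known "yes" cases, one known "no" case, two missing values.
values = ["yes", "yes", None, "no", None]
labels = ["m", "o", "m", "m", "o"]
lower, upper = accuracy_bounds(values, labels, "yes", "m")
```

Under this reading, a candidate rule could be accepted only if its lower bound already satisfies the probabilistic-rule constraint, and rejected if even its upper bound does not; that decision step is likewise an assumption rather than something the abstract specifies.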
Copyright information
© 1999 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Tsumoto, S. (1999). Rule Discovery in Databases with Missing Values Based on Rough Set Model. In: Zhong, N., Zhou, L. (eds) Methodologies for Knowledge Discovery and Data Mining. PAKDD 1999. Lecture Notes in Computer Science, vol 1574. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48912-6_38
DOI: https://doi.org/10.1007/3-540-48912-6_38
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-65866-5
Online ISBN: 978-3-540-48912-2
eBook Packages: Springer Book Archive