A Closest Fit Approach to Missing Attribute Values in Preterm Birth Data

Grzymała-Busse, Jerzy W.; Grzymała-Busse, Witold J.; Goodwin, Linda K.

doi:10.1007/978-3-540-48061-7_49


Part of the book series: Lecture Notes in Computer Science (LNAI, volume 1711)

Included in the following conference series:

International Workshop on Rough Sets, Fuzzy Sets, Data Mining, and Granular-Soft Computing

Abstract

In real-life data, many attribute values are typically missing. Rule induction therefore requires preprocessing in which missing attribute values are replaced by appropriate values. The rule induction method used in our research is based on rough set theory.

In this paper we present results on a new approach to missing attribute values, called closest fit. The main idea is to search the set of all cases, considered as vectors of attribute values, for the case most similar to the given case with missing attribute values. There are two ways to look for the closest case: we may restrict the search to the given concept or extend it to the set of all cases. These methods are compared with a special case of the closest fit principle: replacing missing attribute values by the most common value from the concept. All algorithms were implemented in the OOMIS system. Our experiments were performed on preterm birth data sets collected at the Duke University Medical Center.
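
The three replacement strategies named above can be illustrated with a minimal Python sketch. It is not the OOMIS implementation, which the abstract does not describe. The distance function (a count of mismatching attributes, with a missing value treated as a full mismatch), the `MISSING` marker, the function names, and the toy data are all assumptions made for illustration, and only symbolic attributes are handled.

```python
from collections import Counter

MISSING = None  # hypothetical marker for a missing attribute value


def distance(x, y):
    """Count positions where two cases disagree.

    Assumption: a position where either value is missing counts as a mismatch.
    """
    return sum(1 for a, b in zip(x, y) if a is MISSING or b is MISSING or a != b)


def closest_fit(cases, concepts, index, within_concept=False):
    """Fill missing values of cases[index] from its most similar cases.

    With within_concept=True the search is restricted to cases of the same
    concept; otherwise all other cases are searched.
    """
    target = list(cases[index])
    candidates = [
        i for i in range(len(cases))
        if i != index and (not within_concept or concepts[i] == concepts[index])
    ]
    # Nearest candidates first; sorted order is stable, so ties keep input order.
    candidates.sort(key=lambda i: distance(cases[index], cases[i]))
    for pos, value in enumerate(target):
        if value is MISSING:
            # Use the closest candidate in which this attribute is known.
            for i in candidates:
                if cases[i][pos] is not MISSING:
                    target[pos] = cases[i][pos]
                    break
    return target


def most_common_in_concept(cases, concepts, index):
    """Fill missing values with the most common known value in the same concept."""
    target = list(cases[index])
    for pos, value in enumerate(target):
        if value is MISSING:
            known = [
                c[pos] for c, k in zip(cases, concepts)
                if k == concepts[index] and c[pos] is not MISSING
            ]
            if known:
                target[pos] = Counter(known).most_common(1)[0][0]
    return target


# Toy data (invented for illustration; not taken from the preterm birth data sets).
cases = [
    ("high", "yes", MISSING),
    ("high", "yes", "a"),
    ("low", "no", "b"),
    ("low", "yes", "b"),
]
concepts = ["preterm", "fullterm", "preterm", "preterm"]

global_fit = closest_fit(cases, concepts, 0)                        # ['high', 'yes', 'a']
concept_fit = closest_fit(cases, concepts, 0, within_concept=True)  # ['high', 'yes', 'b']
common_fit = most_common_in_concept(cases, concepts, 0)             # ['high', 'yes', 'b']
```

In this toy example the global search and the concept-restricted search disagree: the nearest case overall belongs to a different concept, so restricting the search changes the value that is filled in.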



Author information

Authors and Affiliations

Department of Electrical Engineering and Computer Science, University of Kansas, Lawrence, KS, 66045, USA
Jerzy W. Grzymała-Busse
RS Systems, Inc., Lawrence, KS, 66047, USA
Witold J. Grzymała-Busse
Department of Information Services, School of Nursing, Duke University, Durham, NC, 27710, USA
Linda K. Goodwin


Editor information

Editors and Affiliations

The International WIC Institute, Beijing University of Technology, China
Ning Zhong
Institute of Mathematics, Warsaw University, Banacha 2, 02-097, Warsaw, Poland
Andrzej Skowron
Professor Emeritus, University of Tokyo
Setsuo Ohsuga

About this paper

Cite this paper

Grzymała-Busse, J.W., Grzymała-Busse, W.J., Goodwin, L.K. (1999). A Closest Fit Approach to Missing Attribute Values in Preterm Birth Data. In: Zhong, N., Skowron, A., Ohsuga, S. (eds) New Directions in Rough Sets, Data Mining, and Granular-Soft Computing. RSFDGrC 1999. Lecture Notes in Computer Science, vol 1711. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-48061-7_49

DOI: https://doi.org/10.1007/978-3-540-48061-7_49
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-66645-5
Online ISBN: 978-3-540-48061-7
eBook Packages: Springer Book Archive
