Handling Missing Attribute Values in Preterm Birth Data Sets

  • Conference paper
Rough Sets, Fuzzy Sets, Data Mining, and Granular Computing (RSFDGrC 2005)


The objective of our research was to find the best approach to handling missing attribute values in data sets describing preterm birth, provided by Duke University. Five strategies were used for filling in missing attribute values. They were based on most common values and closest fit for symbolic attributes, on averages for numerical attributes, and on a special approach that induces only certain rules from specified information using the MLEM2 algorithm. We conclude that the best strategy was to use the global most common method for symbolic attributes and the global average method for numerical attributes.
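The strategy the abstract reports as best can be illustrated with a minimal sketch. It is not the authors' implementation: the `?` marker for a missing value, the function name `fill_global`, and the toy rows are all hypothetical, and "global" is read here as "computed over all cases in the data set regardless of class", which is an assumption about the paper's terminology.

```python
from collections import Counter
from statistics import mean

MISSING = "?"  # assumed marker for a missing attribute value


def fill_global(rows, numeric_columns):
    """Return a copy of rows with missing values filled column by column.

    Numeric columns get the average of all known values in that column;
    symbolic columns get the most frequent known value in that column.
    """
    filled = [list(row) for row in rows]
    for col in range(len(rows[0])):
        known = [row[col] for row in rows if row[col] != MISSING]
        if not known:
            continue  # no known value to impute from; leave the column as is
        if col in numeric_columns:
            fill = mean(float(v) for v in known)
        else:
            fill = Counter(known).most_common(1)[0][0]
        for row in filled:
            if row[col] == MISSING:
                row[col] = fill
    return filled


# Hypothetical toy data: (symbolic attribute, numeric attribute, decision)
rows = [
    ["yes", 30, "preterm"],
    ["?", 34, "fullterm"],
    ["yes", "?", "fullterm"],
    ["no", 38, "fullterm"],
]
filled = fill_global(rows, numeric_columns={1})
```

Here the missing symbolic value becomes `yes` (the most frequent known value in its column) and the missing numeric value becomes the average of 30, 34 and 38. The input rows are left unchanged, so the original and the filled data can be compared side by side.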


© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Grzymala-Busse, J.W., Goodwin, L.K., Grzymala-Busse, W.J., Zheng, X. (2005). Handling Missing Attribute Values in Preterm Birth Data Sets. In: Ślęzak, D., Yao, J., Peters, J.F., Ziarko, W., Hu, X. (eds) Rough Sets, Fuzzy Sets, Data Mining, and Granular Computing. RSFDGrC 2005. Lecture Notes in Computer Science, vol 3642. Springer, Berlin, Heidelberg.

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-28660-8

  • Online ISBN: 978-3-540-31824-8

  • eBook Packages: Computer Science, Computer Science (R0)
