Handling Incomplete Data Using Evolution of Imputation Methods

Zawistowski, Pawel; Grzenda, Maciej

doi:10.1007/978-3-642-04921-7_3

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 5495)

Included in the following conference series:

International Conference on Adaptive and Natural Computing Algorithms

Abstract

In this paper, a new approach to treating incomplete data is proposed. It is based on the evolution of imputation strategies built from both non-parametric and parametric imputation methods. Genetic algorithms and multilayer perceptrons are applied to develop a framework for constructing imputation strategies that address multiple incomplete attributes. Furthermore, imputation methods are evaluated in the context of not only the data they are applied to but also the model that uses the data. The classification accuracy on data sets completed using the resulting imputation strategies is reported. The results outperform those obtained for the same data sets completed using standard strategies.
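
The abstract does not specify how a strategy is encoded or scored, so the following is only a minimal sketch of the general idea, under several assumptions: a strategy is a list with one imputation-method index per attribute, a small genetic algorithm searches over such lists, and fitness is the accuracy of a model trained on the completed data. A leave-one-out 1-nearest-neighbour classifier stands in for the multilayer perceptron named in the abstract, and all function names, the three candidate methods, and the toy data are hypothetical.

```python
import random
import statistics

# Hypothetical candidate methods; each fills the None entries of one column.
def impute_mean(column):
    fill = statistics.mean(v for v in column if v is not None)
    return [fill if v is None else v for v in column]

def impute_median(column):
    fill = statistics.median(v for v in column if v is not None)
    return [fill if v is None else v for v in column]

def impute_hot_deck(column):
    # Sequential hot-deck: reuse the most recent known value in the column.
    last = next(v for v in column if v is not None)
    out = []
    for v in column:
        last = last if v is None else v
        out.append(last)
    return out

METHODS = [impute_mean, impute_median, impute_hot_deck]

def apply_strategy(rows, strategy):
    """strategy[j] is the index in METHODS used for attribute j."""
    columns = [list(col) for col in zip(*rows)]
    completed = [METHODS[g](col) for g, col in zip(strategy, columns)]
    return [list(row) for row in zip(*completed)]

def loo_1nn_accuracy(rows, labels):
    """Stand-in model (assumption): leave-one-out 1-NN accuracy, not an MLP."""
    correct = 0
    for i, x in enumerate(rows):
        nearest = min(
            (j for j in range(len(rows)) if j != i),
            key=lambda j: sum((a - b) ** 2 for a, b in zip(x, rows[j])),
        )
        correct += labels[nearest] == labels[i]
    return correct / len(rows)

def evolve(rows, labels, generations=20, pop_size=8, seed=0):
    """Toy GA: truncation selection, one-point crossover, point mutation."""
    rng = random.Random(seed)
    n_attrs = len(rows[0])

    def fitness(strategy):
        return loo_1nn_accuracy(apply_strategy(rows, strategy), labels)

    population = [[rng.randrange(len(METHODS)) for _ in range(n_attrs)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]  # the better half survives
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_attrs) if n_attrs > 1 else 0
            child = a[:cut] + b[cut:]
            if rng.random() < 0.2:
                child[rng.randrange(n_attrs)] = rng.randrange(len(METHODS))
            children.append(child)
        population = parents + children
    best = max(population, key=fitness)
    return best, fitness(best)

# Toy data (illustrative only): two attributes, None marks a missing value.
rows = [[1.0, 2.0], [1.2, None], [None, 2.1], [0.9, 1.8],
        [5.0, 6.0], [5.2, None], [None, 6.3], [4.8, 5.9]]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
best, acc = evolve(rows, labels)
print("best strategy:", best, "accuracy:", acc)
```

Scoring a strategy by the downstream model's accuracy, rather than by how closely imputed values match held-out true values, reflects the abstract's point that imputation should be judged in the context of the model that uses the data.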

Author information

Authors and Affiliations

Faculty of Electronics and Information Technologies, Institute of Electronic Systems, Warsaw University of Technology, ul. Nowowiejska 15/19, 00-665, Warsaw, Poland
Pawel Zawistowski
Faculty of Mathematics and Information Science, Warsaw University of Technology, Pl. Politechniki 1, 00-661, Warsaw, Poland
Maciej Grzenda

Editor information

Editors and Affiliations

Department of Environmental Sciences, University of Kuopio, PO Box 1627, FIN-70211, Kuopio, Finland
Mikko Kolehmainen
Department of Computer Science, University of Kuopio, P.O.Box 1627, 70211, Kuopio, Finland
Pekka Toivanen
Institute of Control and Industrial Electronics, Warsaw University of Technology, ul. Koszykowa 75, 00-662, Warszawa, Poland
Bartlomiej Beliczynski

Cite this paper

Zawistowski, P., Grzenda, M. (2009). Handling Incomplete Data Using Evolution of Imputation Methods. In: Kolehmainen, M., Toivanen, P., Beliczynski, B. (eds) Adaptive and Natural Computing Algorithms. ICANNGA 2009. Lecture Notes in Computer Science, vol 5495. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04921-7_3

DOI: https://doi.org/10.1007/978-3-642-04921-7_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04920-0
Online ISBN: 978-3-642-04921-7
eBook Packages: Computer Science, Computer Science (R0)
