Abstract
In this paper, a new approach to treating incomplete data is proposed. It is based on the evolution of imputation strategies built from both non-parametric and parametric imputation methods. Genetic algorithms and multilayer perceptrons are applied to develop a framework for constructing imputation strategies that address multiple incomplete attributes. Furthermore, imputation methods are evaluated in the context of not only the data they are applied to, but also the model that uses the data. The classification accuracy obtained on data sets completed using the resulting imputation strategies is reported. This accuracy exceeds that obtained for the same data sets completed using standard strategies.
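The general idea of evolving an imputation strategy can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the encoding, the genetic operators, or the candidate methods, so everything below is an assumption made for illustration. A strategy is assumed to be a list holding one method index per attribute, the three candidate methods (`impute_mean`, `impute_median`, `impute_min`) are hypothetical stand-ins, and a leave-one-out 1-nearest-neighbour classifier replaces the multilayer perceptron so that the example needs only the standard library.

```python
import random
import statistics

# Hypothetical candidate methods; each fills the None entries of one column.
def impute_mean(col):
    fill = statistics.mean(v for v in col if v is not None)
    return [fill if v is None else v for v in col]

def impute_median(col):
    fill = statistics.median(v for v in col if v is not None)
    return [fill if v is None else v for v in col]

def impute_min(col):
    fill = min(v for v in col if v is not None)
    return [fill if v is None else v for v in col]

METHODS = [impute_mean, impute_median, impute_min]

def apply_strategy(rows, strategy):
    """Complete the data; strategy[j] is the method index for attribute j."""
    cols = [list(c) for c in zip(*rows)]
    filled = [METHODS[g](c) for g, c in zip(strategy, cols)]
    return [list(r) for r in zip(*filled)]

def fitness(rows, labels, strategy):
    """Leave-one-out accuracy of a 1-NN classifier on the completed data.

    Assumption: 1-NN stands in for the MLP named in the abstract.
    """
    data = apply_strategy(rows, strategy)
    correct = 0
    for i, x in enumerate(data):
        nearest = min(
            (j for j in range(len(data)) if j != i),
            key=lambda j: sum((a - b) ** 2 for a, b in zip(x, data[j])),
        )
        correct += labels[nearest] == labels[i]
    return correct / len(data)

def evolve(rows, labels, generations=20, pop_size=10, seed=0):
    """Simple elitist genetic algorithm over strategies (pop_size >= 4)."""
    rng = random.Random(seed)
    n_attrs = len(rows[0])
    pop = [[rng.randrange(len(METHODS)) for _ in range(n_attrs)]
           for _ in range(pop_size)]
    for _ in range(generations):
        # Keep the better half unchanged, refill the rest with offspring.
        pop.sort(key=lambda s: fitness(rows, labels, s), reverse=True)
        parents = pop[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_attrs) if n_attrs > 1 else 0
            child = a[:cut] + b[cut:]          # one-point crossover
            if rng.random() < 0.2:             # occasional point mutation
                child[rng.randrange(n_attrs)] = rng.randrange(len(METHODS))
            children.append(child)
        pop = parents + children
    return max(pop, key=lambda s: fitness(rows, labels, s))
```

Because the fitness is the accuracy of a classifier trained on the completed data, a strategy is judged by its effect on the model that uses the data rather than by how closely it reconstructs the missing values. Calling `evolve(rows, labels)` on rows that use `None` for missing entries returns one method index per attribute.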
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zawistowski, P., Grzenda, M. (2009). Handling Incomplete Data Using Evolution of Imputation Methods. In: Kolehmainen, M., Toivanen, P., Beliczynski, B. (eds) Adaptive and Natural Computing Algorithms. ICANNGA 2009. Lecture Notes in Computer Science, vol 5495. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04921-7_3
DOI: https://doi.org/10.1007/978-3-642-04921-7_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04920-0
Online ISBN: 978-3-642-04921-7
eBook Packages: Computer Science, Computer Science (R0)