Using imputation algorithms when missing values appear in the test data in contrast with the training data
by Narges Sadat Bathaeian
International Journal of Data Analysis Techniques and Strategies (IJDATS), Vol. 10, No. 2, 2018

Abstract: Real datasets often suffer from missing data, and imputation is a common remedy. Most research applies imputation algorithms to the training data, so the output variable of the samples might influence the imputation model. This paper compares imputation algorithms when they are applied to test data as well as to training data. First, the relationship between the output variable and different imputation algorithms is investigated. Then six types of imputation algorithms are applied to both training data and test data. The chosen datasets are publicly available and cover both classification and regression tasks; missing values are injected into them artificially. The results show that the performance of all algorithms declines when the output variable is excluded. In particular, the decline for the algorithm that uses k nearest neighbours for imputation is not negligible on the classification datasets. Algorithms based on random forests, however, decline less and give better results than the other five types of algorithms.
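
The contrast the abstract describes, imputing with the output variable available versus without it, can be illustrated with a small sketch. This is not the paper's implementation: the function `knn_impute`, the toy data, and the choice of k = 1 are all assumptions made for illustration, and the example only shows the mechanism, not the reported results.

```python
import math

def knn_impute(row, reference_rows, k=1):
    """Fill None entries in `row` with the mean of the k nearest reference rows.

    Hypothetical helper for illustration: distance is Euclidean over the
    columns that are observed in `row`; reference rows must be complete.
    """
    observed = [j for j, v in enumerate(row) if v is not None]
    missing = [j for j, v in enumerate(row) if v is None]

    def dist(ref):
        return math.sqrt(sum((row[j] - ref[j]) ** 2 for j in observed))

    neighbours = sorted(reference_rows, key=dist)[:k]
    filled = list(row)
    for j in missing:
        filled[j] = sum(r[j] for r in neighbours) / len(neighbours)
    return filled

# Toy training data (assumed): x2 tracks the class label y, x1 does not.
train_X = [[1.0, 10.0], [1.1, 20.0], [5.0, 10.5], [5.2, 19.5]]
train_y = [0.0, 1.0, 0.0, 1.0]

# A sample whose x2 is missing; its true x2 would be about 20 (class 1).
x1, y = 1.02, 1.0

# Training-time setting: the output variable is a column in the imputation.
train_with_y = [x + [label] for x, label in zip(train_X, train_y)]
with_output = knn_impute([x1, None, y], train_with_y)

# Test-time setting: the output variable is unknown, so only features are used.
without_output = knn_impute([x1, None], train_X)

print(with_output[1], without_output[1])
```

In this toy case the label steers the neighbour search toward a same-class sample, while the feature-only search picks the sample that is closest in x1 but belongs to the other class. Whether and how strongly this happens on real data depends on the dataset and the imputation algorithm.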

Online publication date: Thu, 21-Jun-2018
