Abstract
Many industrial and real-world datasets suffer from an unavoidable problem of missing values. The problem of missing data has been addressed extensively in the statistical analysis literature, and also, but to a lesser extent in the classification literature. The ability to deal with missing data is an essential requirement for classification because inadequate treatment of missing data may lead to large errors on classification. Feature selection has been successfully used to improve classification, but it has been applied mainly to complete data. This paper develops a wrapper feature selection approach to classification with missing data and investigates the impact of this approach. Empirical results on 10 datasets with missing values using C4.5 for an evaluation and particle swarm optimisation as a search technique in feature selection show that a wrapper feature selection for missing data not only can help to improve accuracy of the classifier, but also can help to reduce the complexity of the learned classification model.
This is a preview of subscription content, log in via an institution.
Buying options
Tax calculation will be finalised at checkout
Purchases are for personal use only
Learn about institutional subscriptionsReferences
Asuncion, A., Newman, D.: UCI machine learning repository (2007)
Barnard, J., Meng, X.-L.: Applications of multiple imputation in medical studies: from AIDS to NHANES. Stat. Methods Med. Res. 8, 17–36 (1999)
Batista, G.E., Monard, M.C.: A study of K-nearest neighbour as an imputation method. In: HIS, vol. 87, pp. 251–260 (2002)
Chuang, L.-Y., Chang, H.-W., Tu, C.-J., Yang, C.-H.: Improved binary PSO for feature selection using gene expression data. Comput. Biol. Chem. 32, 29–38 (2008)
Clark, P., Niblett, T.: The CN2 induction algorithm. Mach. Learn. 3, 261–283 (1989)
Clerc, M., Kennedy, J.: The particle swarm-explosion, stability, and convergence in a multidimensional complex space. IEEE Trans. Evol. Comput. 6, 58–73 (2002)
Dash, M., Liu, H.: Feature selection for classification. Intell. Data Anal. 1, 131–156 (1997)
De’ath, G., Fabricius, K.E.: Classification and regression trees: a powerful yet simple technique for ecological data analysis. Ecology 81, 3178–3192 (2000)
Doquire, G., Verleysen, M.: Feature selection with missing data using mutual information estimators. Neurocomputing 90, 3–11 (2012)
Farhangfar, A., Kurgan, L., Dy, J.: Impact of imputation of missing values on classification error for discrete data. Pattern Recogn. 41, 3692–3705 (2008)
García-Laencina, P.J., Sancho-Gómez, J.-L., Figueiras-Vidal, A.R.: Pattern classification with missing data: a review. Neural Comput. Appl. 19, 263–282 (2010)
Graham, J.W.: Missing data analysis: making it work in the real world. Annu. Rev. Psychol. 60, 549–576 (2009)
Hall, M., Frank, E., Holmes, G., Pfahringer, B., Reutemann, P., Witten, I.H.: The WEKA data mining software: an update. ACM SIGKDD Explor. Newsl. 11, 10–18 (2009)
Han, J., Kamber, M., Pei, J.: Data Mining, Southeast Asia Edition: Concepts and Techniques. Morgan Kaufmann, Burlington (2006)
Jain, A., Zongker, D.: Feature selection: evaluation, application, and small sample performance. IEEE Trans. Pattern Anal. Mach. Intell. 19, 153–158 (1997)
Kennedy, J.: Particle swarm optimization. In: Encyclopedia of Machine Learning, pp. 760–766 (2010)
Kennedy, J., Kennedy, J.F., Eberhart, R.C.: Swarm Intelligence. Morgan Kaufmann, San Francisco (2001)
Kohavi, R., John, G.H.: Wrappers for feature subset selection. Artif. Intell. 97, 273–324 (1997)
Koller, D., Sahami, M.: Toward optimal feature selection (1996)
Lin, S.-W., Ying, K.-C., Chen, S.-C., Lee, Z.-J.: Particle swarm optimization for parameter determination and feature selection of support vector machines. Expert Syst. Appl. 35, 1817–1824 (2008)
Little, R.J., Rubin, D.B.: Statistical Analysis with Missing Data. Wiley, New York (2014)
Luengo, J., García, S., Herrera, F.: A study on the use of imputation methods for experimentation with radial basis function network classifiers handling missing attribute values: the good synergy between rbfns and eventcovering method. Neural Netw. 23, 406–418 (2010)
MacKay, D.J.: Information theory, inference, and learning algorithms, vol. 7. Citeseer (2003)
Oh, I.-S., Lee, J.-S., Moon, B.-R.: Hybrid genetic algorithms for feature selection. IEEE Trans. Pattern Anal. Mach. Intell. 26, 1424–1437 (2004)
Quinlan, J.R.: C4.5: Programs for Machine Learning. Elsevier, New York (2014)
Schafer, J.L.: Analysis of Incomplete Multivariate Data. CRC Press, Boca Raton (1997)
Schafer, J.L., Graham, J.W.: Missing data: our view of the state of the art. Psycholog. Meth. 7, 147 (2002)
Wang, X., Yang, J., Teng, X., Xia, W., Jensen, R.: Feature selection based on rough sets and particle swarm optimization. Pattern Recogn. Lett. 4, 459–471 (2007)
Xue, B.: Particle Swarm Optimisation for Feature Selection in Classification. Victoria University of Wellington (2014)
Xue, B., Zhang, M., Browne, W.N.: Particle swarm optimization for feature selection in classification: a multi-objective approach. IEEE Trans. Cybern. 43, 1656–1671 (2013)
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Tran, C.T., Zhang, M., Andreae, P., Xue, B. (2016). A Wrapper Feature Selection Approach to Classification with Missing Data. In: Squillero, G., Burelli, P. (eds) Applications of Evolutionary Computation. EvoApplications 2016. Lecture Notes in Computer Science(), vol 9597. Springer, Cham. https://doi.org/10.1007/978-3-319-31204-0_44
Download citation
DOI: https://doi.org/10.1007/978-3-319-31204-0_44
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-31203-3
Online ISBN: 978-3-319-31204-0
eBook Packages: Computer ScienceComputer Science (R0)