Abstract
This study brings together two related areas, deep learning and swarm intelligence, for missing data estimation in high-dimensional datasets. The growing number of studies in deep learning warrants a closer look at its possible application in this domain. Missing data are unavoidable in present-day datasets and pose nontrivial challenges for existing techniques, which are built on narrow artificial intelligence architectures and computational intelligence methods. These challenges can be attributed to the large number of samples and the high number of features. In this paper, we propose a new framework for the imputation procedure that combines a deep learning method with a swarm intelligence algorithm, called Deep Learning-Cuckoo Search (DL-CS). This technique is compared to similar approaches and other existing methods. The time required to obtain accurate estimates for the missing data entries exceeds that of existing methods, but this is considered a worthwhile trade-off given the accuracy of those estimates in a high-dimensional setting.
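The abstract does not spell out how the deep learning model and the cuckoo search interact, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes the imputation is posed as an optimization problem: candidate values for the missing entries are scored by how well a trained reconstruction model reproduces the completed record, and a cuckoo search looks for the candidates with the lowest reconstruction error. To stay self-contained, a linear PCA projection stands in for the trained deep autoencoder, and the cuckoo search is a simplified variant (Lévy flights with greedy replacement, plus random re-initialization of the worst nests). All function names, parameter values, and the synthetic data are hypothetical.

```python
import numpy as np
from math import gamma, sin, pi

def fit_linear_autoencoder(X, k):
    """Assumed stand-in for a trained deep autoencoder: PCA projection onto k components."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    basis = vt[:k]                                  # shape (k, n_features)
    return lambda x: mean + (x - mean) @ basis.T @ basis

def levy_step(rng, size, beta=1.5):
    """Heavy-tailed step lengths drawn with Mantegna's algorithm."""
    sigma = (gamma(1 + beta) * sin(pi * beta / 2)
             / (gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    return rng.normal(0, sigma, size) / np.abs(rng.normal(0, 1, size)) ** (1 / beta)

def cuckoo_search(objective, dim, low, high,
                  n_nests=15, n_iter=200, pa=0.25, alpha=0.1, seed=0):
    """Simplified cuckoo search; returns the best nest and the best-fitness history."""
    rng = np.random.default_rng(seed)
    nests = rng.uniform(low, high, (n_nests, dim))
    fit = np.array([objective(n) for n in nests])
    history = [fit.min()]
    n_abandon = int(pa * n_nests)
    for _ in range(n_iter):
        best = nests[fit.argmin()].copy()
        for i in range(n_nests):
            # Levy flight scaled by the distance to the current best nest
            step = alpha * levy_step(rng, dim) * (nests[i] - best)
            cand = np.clip(nests[i] + step, low, high)
            f = objective(cand)
            if f < fit[i]:                          # keep the candidate only if it improves
                nests[i], fit[i] = cand, f
        if n_abandon > 0:
            # abandon the worst nests and rebuild them at random positions
            worst = np.argsort(fit)[-n_abandon:]
            nests[worst] = rng.uniform(low, high, (n_abandon, dim))
            fit[worst] = [objective(n) for n in nests[worst]]
        history.append(fit.min())
    return nests[fit.argmin()], history

# Synthetic data with rank-2 structure (illustrative only)
data_rng = np.random.default_rng(42)
X = data_rng.normal(size=(500, 2)) @ data_rng.normal(size=(2, 6))
reconstruct = fit_linear_autoencoder(X[:-1], k=2)

# Pretend features 1 and 4 of the last record are missing
missing = np.array([1, 4])
record = X[-1].copy()
record[missing] = np.nan

def reconstruction_error(z):
    """Squared reconstruction error of the record completed with candidate values z."""
    filled = record.copy()
    filled[missing] = z
    return float(np.sum((filled - reconstruct(filled)) ** 2))

low, high = X.min(), X.max()
estimate, history = cuckoo_search(reconstruction_error, len(missing), low, high)
print("estimated:", estimate, "true:", X[-1, missing])
```

Each objective evaluation requires a full pass through the reconstruction model, and the search performs many such evaluations per record, which is consistent with the abstract's remark that the estimates take longer to obtain than with existing methods.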
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Leke, C., Ndjiongue, A.R., Twala, B., Marwala, T. (2017). A Deep Learning-Cuckoo Search Method for Missing Data Estimation in High-Dimensional Datasets. In: Tan, Y., Takagi, H., Shi, Y. (eds) Advances in Swarm Intelligence. ICSI 2017. Lecture Notes in Computer Science, vol 10385. Springer, Cham. https://doi.org/10.1007/978-3-319-61824-1_61
DOI: https://doi.org/10.1007/978-3-319-61824-1_61
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-61823-4
Online ISBN: 978-3-319-61824-1
eBook Packages: Computer Science (R0)