Abstract
The Stacked Sparse Auto-Encoder (SSAE) is a well-known hierarchical deep neural network that simulates the deep architecture of the mammalian brain. An SSAE can be trained in a greedy layer-wise manner using gradient-based methods such as limited-memory BFGS (LBFGS). However, gradient-based methods have several disadvantages, the main one being their sensitivity to the initial values. In this paper, a gradient-based meta-heuristic algorithm, referred to as GCIWOSS, is used to optimize the weights and biases of the SSAE. A chaos strategy is first used to initialize the population of the invasive weed optimization (IWO) algorithm, and a new selection strategy is then adopted to improve population diversity and increase global exploration ability. The improved IWO prepares for the subsequent gradient-based exploitation, which helps avoid falling into local optima. In the experiments, the proposed algorithm is shown to be effective at extracting features from different image datasets, compared with LBFGS and several other feature learning models.
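The two-phase idea in the abstract (a chaotically initialized population for exploration, then gradient-based exploitation) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the logistic map as the chaotic sequence, the toy quadratic loss standing in for the SSAE objective, plain gradient descent standing in for LBFGS, and the hypothetical names `chaotic_population`, `toy_loss`, `toy_grad` and `refine`. The IWO reproduction step and the new selection strategy are omitted.

```python
import numpy as np


def chaotic_population(pop_size, dim, low, high, x0=0.37):
    """Build a (pop_size, dim) population from a logistic-map sequence.

    Assumption: the logistic map x <- 4x(1-x) is used as the chaotic
    generator; the paper's actual chaos strategy may differ.
    """
    values = np.empty(pop_size * dim)
    x = x0
    for i in range(values.size):
        x = 4.0 * x * (1.0 - x)  # logistic map in its chaotic regime
        values[i] = x
    # Rescale the sequence from [0, 1] to the search interval [low, high].
    return low + (high - low) * values.reshape(pop_size, dim)


def toy_loss(w):
    """Hypothetical stand-in for the SSAE cost: minimum at w = 1."""
    return float(np.sum((w - 1.0) ** 2))


def toy_grad(w):
    """Gradient of toy_loss."""
    return 2.0 * (w - 1.0)


def refine(w, steps=200, lr=0.1):
    """Plain gradient descent, used here in place of LBFGS."""
    for _ in range(steps):
        w = w - lr * toy_grad(w)
    return w


# Exploration: chaotic initialization of candidate weight vectors.
population = chaotic_population(pop_size=10, dim=5, low=-2.0, high=2.0)
# Exploitation: refine the best candidate with gradient steps.
best = min(population, key=toy_loss)
refined = refine(best.copy())
```

In this sketch the population only supplies a starting point for the gradient phase; in a population-based method such as IWO, the candidates would also reproduce and compete before any gradient refinement.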
Data availability
The data used to support the findings of this study come from the related references cited in the article. They are also available from the corresponding author upon request.
Acknowledgements
This work was supported by the National Natural Science Foundation of China (62002105, 61672010, 61701173 and 61702168), the Ph.D. Programs Foundation (BSQD2019024), and the Provincial Education Project (B2018310).
Ethics declarations
Conflict of interest
The authors declare that there is no conflict of interest regarding the publication of this paper.
About this article
Cite this article
Liu, B., Nie, L. Gradient based invasive weed optimization algorithm for the training of deep neural network. Multimed Tools Appl 80, 22795–22819 (2021). https://doi.org/10.1007/s11042-020-10495-3
DOI: https://doi.org/10.1007/s11042-020-10495-3