
Gradient based invasive weed optimization algorithm for the training of deep neural network

  • 1130T: Machine Learning and Soft Computing Applications in Multimedia
  • Published in Multimedia Tools and Applications

Abstract

The Stacked Sparse Auto-Encoder (SSAE) is a well-known hierarchical deep neural network that simulates the deep architecture of the mammalian brain. An SSAE can be trained in a greedy layer-wise manner with gradient-based methods such as limited-memory BFGS (L-BFGS). However, gradient-based methods have a major disadvantage: they are sensitive to the initial values. In this paper, a gradient-based meta-heuristic algorithm, referred to as GCIWOSS, is used to optimize the weights and biases of an SSAE. A chaos strategy is first used to initialize the population of the invasive weed optimization (IWO) algorithm, and a new selection strategy is then adopted to improve population diversity and strengthen global exploration. The improved IWO prepares for the subsequent gradient-based exploitation, which helps the search avoid falling into local optima. In the experiments, the proposed algorithm proves effective at extracting features from several image datasets, compared with L-BFGS and several other feature-learning models.
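The pipeline sketched in the abstract (chaotic population initialization, IWO-style exploration, then gradient-based exploitation) can be illustrated in a few lines. This is a minimal sketch, not the paper's implementation: it assumes a logistic-map chaos sequence for initialization and substitutes plain gradient descent for the paper's L-BFGS refinement; all function names, parameter values, and schedules here are illustrative.

```python
import numpy as np

def chaotic_init(pop_size, dim, x0=0.7, lo=-1.0, hi=1.0):
    # Logistic-map chaos sequence (r = 4, fully chaotic regime) mapped
    # into [lo, hi]; an illustrative stand-in for the chaos strategy.
    pop = np.empty((pop_size, dim))
    x = x0
    for i in range(pop_size):
        for j in range(dim):
            x = 4.0 * x * (1.0 - x)
            pop[i, j] = lo + (hi - lo) * x
    return pop

def iwo_gradient_train(loss, grad, dim, pop_size=8, iters=30,
                       smax=4, smin=1, sigma0=0.5, sigma_end=0.01,
                       gd_steps=50, lr=0.1):
    """Chaos-initialized IWO exploration, then gradient exploitation."""
    pop = chaotic_init(pop_size, dim)
    for t in range(iters):
        fit = np.array([loss(w) for w in pop])
        worst, best = fit.max(), fit.min()
        # Seed-dispersal spread shrinks as iterations progress (IWO rule).
        sigma = sigma_end + (sigma0 - sigma_end) * ((iters - t) / iters) ** 2
        seeds = []
        for w, f in zip(pop, fit):
            # Fitter weeds scatter more seeds (linear map into [smin, smax]).
            ratio = (worst - f) / (worst - best + 1e-12)
            n_seeds = int(smin + (smax - smin) * ratio)
            for _ in range(n_seeds):
                seeds.append(w + sigma * np.random.randn(dim))
        # Competitive exclusion: keep the pop_size best of parents + seeds.
        all_w = np.vstack([pop] + [s[None, :] for s in seeds])
        all_f = np.array([loss(w) for w in all_w])
        pop = all_w[np.argsort(all_f)[:pop_size]]
    # Exploitation phase: refine the best weed with a gradient method
    # (the paper uses L-BFGS; plain gradient descent keeps this self-contained).
    w = pop[0].copy()
    for _ in range(gd_steps):
        w -= lr * grad(w)
    return w
```

On a toy sphere objective (`loss(w) = ||w||^2`, `grad(w) = 2w`) the chaotic population spreads the initial weeds over the search box, the IWO loop concentrates them near the basin of the optimum, and the gradient phase finishes the local descent, mirroring the exploration-then-exploitation division of labor the abstract describes.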


Figures 1–17 appear in the full article.


Data availability

The [DATA TYPE] data used to support the findings of this study are taken from the related references cited in the article. They are also available from the corresponding author upon request.


Acknowledgements

This work was supported by the National Natural Science Foundation of China (62002105, 61672010, 61701173 and 61702168), the Ph.D. Programs Foundation (BSQD2019024), and a provincial education project (B2018310).

Author information


Corresponding author

Correspondence to Bai Liu.

Ethics declarations

Conflict of interest

The authors declare that there is no conflict of interest regarding the publication of this paper.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Liu, B., Nie, L. Gradient based invasive weed optimization algorithm for the training of deep neural network. Multimed Tools Appl 80, 22795–22819 (2021). https://doi.org/10.1007/s11042-020-10495-3

