
An Experimental Study of Weight Initialization and Lamarckian Inheritance on Neuroevolution

  • Conference paper
  • In: Applications of Evolutionary Computation (EvoApplications 2021)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 12694)

Abstract

Weight initialization is critical to successfully training artificial neural networks (ANNs), and even more so for recurrent neural networks (RNNs), which can easily suffer from vanishing and exploding gradients. In neuroevolution, where evolutionary algorithms are applied to neural architecture search, weights typically need to be initialized at three different times: when initial genomes (ANN architectures) are created, when offspring genomes are generated by crossover, and when new nodes or edges are created during mutation. This work explores the differences between the state-of-the-art Xavier and Kaiming methods and a novel Lamarckian weight inheritance strategy for weight initialization during crossover and mutation operations. These are examined using the Evolutionary eXploration of Augmenting Memory Models (EXAMM) neuroevolution algorithm, which can evolve RNNs with a variety of modern memory cells (e.g., LSTM, GRU, MGU, UGRNN and Delta-RNN cells) as well as recurrent connections with varying time skips, through a high-performance, island-based distributed evolutionary algorithm. Results show, with statistical significance, that the Lamarckian strategy outperforms both Kaiming and Xavier weight initialization and speeds up neuroevolution by requiring fewer backpropagation epochs per evaluated genome, and that the neuroevolutionary process provides further benefits to neural network weight optimization.
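For reference, the two baseline strategies the abstract names have simple closed forms. The sketch below is a minimal NumPy illustration of the standard Xavier (uniform) [11] and Kaiming (normal) [12] rules applied to a single weight matrix; it is not code from EXAMM itself, and the example layer sizes are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(42)

def xavier_uniform(fan_in, fan_out):
    # Glorot & Bengio [11]: U[-limit, limit] with limit = sqrt(6 / (fan_in + fan_out)).
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_out, fan_in))

def kaiming_normal(fan_in, fan_out):
    # He et al. [12]: N(0, 2 / fan_in), derived for rectifier activations.
    return rng.normal(0.0, np.sqrt(2.0 / fan_in), size=(fan_out, fan_in))

# Initialize a weight matrix for a layer with 64 inputs and 32 outputs.
w_xavier = xavier_uniform(64, 32)
w_kaiming = kaiming_normal(64, 32)
print(w_xavier.shape, round(w_xavier.std(), 3), round(w_kaiming.std(), 3))
```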

This material is based upon work supported by the U.S. Department of Energy, Office of Science, Office of Advanced Combustion Systems under Award Number #FE0031547 and by the Federal Aviation Administration and MITRE Corporation under the National General Aviation Flight Information Database (NGAFID) award.

Notes

  1. Fan in is the number of input signals feeding into a layer; fan out is the number of output signals coming out of it.

  2. Components are identified as the same when they share an innovation number, which is created uniquely by the neuroevolution process when an architectural component is added to a genome and is inherited by children during crossover and mutation, as in the NEAT algorithm [26] (see the sketch following these notes).

  3. These data sets are made publicly available at the EXAMM GitHub repository: https://github.com/travisdesell/exact/tree/master/datasets/.

  4. https://opendata-renewables.engie.com.
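For concreteness, the Lamarckian strategy and the innovation-number matching described in note 2 can be sketched in a few lines. The sketch below is a hypothetical simplification for exposition, not EXAMM's actual implementation: it assumes each weight is stored in a dictionary keyed by innovation number, and the `inherit_weights` helper and `sigma` parameter are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def inherit_weights(child_innovations, parent_weights, sigma=0.1):
    """Lamarckian inheritance sketch: components matched to a parent by
    innovation number reuse the parent's trained weight; components new
    to the child fall back to a random draw."""
    weights = {}
    for innovation in child_innovations:
        if innovation in parent_weights:
            # Matched component: the child starts from the trained weight.
            weights[innovation] = parent_weights[innovation]
        else:
            # New node/edge created by mutation or crossover: random init.
            weights[innovation] = rng.normal(0.0, sigma)
    return weights

parent = {1: 0.83, 2: -0.41, 3: 0.05}       # trained parent weights
child = inherit_weights([1, 3, 7], parent)  # innovation 7 is new to the child
print(child)                                # weights 1 and 3 are inherited
```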

References

  1. Aly, A., Weikersdorfer, D., Delaunay, C.: Optimizing deep neural networks with multiple search neuroevolution. arXiv preprint arXiv:1901.05988 (2019)

  2. Camero, A., Toutouh, J., Alba, E.: Low-cost recurrent neural network expected performance evaluation. arXiv preprint arXiv:1805.07159 (2018)

  3. Camero, A., Toutouh, J., Alba, E.: A specialized evolutionary strategy using mean absolute error random sampling to design recurrent neural networks. arXiv preprint arXiv:1909.02425 (2019)

  4. Chung, J., Gulcehre, C., Cho, K., Bengio, Y.: Empirical evaluation of gated recurrent neural networks on sequence modeling. arXiv preprint arXiv:1412.3555 (2014)

  5. Collins, J., Sohl-Dickstein, J., Sussillo, D.: Capacity and trainability in recurrent neural networks. arXiv preprint arXiv:1611.09913 (2016)

  6. Deb, K., Pratap, A., Agarwal, S., Meyarivan, T.: A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Trans. Evol. Comput. 6(2), 182–197 (2002)

  7. Desell, T.: Accelerating the evolution of convolutional neural networks with node-level mutations and epigenetic weight initialization. In: Proceedings of the Genetic and Evolutionary Computation Conference Companion, pp. 157–158. ACM (2018)

  8. Desell, T., ElSaid, A., Ororbia, A.G.: An empirical exploration of deep recurrent connections using neuro-evolution. In: The 23rd International Conference on the Applications of Evolutionary Computation (EvoStar: EvoApps 2020), Seville, Spain, April 2020

  9. ElSaid, A., El Jamiy, F., Higgins, J., Wild, B., Desell, T.: Optimizing long short-term memory recurrent neural networks using ant colony optimization to predict turbine engine vibration. Appl. Soft Comput. 73, 969–991 (2018)

  10. ElSaid, A., Karns, J., Lyu, Z., Krutz, D., Ororbia, A., Desell, T.: Improving neuroevolutionary transfer learning of deep recurrent neural networks through network-aware adaptation. In: Proceedings of the 2020 Genetic and Evolutionary Computation Conference, pp. 315–323 (2020)

  11. Glorot, X., Bengio, Y.: Understanding the difficulty of training deep feedforward neural networks. In: Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, pp. 249–256 (2010)

  12. He, K., Zhang, X., Ren, S., Sun, J.: Delving deep into rectifiers: surpassing human-level performance on ImageNet classification. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1026–1034 (2015)

  13. Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997)

  14. Jozefowicz, R., Zaremba, W., Sutskever, I.: An empirical exploration of recurrent network architectures. In: International Conference on Machine Learning, pp. 2342–2350 (2015)

  15. Ku, K.W., Mak, M.W.: Exploring the effects of Lamarckian and Baldwinian learning in evolving recurrent neural networks. In: Proceedings of 1997 IEEE International Conference on Evolutionary Computation (ICEC’97), pp. 617–621. IEEE (1997)

  16. Liu, C., et al.: Progressive neural architecture search. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 19–34 (2018)

  17. Liu, Y., Sun, Y., Xue, B., Zhang, M., Yen, G.: A survey on evolutionary neural architecture search. arXiv preprint arXiv:2008.10937 (2020)

  18. Lu, Z., et al.: NSGA-Net: neural architecture search using multi-objective genetic algorithm. In: Proceedings of the Genetic and Evolutionary Computation Conference, pp. 419–427 (2019)

  19. Ororbia, A., ElSaid, A., Desell, T.: Investigating recurrent neural network memory structures using neuro-evolution. In: Proceedings of the Genetic and Evolutionary Computation Conference. GECCO 2019, pp. 446–455. ACM, New York, NY, USA (2019). https://doi.org/10.1145/3321707.3321795

  20. Ororbia II, A.G., Mikolov, T., Reitter, D.: Learning simpler language models with the differential state framework. Neural Comput. 1–26 (2017). https://doi.org/10.1162/neco_a_01017. PMID: 28957029

  21. Pascanu, R., Mikolov, T., Bengio, Y.: On the difficulty of training recurrent neural networks. In: International Conference on Machine Learning, pp. 1310–1318 (2013)

  22. Pham, H., Guan, M.Y., Zoph, B., Le, Q.V., Dean, J.: Efficient neural architecture search via parameter sharing. arXiv preprint arXiv:1802.03268 (2018)

  23. Prellberg, J., Kramer, O.: Lamarckian evolution of convolutional neural networks. In: Auger, A., Fonseca, C.M., Lourenço, N., Machado, P., Paquete, L., Whitley, D. (eds.) PPSN 2018. LNCS, vol. 11102, pp. 424–435. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-99259-4_34

  24. Real, E., et al.: Large-scale evolution of image classifiers. arXiv preprint arXiv:1703.01041 (2017)

  25. Rochester Institute of Technology: Research computing services (2019). https://doi.org/10.34788/0S3G-QD15, https://www.rit.edu/researchcomputing/

  26. Stanley, K., Miikkulainen, R.: Evolving neural networks through augmenting topologies. Evol. Comput. 10(2), 99–127 (2002)

  27. Stanley, K.O., Clune, J., Lehman, J., Miikkulainen, R.: Designing neural networks through neuroevolution. Nat. Mach. Intell. 1(1), 24–35 (2019)

  28. Stanley, K.O., D’Ambrosio, D.B., Gauci, J.: A hypercube-based encoding for evolving large-scale neural networks. Artif. Life 15(2), 185–212 (2009)

  29. Zhang, Q., Li, H.: MOEA/D: a multiobjective evolutionary algorithm based on decomposition. IEEE Trans. Evol. Comput. 11(6), 712–731 (2007)

  30. Zhou, G.B., Wu, J., Zhang, C.L., Zhou, Z.H.: Minimal gated unit for recurrent neural networks. Int. J. Autom. Comput. 13(3), 226–234 (2016)

Acknowledgements

Most of the computation for this research was performed on the high performance computing clusters of Research Computing at Rochester Institute of Technology [25]. We thank the Research Computing team for their assistance and for the support they generously provided to make the heavy computation this study required possible.

Author information

Correspondence to Zimeng Lyu or Travis Desell.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (PDF 123 KB)

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Lyu, Z., ElSaid, A., Karns, J., Mkaouer, M., Desell, T. (2021). An Experimental Study of Weight Initialization and Lamarckian Inheritance on Neuroevolution. In: Castillo, P.A., Jiménez Laredo, J.L. (eds.) Applications of Evolutionary Computation. EvoApplications 2021. Lecture Notes in Computer Science, vol. 12694. Springer, Cham. https://doi.org/10.1007/978-3-030-72699-7_37

  • DOI: https://doi.org/10.1007/978-3-030-72699-7_37

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-72698-0

  • Online ISBN: 978-3-030-72699-7

  • eBook Packages: Computer Science, Computer Science (R0)
