An Experimental Study of Weight Initialization and Lamarckian Inheritance on Neuroevolution

Lyu, Zimeng; ElSaid, AbdElRahman; Karns, Joshua; Mkaouer, Mohamed; Desell, Travis

doi:10.1007/978-3-030-72699-7_37

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 12694)

Included in the following conference series:

International Conference on the Applications of Evolutionary Computation (Part of EvoStar)

Abstract

Weight initialization is critical to successfully training artificial neural networks (ANNs), and even more so for recurrent neural networks (RNNs), which can easily suffer from vanishing and exploding gradients. In neuroevolution, where evolutionary algorithms are applied to neural architecture search, weights typically need to be initialized at three different times: when the initial genomes (ANN architectures) are created, when offspring genomes are generated by crossover, and when new nodes or edges are created during mutation. This work explores the difference between the state-of-the-art Xavier and Kaiming methods and a novel Lamarckian weight inheritance strategy for weight initialization during crossover and mutation operations. These are examined using the Evolutionary eXploration of Augmenting Memory Models (EXAMM) neuroevolution algorithm, which is capable of evolving RNNs with a variety of modern memory cells (e.g., LSTM, GRU, MGU, UGRNN and Delta-RNN cells) as well as recurrent connections with varying time skips through a high-performance, island-based distributed evolutionary algorithm. Results show, with statistical significance, that the Lamarckian strategy outperforms both Kaiming and Xavier weight initialization, that it can speed up neuroevolution by requiring fewer backpropagation epochs to be evaluated per genome, and that the neuroevolutionary process provides further benefits to neural network weight optimization.
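
For readers unfamiliar with the two baseline methods, the following is a minimal sketch of the standard Xavier (Glorot and Bengio) uniform and Kaiming (He et al.) normal initializers, written in terms of a layer's fan in and fan out. The function names and the list-of-lists weight layout are illustrative assumptions, and the abstract does not say which variants (uniform or normal) the study uses, so this should not be read as the EXAMM implementation.

```python
import math
import random


def xavier_uniform(fan_in, fan_out, rng):
    """Standard Xavier/Glorot uniform rule: U(-a, a) with a = sqrt(6 / (fan_in + fan_out))."""
    limit = math.sqrt(6.0 / (fan_in + fan_out))
    # One row per output unit, one column per input unit (hypothetical layout).
    return [[rng.uniform(-limit, limit) for _ in range(fan_in)]
            for _ in range(fan_out)]


def kaiming_normal(fan_in, fan_out, rng):
    """Standard Kaiming/He normal rule: N(0, std^2) with std = sqrt(2 / fan_in)."""
    std = math.sqrt(2.0 / fan_in)
    return [[rng.gauss(0.0, std) for _ in range(fan_in)]
            for _ in range(fan_out)]


if __name__ == "__main__":
    rng = random.Random(0)  # seeded only so the sketch is repeatable
    w_xavier = xavier_uniform(fan_in=4, fan_out=3, rng=rng)
    w_kaiming = kaiming_normal(fan_in=4, fan_out=3, rng=rng)
    print(len(w_xavier), len(w_xavier[0]))    # rows = fan_out, columns = fan_in
    print(len(w_kaiming), len(w_kaiming[0]))
```

Both rules scale the spread of the initial weights by the layer size so that signal variance stays roughly constant from layer to layer, which is why they are a natural baseline against inherited weights.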

This material is based upon work supported by the U.S. Department of Energy, Office of Science, Office of Advanced Combustion Systems under Award Number #FE0031547 and by the Federal Aviation Administration and MITRE Corporation under the National General Aviation Flight Information Database (NGAFID) award.

Notes

1. Fan in is the number of input signals that feed into the layer; fan out is the number of output signals that come out of the layer.
2. Components are identified as being the same by having the same innovation number, which is uniquely created by the neuroevolution process when an architectural component is added to a genome and is inherited by children on crossover and mutation, as in the NEAT algorithm [26].
3. These data sets are made publicly available in the EXAMM GitHub repository: https://github.com/travisdesell/exact/tree/master/datasets/.
4. https://opendata-renewables.engie.com.
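
Note 2 describes how components are matched between genomes by innovation number. A minimal sketch of how that matching could drive Lamarckian weight inheritance is given below. The dictionary genome representation, the function names, and the rule of taking the fitter parent's trained weight for shared components are all assumptions made for illustration; the exact recombination rule used by EXAMM is not stated in this section.

```python
import random


def lamarckian_crossover(fitter_parent, other_parent):
    """Build a child whose weights are inherited rather than re-initialized.

    Each parent is a dict mapping innovation number -> trained weight
    (a hypothetical, simplified genome). Components present in both parents
    take the fitter parent's weight; this tie-break is an assumption.
    """
    child = {}
    for innovation in sorted(set(fitter_parent) | set(other_parent)):
        if innovation in fitter_parent:
            child[innovation] = fitter_parent[innovation]
        else:
            child[innovation] = other_parent[innovation]
    return child


def mutate_add_component(genome, new_innovation, init_weight):
    """Copy the parent's trained weights and give only the new component a fresh weight."""
    child = dict(genome)
    child[new_innovation] = init_weight()
    return child


if __name__ == "__main__":
    rng = random.Random(0)
    parent_a = {1: 0.5, 2: -0.3}   # fitter parent, weights after training
    parent_b = {2: 0.9, 3: 0.1}
    child = lamarckian_crossover(parent_a, parent_b)
    print(child)
    # The initializer for the new component is a placeholder choice.
    mutated = mutate_add_component(child, 4, lambda: rng.uniform(-0.5, 0.5))
    print(sorted(mutated))
```

The point of the sketch is only that inherited components keep their trained values, so a child starts backpropagation from its parents' weights instead of from a fresh random draw.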

References

1. Aly, A., Weikersdorfer, D., Delaunay, C.: Optimizing deep neural networks with multiple search neuroevolution. arXiv preprint arXiv:1901.05988 (2019)
2. Camero, A., Toutouh, J., Alba, E.: Low-cost recurrent neural network expected performance evaluation. arXiv preprint arXiv:1805.07159 (2018)
3. Camero, A., Toutouh, J., Alba, E.: A specialized evolutionary strategy using mean absolute error random sampling to design recurrent neural networks. arXiv preprint arXiv:1909.02425 (2019)
4. Chung, J., Gulcehre, C., Cho, K., Bengio, Y.: Empirical evaluation of gated recurrent neural networks on sequence modeling. arXiv preprint arXiv:1412.3555 (2014)
5. Collins, J., Sohl-Dickstein, J., Sussillo, D.: Capacity and trainability in recurrent neural networks. arXiv preprint arXiv:1611.09913 (2016)
6. Deb, K., Pratap, A., Agarwal, S., Meyarivan, T.: A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Trans. Evol. Comput. 6(2), 182–197 (2002)
7. Desell, T.: Accelerating the evolution of convolutional neural networks with node-level mutations and epigenetic weight initialization. In: Proceedings of the Genetic and Evolutionary Computation Conference Companion, pp. 157–158. ACM (2018)
8. Desell, T., ElSaid, A., Ororbia, A.G.: An empirical exploration of deep recurrent connections using neuro-evolution. In: The 23rd International Conference on the Applications of Evolutionary Computation (EvoStar: EvoApps 2020), Seville, Spain, April 2020
9. ElSaid, A., El Jamiy, F., Higgins, J., Wild, B., Desell, T.: Optimizing long short-term memory recurrent neural networks using ant colony optimization to predict turbine engine vibration. Appl. Soft Comput. 73, 969–991 (2018)
10. ElSaid, A., Karns, J., Lyu, Z., Krutz, D., Ororbia, A., Desell, T.: Improving neuroevolutionary transfer learning of deep recurrent neural networks through network-aware adaptation. In: Proceedings of the 2020 Genetic and Evolutionary Computation Conference, pp. 315–323 (2020)
11. Glorot, X., Bengio, Y.: Understanding the difficulty of training deep feedforward neural networks. In: Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, pp. 249–256 (2010)
12. He, K., Zhang, X., Ren, S., Sun, J.: Delving deep into rectifiers: surpassing human-level performance on ImageNet classification. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1026–1034 (2015)
13. Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997)
14. Jozefowicz, R., Zaremba, W., Sutskever, I.: An empirical exploration of recurrent network architectures. In: International Conference on Machine Learning, pp. 2342–2350 (2015)
15. Ku, K.W., Mak, M.W.: Exploring the effects of Lamarckian and Baldwinian learning in evolving recurrent neural networks. In: Proceedings of 1997 IEEE International Conference on Evolutionary Computation (ICEC’97), pp. 617–621. IEEE (1997)
16. Liu, C., et al.: Progressive neural architecture search. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 19–34 (2018)
17. Liu, Y., Sun, Y., Xue, B., Zhang, M., Yen, G.: A survey on evolutionary neural architecture search. arXiv preprint arXiv:2008.10937 (2020)
18. Lu, Z., et al.: NSGA-Net: neural architecture search using multi-objective genetic algorithm. In: Proceedings of the Genetic and Evolutionary Computation Conference, pp. 419–427 (2019)
19. Ororbia, A., ElSaid, A., Desell, T.: Investigating recurrent neural network memory structures using neuro-evolution. In: Proceedings of the Genetic and Evolutionary Computation Conference. GECCO 2019, pp. 446–455. ACM, New York, NY, USA (2019). https://doi.org/10.1145/3321707.3321795
20. Ororbia II, A.G., Mikolov, T., Reitter, D.: Learning simpler language models with the differential state framework. Neural Comput. 1–26 (2017). https://doi.org/10.1162/neco_a_01017, PMID: 28957029
21. Pascanu, R., Mikolov, T., Bengio, Y.: On the difficulty of training recurrent neural networks. In: International Conference on Machine Learning, pp. 1310–1318 (2013)
22. Pham, H., Guan, M.Y., Zoph, B., Le, Q.V., Dean, J.: Efficient neural architecture search via parameter sharing. arXiv preprint arXiv:1802.03268 (2018)
23. Prellberg, J., Kramer, O.: Lamarckian evolution of convolutional neural networks. In: Auger, A., Fonseca, C.M., Lourenço, N., Machado, P., Paquete, L., Whitley, D. (eds.) PPSN 2018. LNCS, vol. 11102, pp. 424–435. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-99259-4_34
24. Real, E., et al.: Large-scale evolution of image classifiers. arXiv preprint arXiv:1703.01041 (2017)
25. Rochester Institute of Technology: Research computing services (2019). https://doi.org/10.34788/0S3G-QD15, https://www.rit.edu/researchcomputing/
26. Stanley, K., Miikkulainen, R.: Evolving neural networks through augmenting topologies. Evol. Comput. 10(2), 99–127 (2002)
27. Stanley, K.O., Clune, J., Lehman, J., Miikkulainen, R.: Designing neural networks through neuroevolution. Nat. Mach. Intell. 1(1), 24–35 (2019)
28. Stanley, K.O., D’Ambrosio, D.B., Gauci, J.: A hypercube-based encoding for evolving large-scale neural networks. Artif. Life 15(2), 185–212 (2009)
29. Zhang, Q., Li, H.: MOEA/D: a multiobjective evolutionary algorithm based on decomposition. IEEE Trans. Evol. Comput. 11(6), 712–731 (2007)
30. Zhou, G.B., Wu, J., Zhang, C.L., Zhou, Z.H.: Minimal gated unit for recurrent neural networks. Int. J. Autom. Comput. 13(3), 226–234 (2016)

Acknowledgements

Most of the computation of this research was done on the high performance computing clusters of Research Computing at Rochester Institute of Technology [25]. We would like to thank the Research Computing team for their assistance and the support they generously offered to ensure that the heavy computation this study required was available.

Author information

Authors and Affiliations

Rochester Institute of Technology, Rochester, NY, 14623, USA
Zimeng Lyu, AbdElRahman ElSaid, Joshua Karns, Mohamed Mkaouer & Travis Desell

Corresponding authors

Correspondence to Zimeng Lyu or Travis Desell.

Editor information

Editors and Affiliations

ETSIIT-CITIC, University of Granada, Granada, Spain
Pedro A. Castillo
Université Le Havre Normandie, Le Havre, France
Juan Luis Jiménez Laredo

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 123 KB)

About this paper

Cite this paper

Lyu, Z., ElSaid, A., Karns, J., Mkaouer, M., Desell, T. (2021). An Experimental Study of Weight Initialization and Lamarckian Inheritance on Neuroevolution. In: Castillo, P.A., Jiménez Laredo, J.L. (eds) Applications of Evolutionary Computation. EvoApplications 2021. Lecture Notes in Computer Science, vol 12694. Springer, Cham. https://doi.org/10.1007/978-3-030-72699-7_37

DOI: https://doi.org/10.1007/978-3-030-72699-7_37
Published: 01 April 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-72698-0
Online ISBN: 978-3-030-72699-7
eBook Packages: Computer Science, Computer Science (R0)