Abstract
Weight initialization is critical in being able to successfully train artificial neural networks (ANNs), and even more so for recurrent neural networks (RNNs) which can easily suffer from vanishing and exploding gradients. In neuroevolution, where evolutionary algorithms are applied to neural architecture search, weights typically need to be initialized at three different times: when the initial genomes (ANN architectures) are created, when offspring genomes are generated by crossover, and when new nodes or edges are created during mutation. This work explores the difference between the state-of-the-art Xavier and Kaiming methods, and novel Lamarckian weight inheritance for weight initialization during crossover and mutation operations. These are examined using the Evolutionary eXploration of Augmenting Memory Models (EXAMM) neuroevolution algorithm, which is capable of evolving RNNs with a variety of modern memory cells (e.g., LSTM, GRU, MGU, UGRNN and Delta-RNN cells) as well as recurrent connections with varying time skips through a high performance island based distributed evolutionary algorithm. Results show that with statistical significance, the Lamarckian strategy outperforms both Kaiming and Xavier weight initialization, can speed neuroevolution by requiring less backpropagation epochs to be evaluated per genome, and that the neuroevolutionary process provides further benefits to neural network weight optimization.
This material is based upon work supported by the U.S. Department of Energy, Office of Science, Office of Advanced Combustion Systems under Award Number #FE0031547 and by the Federal Aviation Administration and MITRE Corporation under the National General Aviation Flight Information Database (NGAFID) award.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Notes
- 1.
Fan in is the number of input signals that feed into the layer, fan out is the number of output signals that come out of the layer.
- 2.
Components are identified as being the same by having the same innovation number, which is uniquely created by the neuroevolution process when an architectural component is added to a genome, and are inherited by children on crossover and mutation, as in the NEAT algorithm [26].
- 3.
These data sets are made publicly available at EXAMM GitHub repository: https://github.com/travisdesell/exact/tree/master/datasets/.
- 4.
References
Aly, A., Weikersdorfer, D., Delaunay, C.: Optimizing deep neural networks with multiple search neuroevolution. arXiv preprint arXiv:1901.05988 (2019)
Camero, A., Toutouh, J., Alba, E.: Low-cost recurrent neural network expected performance evaluation. arXiv preprint arXiv:1805.07159 (2018)
Camero, A., Toutouh, J., Alba, E.: A specialized evolutionary strategy using mean absolute error random sampling to design recurrent neural networks. arXiv preprint arXiv:1909.02425 (2019)
Chung, J., Gulcehre, C., Cho, K., Bengio, Y.: Empirical evaluation of gated recurrent neural networks on sequence modeling. arXiv preprint arXiv:1412.3555 (2014)
Collins, J., Sohl-Dickstein, J., Sussillo, D.: Capacity and trainability in recurrent neural networks. arXiv preprint arXiv:1611.09913 (2016)
Deb, K., Pratap, A., Agarwal, S., Meyarivan, T.: A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Trans. Evol. Comput. 6(2), 182–197 (2002)
Desell, T.: Accelerating the evolution of convolutional neural networks with node-level mutations and epigenetic weight initialization. In: Proceedings of the Genetic and Evolutionary Computation Conference Companion, pp. 157–158. ACM (2018)
Desell, T., ElSaid, A., Ororbia, A.G.: An empirical exploration of deep recurrent connections using neuro-evolution. In: The 23nd International Conference on the Applications of Evolutionary Computation (EvoStar: EvoApps 2020), Seville, Spain, April 2020
ElSaid, A., El Jamiy, F., Higgins, J., Wild, B., Desell, T.: Optimizing long short-term memory recurrent neural networks using ant colony optimization to predict turbine engine vibration. Appl. Soft Comput. 73, 969–991 D(2018)
ElSaid, A., Karns, J., Lyu, Z., Krutz, D., Ororbia, A., Desell, T.: Improving neuroevolutionary transfer learning of deep recurrent neural networks through network-aware adaptation. In: Proceedings of the 2020 Genetic and Evolutionary Computation Conference, pp. 315–323 (2020)
Glorot, X., Bengio, Y.: Understanding the difficulty of training deep feedforward neural networks. In: Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, pp. 249–256 (2010)
He, K., Zhang, X., Ren, S., Sun, J.: Delving deep into rectifiers: surpassing human-level performance on ImageNet classification. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1026–1034 (2015)
Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997)
Jozefowicz, R., Zaremba, W., Sutskever, I.: An empirical exploration of recurrent network architectures. In: International Conference on Machine Learning, pp. 2342–2350 (2015)
Ku, K.W., Mak, M.W.: Exploring the effects of Lamarckian and Baldwinian learning in evolving recurrent neural networks. In: Proceedings of 1997 IEEE International Conference on Evolutionary Computation (ICEC’97), pp. 617–621. IEEE (1997)
Liu, C., et al.: Progressive neural architecture search. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 19–34 (2018)
Liu, Y., Sun, Y., Xue, B., Zhang, M., Yen, G.: A survey on evolutionary neural architecture search. arXiv preprint arXiv:2008.10937 (2020)
Lu, Z., et al.: NSGA-Net: neural architecture search using multi-objective genetic algorithm. In: Proceedings of the Genetic and Evolutionary Computation Conference, pp. 419–427 (2019)
Ororbia, A., ElSaid, A., Desell, T.: Investigating recurrent neural network memory structures using neuro-evolution. In: Proceedings of the Genetic and Evolutionary Computation Conference. GECCO 2019, pp. 446–455. ACM, New York, NY, USA (2019). https://doi.org/10.1145/3321707.3321795
Ororbia II, A.G., Mikolov, T., Reitter, D.: Learning simpler language models with the differential state framework. Neural Comput. 1–26 (2017). https://doi.org/10.1162/neco_a_01017, pMID: 28957029
Pascanu, R., Mikolov, T., Bengio, Y.: On the difficulty of training recurrent neural networks. In: International Conference on Machine Learning, pp. 1310–1318 (2013)
Pham, H., Guan, M.Y., Zoph, B., Le, Q.V., Dean, J.: Efficient neural architecture search via parameter sharing. arXiv preprint arXiv:1802.03268 (2018)
Prellberg, J., Kramer, O.: Lamarckian evolution of convolutional neural networks. In: Auger, A., Fonseca, C.M., Lourenço, N., Machado, P., Paquete, L., Whitley, D. (eds.) PPSN 2018. LNCS, vol. 11102, pp. 424–435. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-99259-4_34
Real, E., et al.: Large-scale evolution of image classifiers. arXiv preprint arXiv:1703.01041 (2017)
Rochester Institute of Technology: Research computing services (2019). https://doi.org/10.34788/0S3G-QD15, https://www.rit.edu/researchcomputing/
Stanley, K., Miikkulainen, R.: Evolving neural networks through augmenting topologies. Evol. Comput. 10(2), 99–127 (2002)
Stanley, K.O., Clune, J., Lehman, J., Miikkulainen, R.: Designing neural networks through neuroevolution. Nat. Mach. Intell. 1(1), 24–35 (2019)
Stanley, K.O., D’Ambrosio, D.B., Gauci, J.: A hypercube-based encoding for evolving large-scale neural networks. Artif. Life 15(2), 185–212 (2009)
Zhang, Q., Li, H.: MOEA/D: a multiobjective evolutionary algorithm based on decomposition. IEEE Trans. Evol. Comput. 11(6), 712–731 (2007)
Zhou, G.B., Wu, J., Zhang, C.L., Zhou, Z.H.: Minimal gated unit for recurrent neural networks. Int. J. Autom. Comput. 13(3), 226–234 (2016)
Acknowledgements
Most of the computation of this research was done on the high performance computing clusters of Research Computing at Rochester Institute of Technology [25]. We would like to thank the Research Computing team for their assistance and the support they generously offered to ensure that the heavy computation this study required was available.
Author information
Authors and Affiliations
Corresponding authors
Editor information
Editors and Affiliations
1 Electronic supplementary material
Below is the link to the electronic supplementary material.
Rights and permissions
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Lyu, Z., ElSaid, A., Karns, J., Mkaouer, M., Desell, T. (2021). An Experimental Study of Weight Initialization and Lamarckian Inheritance on Neuroevolution. In: Castillo, P.A., Jiménez Laredo, J.L. (eds) Applications of Evolutionary Computation. EvoApplications 2021. Lecture Notes in Computer Science(), vol 12694. Springer, Cham. https://doi.org/10.1007/978-3-030-72699-7_37
Download citation
DOI: https://doi.org/10.1007/978-3-030-72699-7_37
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-72698-0
Online ISBN: 978-3-030-72699-7
eBook Packages: Computer ScienceComputer Science (R0)