Challenges in High-Dimensional Reinforcement Learning with Evolution Strategies

Müller, Nils; Glasmachers, Tobias

doi:10.1007/978-3-319-99259-4_33


Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 11102)

Included in the following conference series:

International Conference on Parallel Problem Solving from Nature


Abstract

Evolution Strategies (ESs) have recently become popular for training deep neural networks, in particular on reinforcement learning tasks, a special form of controller design. Compared to classic problems in continuous direct search, deep networks pose extremely high-dimensional optimization problems, with many thousands or even millions of variables. In addition, many control problems give rise to a stochastic fitness function. Considering the relevance of the application, we study the suitability of evolution strategies for high-dimensional, stochastic problems. Our results give insights into which algorithmic mechanisms of modern ES are of value for the class of problems at hand, and they reveal principled limitations of the approach. They are in line with our theoretical understanding of ESs. We show that combining ESs that offer reduced internal algorithm cost with uncertainty handling techniques yields promising methods for this class of problems.
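
To make the setting in the abstract concrete, the following is a minimal sketch, not the authors' method or code. It assumes a noisy sphere function as a stand-in for a stochastic reinforcement learning return, a (mu/mu, lambda)-ES with isotropic Gaussian mutations (so the internal cost per sample is linear in the dimension), a fixed step-size decay in place of proper step-size adaptation, and plain resampling with averaging as the simplest form of uncertainty handling. All function names and parameter values are hypothetical and chosen only for illustration.

```python
import random


def noisy_sphere(x, rng, noise_std=0.1):
    """Sphere function plus additive Gaussian noise (assumed toy objective)."""
    return sum(v * v for v in x) + rng.gauss(0.0, noise_std)


def averaged_fitness(x, rng, n_evals):
    """Simple uncertainty handling: average n_evals independent noisy evaluations."""
    return sum(noisy_sphere(x, rng) for _ in range(n_evals)) / n_evals


def simple_es(dim=20, lam=10, mu=5, sigma=0.3, decay=0.98,
              generations=200, n_evals=4, seed=0):
    """(mu/mu, lambda)-ES with isotropic mutations and a fixed step-size decay."""
    rng = random.Random(seed)
    mean = [1.0] * dim  # starting point; noiseless fitness here equals dim
    for _generation in range(generations):
        scored = []
        for _offspring in range(lam):
            # Isotropic Gaussian mutation: O(dim) work, no covariance matrix.
            x = [m + sigma * rng.gauss(0.0, 1.0) for m in mean]
            scored.append((averaged_fitness(x, rng, n_evals), x))
        # Rank-based selection: keep the mu offspring with the lowest averaged fitness.
        scored.sort(key=lambda pair: pair[0])
        parents = [x for _, x in scored[:mu]]
        # Intermediate recombination: the new mean is the average of the parents.
        mean = [sum(p[i] for p in parents) / mu for i in range(dim)]
        sigma *= decay  # assumed schedule, not an adaptive rule such as CSA
    return mean


if __name__ == "__main__":
    final_mean = simple_es()
    # Noiseless sphere value of the final mean, for inspection only.
    print(sum(v * v for v in final_mean))
```

Raising `n_evals` reduces the variance of each fitness estimate at the price of more evaluations per generation, which is the basic trade-off any uncertainty handling scheme has to manage on stochastic problems.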


Notes

1. https://github.com/NiMlr/High-Dim-ES-RL


Author information

Authors and Affiliations

Institut für Neuroinformatik, Ruhr-Universität Bochum, Bochum, Germany
Nils Müller & Tobias Glasmachers


Corresponding author

Correspondence to Tobias Glasmachers.

Editor information

Editors and Affiliations

Inria Saclay, Palaiseau, France
Anne Auger
University of Coimbra, Coimbra, Portugal
Carlos M. Fonseca
University of Coimbra, Coimbra, Portugal
Nuno Lourenço
University of Coimbra, Coimbra, Portugal
Penousal Machado
University of Coimbra, Coimbra, Portugal
Luís Paquete
Colorado State University, Fort Collins, Colorado, USA
Darrell Whitley


About this paper

Cite this paper

Müller, N., Glasmachers, T. (2018). Challenges in High-Dimensional Reinforcement Learning with Evolution Strategies. In: Auger, A., Fonseca, C., Lourenço, N., Machado, P., Paquete, L., Whitley, D. (eds) Parallel Problem Solving from Nature – PPSN XV. PPSN 2018. Lecture Notes in Computer Science, vol 11102. Springer, Cham. https://doi.org/10.1007/978-3-319-99259-4_33


DOI: https://doi.org/10.1007/978-3-319-99259-4_33
Published: 21 August 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-99258-7
Online ISBN: 978-3-319-99259-4
eBook Packages: Computer Science, Computer Science (R0)
