Abstract
Evolution Strategies (ESs) have recently become popular for training deep neural networks, in particular on reinforcement learning tasks, a special form of controller design. Compared to classic problems in continuous direct search, deep networks pose extremely high-dimensional optimization problems, with many thousands or even millions of variables. In addition, many control problems give rise to a stochastic fitness function. Given the relevance of this application, we study the suitability of evolution strategies for high-dimensional, stochastic problems. Our results give insights into which algorithmic mechanisms of modern ESs are of value for this class of problems, and they reveal fundamental limitations of the approach. The results are in line with our theoretical understanding of ESs. We show that combining ESs that offer reduced internal algorithm cost with uncertainty handling techniques yields promising methods for this class of problems.
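To make the setting concrete, the following is a minimal sketch of an evolution strategy on a stochastic fitness function. It is not the method studied in the paper: the noisy sphere objective, the fixed step-size decay (a crude stand-in for real step-size adaptation), the re-evaluation averaging used as a very simple form of uncertainty handling, and all function names and parameter values are assumptions chosen for illustration.

```python
import random


def noisy_sphere(x, rng, noise_sd=0.1):
    """Toy stochastic fitness: sphere function plus additive Gaussian noise."""
    return sum(v * v for v in x) + rng.gauss(0.0, noise_sd)


def averaged_fitness(x, rng, n_evals):
    """Simple uncertainty handling (assumption): average repeated evaluations."""
    return sum(noisy_sphere(x, rng) for _ in range(n_evals)) / n_evals


def simple_es(dim=10, lam=20, mu=5, sigma=0.3, n_evals=5,
              generations=200, seed=0):
    """(mu/mu, lambda)-ES with isotropic Gaussian mutations.

    The step size follows a fixed geometric decay, which is a hypothetical
    simplification; practical ESs adapt the step size online.
    """
    rng = random.Random(seed)
    mean = [1.0] * dim
    for _ in range(generations):
        offspring = []
        for _ in range(lam):
            # Sample one offspring around the current mean.
            child = [m + sigma * rng.gauss(0.0, 1.0) for m in mean]
            offspring.append((averaged_fitness(child, rng, n_evals), child))
        # Rank-based selection: keep the mu offspring with lowest fitness.
        offspring.sort(key=lambda pair: pair[0])
        parents = [child for _, child in offspring[:mu]]
        # Intermediate recombination: the new mean is the parent average.
        mean = [sum(p[i] for p in parents) / mu for i in range(dim)]
        sigma *= 0.98
    return mean


if __name__ == "__main__":
    result = simple_es()
    print("noise-free sphere value at final mean:", sum(v * v for v in result))
```

Each generation costs `lam * n_evals` fitness evaluations, so averaging trades evaluation budget for a more reliable ranking. The sketch keeps only a mean vector and a scalar step size, so its internal cost per generation is linear in the dimension, which is the kind of reduced algorithm cost that matters when the number of variables runs into the thousands or millions.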
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Müller, N., Glasmachers, T. (2018). Challenges in High-Dimensional Reinforcement Learning with Evolution Strategies. In: Auger, A., Fonseca, C., Lourenço, N., Machado, P., Paquete, L., Whitley, D. (eds) Parallel Problem Solving from Nature – PPSN XV. PPSN 2018. Lecture Notes in Computer Science, vol 11102. Springer, Cham. https://doi.org/10.1007/978-3-319-99259-4_33
DOI: https://doi.org/10.1007/978-3-319-99259-4_33
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-99258-7
Online ISBN: 978-3-319-99259-4
eBook Packages: Computer Science (R0)