GECCO '22 Conference Proceedings · Poster
DOI: 10.1145/3520304.3528937

Scalable evolutionary hierarchical reinforcement learning

Published: 19 July 2022

Abstract

This paper investigates a novel method combining Scalable Evolution Strategies (S-ES) and Hierarchical Reinforcement Learning (HRL). S-ES, named for its excellent scalability, was popularised when it demonstrated performance comparable to state-of-the-art policy gradient methods. However, S-ES has not been tested in conjunction with HRL methods, which enable temporal abstraction and thus allow agents to tackle more challenging problems. We introduce a novel method merging S-ES and HRL, which yields a highly scalable and compute-time-efficient algorithm. We demonstrate that the proposed method benefits from S-ES's scalability and indifference to delayed rewards, resulting in our main contribution: significantly higher learning speed and competitive performance compared to gradient-based HRL methods across a range of tasks.
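The S-ES half of the proposed combination builds on the evolution-strategies gradient estimator popularised by Salimans et al. (2017), typically paired with mirrored (antithetic) sampling and rank-based fitness shaping. As a rough illustration of that building block only, here is a minimal single-process sketch of one ES update; the function and parameter names (`es_step`, `pop_size`, `sigma`, `lr`) are illustrative, and the actual S-ES algorithm distributes the fitness evaluations across many parallel workers, which this sketch omits:

```python
import numpy as np

def es_step(theta, fitness_fn, pop_size=50, sigma=0.1, lr=0.05, rng=None):
    """One Evolution Strategies update in the style of Salimans et al. (2017).

    theta      -- current parameter vector (1-D numpy array)
    fitness_fn -- maps a parameter vector to a scalar fitness (higher is better)
    """
    if rng is None:
        rng = np.random.default_rng()
    half = pop_size // 2
    eps = rng.standard_normal((half, theta.size))
    eps = np.concatenate([eps, -eps])        # mirrored (antithetic) sampling
    # Evaluate each perturbed parameter vector (parallelised in real S-ES).
    fitness = np.array([fitness_fn(theta + sigma * e) for e in eps])
    # Rank-based fitness shaping: centred ranks in [-0.5, 0.5] for robustness
    # to fitness outliers and indifference to reward scale.
    ranks = fitness.argsort().argsort()
    shaped = ranks / (len(ranks) - 1) - 0.5
    # Stochastic estimate of the gradient of expected fitness w.r.t. theta.
    grad = shaped @ eps / (pop_size * sigma)
    return theta + lr * grad
```

For example, repeatedly applying `es_step` with `fitness_fn = lambda x: -np.sum((x - target) ** 2)` drives `theta` toward `target` without ever computing an analytic gradient, which is why ES is indifferent to delayed or non-differentiable rewards.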


Cited By

• (2024) Leveraging More of Biology in Evolutionary Reinforcement Learning. Applications of Evolutionary Computation, 10.1007/978-3-031-56855-8_6, pp. 91–114. Online publication date: 3 Mar 2024.
• (2023) Evolutionary Computation and the Reinforcement Learning Problem. Handbook of Evolutionary Machine Learning, 10.1007/978-981-99-3814-8_4, pp. 79–118. Online publication date: 2 Nov 2023.


Published In

GECCO '22: Proceedings of the Genetic and Evolutionary Computation Conference Companion
July 2022, 2395 pages
ISBN: 9781450392686
DOI: 10.1145/3520304
        Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery, New York, NY, United States

Author Tags

• evolution strategies
• hierarchical reinforcement learning

Qualifiers

• Poster

Acceptance Rates

Overall acceptance rate: 1,669 of 4,410 submissions, 38%

