GECCO '22 Conference Proceedings · Poster
DOI: 10.1145/3520304.3528937

Scalable evolutionary hierarchical reinforcement learning

Published: 19 July 2022

Abstract

This paper investigates a novel method combining Scalable Evolution Strategies (S-ES) and Hierarchical Reinforcement Learning (HRL). S-ES, named for its excellent scalability, was popularised when it demonstrated performance comparable to state-of-the-art policy gradient methods. However, S-ES has not been tested in conjunction with HRL methods, which enable temporal abstraction and thus allow agents to tackle more challenging problems. We introduce a novel method merging S-ES and HRL, which yields a highly scalable and compute-time-efficient algorithm. We demonstrate that the proposed method benefits from S-ES's scalability and indifference to delayed rewards, resulting in our main contribution: significantly higher learning speed and competitive performance compared to gradient-based HRL methods across a range of tasks.
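The S-ES half of the proposed combination builds on the evolution-strategies gradient estimator popularised by Salimans et al. (2017), typically paired with mirrored (antithetic) sampling and rank-based fitness shaping. As a rough illustration of that building block only, here is a minimal single-process sketch of one ES update; the function and parameter names (`es_step`, `pop_size`, `sigma`, `lr`) are illustrative, and the actual S-ES algorithm distributes the fitness evaluations across many parallel workers, which this sketch omits:

```python
import numpy as np

def es_step(theta, fitness_fn, pop_size=50, sigma=0.1, lr=0.05, rng=None):
    """One Evolution Strategies update in the style of Salimans et al. (2017).

    theta      -- current parameter vector (1-D numpy array)
    fitness_fn -- maps a parameter vector to a scalar fitness (higher is better)
    """
    if rng is None:
        rng = np.random.default_rng()
    half = pop_size // 2
    eps = rng.standard_normal((half, theta.size))
    eps = np.concatenate([eps, -eps])        # mirrored (antithetic) sampling
    # Evaluate each perturbed parameter vector (parallelised in real S-ES).
    fitness = np.array([fitness_fn(theta + sigma * e) for e in eps])
    # Rank-based fitness shaping: centred ranks in [-0.5, 0.5] for robustness
    # to fitness outliers and indifference to reward scale.
    ranks = fitness.argsort().argsort()
    shaped = ranks / (len(ranks) - 1) - 0.5
    # Stochastic estimate of the gradient of expected fitness w.r.t. theta.
    grad = shaped @ eps / (pop_size * sigma)
    return theta + lr * grad
```

For example, repeatedly applying `es_step` with `fitness_fn = lambda x: -np.sum((x - target) ** 2)` drives `theta` toward `target` without ever computing an analytic gradient, which is why ES is indifferent to delayed or non-differentiable rewards.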


Cited By

• (2024) Leveraging More of Biology in Evolutionary Reinforcement Learning. Applications of Evolutionary Computation, 10.1007/978-3-031-56855-8_6, pp. 91–114. Online publication date: 3 Mar 2024.
• (2023) Evolutionary Computation and the Reinforcement Learning Problem. Handbook of Evolutionary Machine Learning, 10.1007/978-981-99-3814-8_4, pp. 79–118. Online publication date: 2 Nov 2023.


Published In

GECCO '22: Proceedings of the Genetic and Evolutionary Computation Conference Companion
July 2022, 2395 pages
ISBN: 9781450392686
DOI: 10.1145/3520304
        Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery, New York, NY, United States

Author Tags

• evolution strategies
• hierarchical reinforcement learning

Qualifiers

• Poster

Acceptance Rates

Overall acceptance rate: 1,669 of 4,410 submissions, 38%

