Abstract
The evolutionary reinforcement learning (ERL) algorithm is a hybrid that combines evolutionary computation and reinforcement learning. By exchanging information between the population and the agent, ERL can effectively handle a range of challenging control tasks. However, on problems with complex reward structures, both deep reinforcement learning and ERL algorithms easily get stuck in local optima because the reward function is deceptive. To address this problem, we integrate novelty search into the ERL framework, guiding the agent or population toward regions of the state space that have rarely or never been visited. Five robot locomotion continuous optimization problems were employed as benchmarks. Simulation results show that the proposed algorithm outperformed its competitors in most tested environments.
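Novelty search typically scores a policy by the mean distance of its behavior characterization to its k nearest neighbors in an archive of previously seen behaviors, so rarely visited behaviors receive high scores. A minimal sketch of that scoring step follows; the behavior descriptor (here a 2-D final position), the Euclidean distance metric, and the value of k are illustrative assumptions, not the paper's exact choices:

```python
import numpy as np

def novelty_score(behavior, archive, k=10):
    """Mean Euclidean distance from `behavior` to its k nearest
    neighbors in `archive`; higher means rarer (more novel)."""
    archive = np.asarray(archive, dtype=float)
    if archive.size == 0:
        return float("inf")  # nothing seen yet: maximally novel
    dists = np.linalg.norm(archive - np.asarray(behavior, dtype=float), axis=1)
    k = min(k, len(dists))
    return float(np.sort(dists)[:k].mean())

# Example: a behavior far from everything in the archive scores higher.
archive = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
near = novelty_score([0.0, 0.0], archive, k=2)  # close to archived behaviors
far = novelty_score([5.0, 5.0], archive, k=2)   # far from all of them
```

In an ERL loop, such a score can replace or be blended with the environment reward when selecting parents, which is what lets selection pressure escape a deceptive reward gradient.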
Acknowledgements
This research was supported in part by the NSF of China (Grant Nos. 62073300, U1911205, 62076225) and by the Hubei Key Laboratory of Intelligent Geo-Information Processing, China University of Geosciences, Wuhan 430074, China.
Cite this article
Hu, C., Qiao, R., Gong, W. et al. A novelty-search-based evolutionary reinforcement learning algorithm for continuous optimization problems. Memetic Comp. 14, 451–460 (2022). https://doi.org/10.1007/s12293-022-00375-8