Two-Stage Evolutionary Reinforcement Learning for Enhancing Exploration and Exploitation

Authors

  • Qingling Zhu, Shenzhen University
  • Xiaoqiang Wu, Shenzhen University
  • Qiuzhen Lin, Shenzhen University
  • Wei-Neng Chen, South China University of Technology

DOI:

https://doi.org/10.1609/aaai.v38i18.30079

Keywords:

SO: Evolutionary Computation, ML: Evolutionary Learning, ML: Reinforcement Learning

Abstract

The integration of Evolutionary Algorithms (EAs) and Reinforcement Learning (RL) has emerged as a promising approach for tackling some challenges in RL, such as sparse rewards, lack of exploration, and brittle convergence properties. However, existing methods often employ actor networks as the individuals of the EA, which may constrain their exploratory capabilities, as the entire actor population stops evolving once the critic network in RL falls into a local optimum. To alleviate this issue, this paper introduces a Two-stage Evolutionary Reinforcement Learning (TERL) framework that maintains a population containing both actor and critic networks. TERL divides the learning process into two stages. In the initial stage, individuals independently learn actor-critic networks, which are optimized alternately by RL and Particle Swarm Optimization (PSO). This dual optimization fosters greater exploration, curbing susceptibility to local optima. Information shared through a common replay buffer and the PSO algorithm substantially mitigates the computational load of training multiple agents. In the subsequent stage, TERL shifts to a refined exploitation phase: only the best individual undergoes further refinement, while the remaining individuals continue PSO-based optimization. This allocates more computational resources to the best individual, yielding superior performance. Empirical assessments, conducted across a range of continuous control problems, validate the efficacy of the proposed TERL paradigm.
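To make the two-stage alternation concrete, below is a minimal, illustrative Python sketch, not the paper's implementation. The toy `fitness` and `rl_step` functions are hypothetical stand-ins for environment rollouts and actor-critic gradient updates (e.g., TD3/SAC-style), the shared replay buffer is omitted, and all hyperparameters are arbitrary; only the overall structure (stage 1: every individual alternates RL and PSO updates; stage 2: RL refinement for the best individual, PSO for the rest) follows the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)

DIM = 8            # stand-in for flattened actor-critic parameters
POP_SIZE = 5
STAGE1_ITERS = 100
STAGE2_ITERS = 100

def fitness(theta):
    # Placeholder for episodic return from environment rollouts.
    return -np.sum((theta - 1.0) ** 2)

def rl_step(theta, lr=0.05):
    # Placeholder for an actor-critic RL update; here we simply
    # ascend the analytic gradient of the toy fitness.
    grad = -2.0 * (theta - 1.0)
    return theta + lr * grad

# Population of individuals, each an actor-critic parameter vector.
pos = rng.normal(size=(POP_SIZE, DIM))
vel = np.zeros_like(pos)
pbest = pos.copy()
pbest_f = np.array([fitness(p) for p in pos])

def pso_step(i, w=0.7, c1=1.5, c2=1.5):
    # Standard PSO velocity/position update toward personal and global bests.
    g = pbest[np.argmax(pbest_f)]
    r1, r2 = rng.random(DIM), rng.random(DIM)
    vel[i] = w * vel[i] + c1 * r1 * (pbest[i] - pos[i]) + c2 * r2 * (g - pos[i])
    pos[i] = pos[i] + vel[i]

def update_pbest(i):
    f = fitness(pos[i])
    if f > pbest_f[i]:
        pbest_f[i], pbest[i] = f, pos[i].copy()

# Stage 1: every individual alternates RL-style updates with PSO moves.
for t in range(STAGE1_ITERS):
    for i in range(POP_SIZE):
        pos[i] = rl_step(pos[i]) if t % 2 == 0 else pos[i]
        if t % 2 == 1:
            pso_step(i)
        update_pbest(i)

# Stage 2: only the best individual receives further RL refinement,
# concentrating compute on it; the rest continue PSO-based search.
for t in range(STAGE2_ITERS):
    best = int(np.argmax(pbest_f))
    for i in range(POP_SIZE):
        if i == best:
            pos[i] = rl_step(pos[i])
        else:
            pso_step(i)
        update_pbest(i)

print("best fitness:", pbest_f.max())
```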

Published

2024-03-24

How to Cite

Zhu, Q., Wu, X., Lin, Q., & Chen, W.-N. (2024). Two-Stage Evolutionary Reinforcement Learning for Enhancing Exploration and Exploitation. Proceedings of the AAAI Conference on Artificial Intelligence, 38(18), 20892-20900. https://doi.org/10.1609/aaai.v38i18.30079

Section

AAAI Technical Track on Search and Optimization