
A novel DDPG method with prioritized experience replay


Abstract:

Recently, a state-of-the-art algorithm called deep deterministic policy gradient (DDPG) has achieved good performance on many continuous control tasks in the MuJoCo simulator. To further improve the efficiency of the experience replay mechanism in DDPG and thus speed up training, this paper proposes a prioritized experience replay method for the DDPG algorithm, in which prioritized sampling is adopted instead of uniform sampling. The proposed DDPG with prioritized experience replay is tested on an inverted pendulum task via OpenAI Gym. The experimental results show that DDPG with prioritized experience replay reduces training time, improves the stability of the training process, and is less sensitive to changes in hyperparameters such as the replay buffer size, the minibatch size, and the update rate of the target network.
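
The abstract does not give the paper's implementation details, so the following is a minimal illustrative sketch (Python with NumPy) of proportional prioritized experience replay in the style of Schaul et al., the standard scheme that replaces uniform sampling with priority-weighted sampling; the class name, the alpha/beta exponents, and the TD-error-based priority below are assumptions, not the authors' code.

    import numpy as np

    class PrioritizedReplayBuffer:
        """Sketch of a proportional prioritized replay buffer (assumed scheme)."""

        def __init__(self, capacity, alpha=0.6):
            self.capacity = capacity      # maximum number of stored transitions
            self.alpha = alpha            # how strongly priorities bias sampling
            self.storage = []             # (state, action, reward, next_state, done)
            self.priorities = np.zeros(capacity, dtype=np.float64)
            self.pos = 0                  # next write index (ring buffer)

        def add(self, transition):
            # New transitions get the current max priority so they are replayed
            # at least once before their TD error is known.
            max_prio = self.priorities.max() if self.storage else 1.0
            if len(self.storage) < self.capacity:
                self.storage.append(transition)
            else:
                self.storage[self.pos] = transition
            self.priorities[self.pos] = max_prio
            self.pos = (self.pos + 1) % self.capacity

        def sample(self, batch_size, beta=0.4):
            prios = self.priorities[:len(self.storage)]
            probs = prios ** self.alpha
            probs /= probs.sum()
            idx = np.random.choice(len(self.storage), batch_size, p=probs)
            # Importance-sampling weights correct the bias that non-uniform
            # sampling introduces into the critic's gradient estimate.
            weights = (len(self.storage) * probs[idx]) ** (-beta)
            weights /= weights.max()
            batch = [self.storage[i] for i in idx]
            return batch, idx, weights

        def update_priorities(self, idx, td_errors, eps=1e-6):
            # Priority is the magnitude of the critic's TD error plus a small
            # constant so no transition's sampling probability falls to zero.
            self.priorities[idx] = np.abs(td_errors) + eps

In a DDPG training loop, the critic's TD errors for the sampled minibatch would be fed back through update_priorities after each gradient step, and the importance-sampling weights would scale the per-sample critic loss.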
Date of Conference: 05-08 October 2017
Date Added to IEEE Xplore: 30 November 2017
Conference Location: Banff, AB, Canada
