
Improving Reinforcement Learning Pre-Training with Variational Dropout



Abstract:

Reinforcement learning has been very successful at learning control policies for robotic agents to perform various tasks, such as driving around a track, navigating a maze, and bipedal locomotion. One significant drawback of reinforcement learning methods is that they require a large number of data points to learn good policies, a trait known as poor data efficiency or poor sample efficiency. One approach to improving sample efficiency is supervised pre-training of policies to directly clone the behavior of an expert, but this suffers from poor generalization far from the training data. We propose to improve this by using Gaussian dropout networks with a regularization term based on variational inference in the pre-training step. We show that this initializes policy parameters to significantly better values than standard supervised learning or random initialization, greatly reducing sample complexity compared with state-of-the-art methods and enabling an RL algorithm to learn optimal policies for high-dimensional continuous control problems in a practical time frame.
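
The pre-training step described here amounts to behavior cloning with a policy network that uses Gaussian (variational) dropout, where the variational-inference-based regularizer penalizes the KL divergence between the noise posterior and its prior. The abstract does not give implementation details, so the following is a minimal PyTorch sketch under stated assumptions: the MLP sizes, the KL weight `beta`, the per-unit noise parameterization, and the Molchanov-et-al.-style KL approximation are all illustrative choices, not details taken from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GaussianDropout(nn.Module):
    """Multiplicative Gaussian noise w ~ N(1, alpha) with a learnable
    per-unit noise level alpha (stored as log alpha for stability)."""
    def __init__(self, size, init_log_alpha=-3.0):
        super().__init__()
        self.log_alpha = nn.Parameter(torch.full((size,), init_log_alpha))

    def forward(self, x):
        if self.training:
            eps = torch.randn_like(x)
            # Reparameterized multiplicative noise: x * (1 + sqrt(alpha) * eps)
            x = x * (1.0 + self.log_alpha.exp().sqrt() * eps)
        return x

    def kl(self):
        # Approximate KL(q || p) for Gaussian dropout in the style of
        # Molchanov et al. (2017); an assumption, since the paper's exact
        # regularizer is not given in the abstract.
        k1, k2, k3 = 0.63576, 1.87320, 1.48695
        la = self.log_alpha
        neg_kl = k1 * torch.sigmoid(k2 + k3 * la) - 0.5 * F.softplus(-la) - k1
        return -neg_kl.sum()

class Policy(nn.Module):
    """Small MLP policy for continuous control, with Gaussian dropout
    after each hidden layer."""
    def __init__(self, obs_dim, act_dim, hidden=64):
        super().__init__()
        self.fc1, self.do1 = nn.Linear(obs_dim, hidden), GaussianDropout(hidden)
        self.fc2, self.do2 = nn.Linear(hidden, hidden), GaussianDropout(hidden)
        self.out = nn.Linear(hidden, act_dim)

    def forward(self, obs):
        h = self.do1(torch.tanh(self.fc1(obs)))
        h = self.do2(torch.tanh(self.fc2(h)))
        return self.out(h)

    def kl(self):
        return self.do1.kl() + self.do2.kl()

def pretrain(policy, expert_obs, expert_act, beta=1e-3, epochs=100, lr=1e-3):
    """Behavior cloning: regress expert actions, plus the variational
    KL term that regularizes the learned dropout rates."""
    opt = torch.optim.Adam(policy.parameters(), lr=lr)
    for _ in range(epochs):
        loss = ((policy(expert_obs) - expert_act) ** 2).mean() \
               + beta * policy.kl() / len(expert_obs)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return policy  # these parameters then initialize the RL phase
```

After pre-training on expert (observation, action) pairs, the resulting parameters would serve as the initialization for the subsequent RL phase, which is where the claimed reduction in sample complexity is realized.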
Date of Conference: 01-05 October 2018
Date Added to IEEE Xplore: 06 January 2019

Publisher: IEEE
Conference Location: Madrid, Spain
