Abstract:
In deep reinforcement learning, convergence is difficult when exploration is insufficient or rewards are sparse. Moreover, on specific tasks, the amount of exploration may be limited. It is therefore considered effective to learn on source tasks beforehand in order to promote learning on the target tasks. Existing research has proposed pretraining methods that learn parameters enabling fast learning on multiple tasks. However, these methods are still limited by several problems, such as sparse rewards, sample deviation, and dependence on initial parameters. In this research, we propose a pretraining method that combines an evolutionary algorithm with a policy gradient method to train a model that works well on a variety of target tasks and addresses the above problems. In this method, agents with a diverse set of neural networks explore multiple environments to train a general model. In the experiments, we assume multiple 3D control source tasks. After training the model with our method on the source tasks, we show how effective the model is on the 3D control target tasks.
Published in: 2019 IEEE International Conference on Big Data, Cloud Computing, Data Science & Engineering (BCD)
Date of Conference: 29-31 May 2019
Date Added to IEEE Xplore: 31 October 2019