Abstract:
While searching for solutions to Reinforcement Learning problems, Policy Search algorithms crawl through a search space to efficiently find a successful policy. In contrast to Deep Learning methods, which rely on hundreds of trials to tune weights, such data-efficient methods rely on very few trials to find a solution. In this work we introduce a variant of Black-box Data Efficient Policy Search for Robotics (Black-DROPS), which originally relied on Gaussian Processes to learn a dynamics model. Our algorithm, called Black-DROPS-SGP, adopts a more computationally efficient variant known as Sparse Gaussian Processes, and we demonstrate the advantages of our method through a comparison with Deep PILCO and Black-DROPS, two highly data-efficient Policy Search algorithms. We use cart-pole systems with one, two and three rotating joints to assess how our method fares as complexity increases, and our results show that it finds the optimal solution in virtually the same number of trials as the current state of the art, while requiring much less computational time between iterations. In fields where trials are expensive (e.g. drug discovery, robotics), a higher number of trials can lead to broken robots before new behaviors are learnt, or too many patients injected with an ineffective drug before a cure is found.
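The key idea behind Sparse Gaussian Processes is to condition the model on a small set of m inducing points rather than all N observations, cutting the cost of a prediction from O(N^3) to O(N m^2). The sketch below is not the authors' implementation; it is a minimal illustration of one classic sparse approximation (Subset of Regressors with inducing inputs `Z`), with an RBF kernel and all names and hyperparameters chosen for the example.

```python
import numpy as np

def rbf(A, B, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel between row-vector sets A and B."""
    sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return variance * np.exp(-0.5 * sq_dists / lengthscale**2)

def sparse_gp_predict(X, y, Z, X_star, noise=1e-2):
    """Subset-of-Regressors sparse GP predictive mean.

    X: (N, d) training inputs, y: (N,) targets,
    Z: (m, d) inducing inputs with m << N, X_star: test inputs.
    Cost is O(N m^2) instead of the O(N^3) of an exact GP.
    """
    Kzz = rbf(Z, Z) + 1e-6 * np.eye(len(Z))   # jitter for stability
    Kzx = rbf(Z, X)
    Ksz = rbf(X_star, Z)
    # mu = K_{*z} (sigma^2 K_zz + K_zx K_xz)^{-1} K_zx y
    A = noise * Kzz + Kzx @ Kzx.T
    return Ksz @ np.linalg.solve(A, Kzx @ y)

# Toy dynamics-style regression: 200 observations, only 20 inducing points.
X = np.linspace(0.0, 6.0, 200)[:, None]
y = np.sin(X[:, 0])
Z = np.linspace(0.0, 6.0, 20)[:, None]
mean = sparse_gp_predict(X, y, Z, np.array([[1.0]]))
```

In Black-DROPS-style model-based policy search, a model of this kind would be refit to the growing transition dataset after every trial, which is where the reduced per-iteration cost matters.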
Date of Conference: 06-10 December 2021
Date Added to IEEE Xplore: 05 January 2022