Abstract:
Reinforcement learning allows agents to learn intelligent behaviors by trial and error, much as human beings do. However, when the learning task becomes difficult, defining the reward function becomes a critical issue. Inverse reinforcement learning addresses this by constructing a reward function that imitates the interaction between an expert and the environment. In this paper, an Adaboost-like inverse reinforcement learning method is proposed. The method uses an Adaboost classifier and upper confidence bounds to generate the reward function for a complex task. During imitation, the agent continuously compares its own behavior with the expert's, and the Adaboost classifier uses the difference to assign a specific weight to each state. This weight is combined with a state confidence, computed from upper confidence bounds, to form an approximate reward function. Finally, a simulated maze environment is used to demonstrate that the proposed method can reduce computation time.
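The abstract does not give the update rules, so the following is only a minimal sketch of how a per-state weight and an upper-confidence term might be combined into a reward. Everything specific in it is an assumption rather than the paper's method: the function names are hypothetical, the reweighting is the standard AdaBoost exponential update applied to states where the agent and expert differ, the confidence term is the standard UCB1 bonus, and the two are combined by a simple product.

```python
import math


def adaboost_like_weights(weights, mismatched):
    """One AdaBoost-style reweighting round over states (assumed form).

    weights: dict mapping state -> current weight, summing to 1
    mismatched: set of states where the agent differs from the expert
    """
    # Weighted error is the total weight on mismatched states,
    # clipped away from 0 and 1 so the logarithm stays finite.
    err = sum(w for s, w in weights.items() if s in mismatched)
    err = min(max(err, 1e-9), 1 - 1e-9)
    alpha = 0.5 * math.log((1 - err) / err)
    # Standard AdaBoost update: scale mismatched states by exp(alpha),
    # all other states by exp(-alpha), then renormalize.
    updated = {
        s: w * math.exp(alpha if s in mismatched else -alpha)
        for s, w in weights.items()
    }
    total = sum(updated.values())
    return {s: w / total for s, w in updated.items()}


def ucb_confidence(visits, total_visits):
    """UCB1-style confidence term; it shrinks as a state is visited more."""
    return math.sqrt(2 * math.log(max(total_visits, 1)) / max(visits, 1))


def approximate_reward(weights, visit_counts):
    """Combine weight and confidence by a product (an assumption)."""
    total = sum(visit_counts.values())
    return {
        s: w * ucb_confidence(visit_counts.get(s, 0), total)
        for s, w in weights.items()
    }


# Toy example: four maze states, agent disagrees with the expert on "s2".
weights = {s: 0.25 for s in ("s0", "s1", "s2", "s3")}
weights = adaboost_like_weights(weights, mismatched={"s2"})
visit_counts = {"s0": 10, "s1": 5, "s2": 1, "s3": 4}
reward = approximate_reward(weights, visit_counts)
```

In this toy run the mismatched state ends up holding half of the total weight after one round, which is the usual AdaBoost behavior; whether the paper raises or lowers the weight of such states is not stated in the abstract.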
Date of Conference: 24-29 July 2016
Date Added to IEEE Xplore: 10 November 2016