Abstract:
Mobile robots are commonly used for missions such as target search and security surveillance in unknown environments, where an exact mathematical model may not be available. In this paper, we formulate the problem of mobile robot path planning in unknown environments as a nondeterministic Markov Decision Process (MDP) and provide a model-free reinforcement learning solution in which a modified Q-learning algorithm uses combined ε-greedy and Boltzmann exploration. We validate the proposed algorithm in simulation, compare its learning process with that of the original Q-learning algorithm, and analyze the effect of the discount factor on the learning results. Simulations show that, given appropriate values of the discount factor, the proposed algorithm generates the shortest path, i.e., the path that maximizes the accumulated reward, in environments with the nondeterministic Markov property.
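The abstract's core idea, combining ε-greedy and Boltzmann (softmax) exploration within tabular Q-learning, can be sketched as follows. The exact combination rule, the function names, and all parameter values here are assumptions for illustration; the paper only states that the two strategies are combined.

```python
import numpy as np

def select_action(q_values, epsilon=0.1, temperature=1.0, rng=None):
    """Combined exploration (assumed rule): with probability epsilon,
    sample an action from a Boltzmann (softmax) distribution over the
    Q-values; otherwise exploit greedily."""
    rng = rng or np.random.default_rng()
    q = np.asarray(q_values, dtype=float)
    if rng.random() < epsilon:
        # Boltzmann exploration: softmax over Q-values,
        # shifted by the max for numerical stability.
        prefs = (q - q.max()) / temperature
        probs = np.exp(prefs) / np.exp(prefs).sum()
        return int(rng.choice(len(q), p=probs))
    return int(np.argmax(q))  # greedy exploitation

def q_update(q_table, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """Standard Q-learning update:
    Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a)).
    gamma is the discount factor whose effect the paper analyzes."""
    td_target = r + gamma * q_table[s_next].max()
    q_table[s, a] += alpha * (td_target - q_table[s, a])
```

Compared with pure ε-greedy, the Boltzmann component biases exploratory moves toward actions with higher estimated value rather than choosing among them uniformly, which can speed up learning in nondeterministic environments.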
Published in: 2018 5th IEEE International Conference on Cloud Computing and Intelligence Systems (CCIS)
Date of Conference: 23-25 November 2018
Date Added to IEEE Xplore: 14 April 2019
ISBN Information: