Abstract
Reinforcement learning has achieved exceptional performance over the last decade, yet its application to robotics and control remains an area for deeper investigation because of the challenges these settings pose: high-dimensional continuous state and action spaces, complicated system dynamics, and physical constraints. In this paper, we present a pioneering experiment in applying an existing model-based RL framework, PILCO, to a time-optimal control problem. The algorithm first models the system dynamics with Gaussian processes, which reduces the effect of model bias. Policy evaluation is then carried out through iterated prediction with Gaussian posteriors and deterministic approximate inference, and analytic gradients are used for policy improvement. We document a simulation and a real-world experiment in which an autonomous car completes a rest-to-rest linear locomotion task. The simulation results demonstrate the time-optimality and data efficiency of the approach, and the experiment shows that learning under real-world conditions is feasible with our methodology.
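To make the pipeline above concrete, the sketch below illustrates a PILCO-style learning loop in Python. It is a minimal illustration under stated assumptions, not the authors' implementation: the 1-D double-integrator "car", the saturated linear controller, the quadratic cost, and all numeric constants are hypothetical, and two of PILCO's key ingredients are simplified for brevity, with Monte Carlo rollouts through the GP posterior standing in for deterministic moment matching, and finite differences standing in for analytic policy gradients.

# A minimal, illustrative sketch of a PILCO-style loop, NOT the authors' code.
# Assumptions: a hypothetical 1-D double-integrator "car", a saturated linear
# state-feedback controller, and invented constants. For brevity, PILCO's
# deterministic moment matching and analytic gradients are replaced here by
# Monte Carlo rollouts through the GP posterior and finite differences.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)
DT, HORIZON = 0.1, 30
TARGET = np.array([1.0, 0.0])   # rest-to-rest: reach position 1 at velocity 0

def true_dynamics(x, u):
    # The real plant (unknown to the learner): position/velocity integrator.
    return np.array([x[0] + DT * x[1], x[1] + DT * u])

def policy(x, theta):
    # Saturated linear state-feedback controller with parameters theta.
    return float(np.clip(theta @ (x - TARGET), -2.0, 2.0))

def fit_model(X, Y):
    # One GP per state dimension, as in PILCO's dynamics model.
    kernel = RBF(length_scale=[1.0, 1.0, 1.0]) + WhiteKernel(noise_level=1e-2)
    return [GaussianProcessRegressor(kernel=kernel, normalize_y=True)
            .fit(X, Y[:, d]) for d in range(Y.shape[1])]

def sample_step(gps, x, u):
    # Draw the next state from each GP's marginal posterior (a Monte Carlo
    # stand-in for PILCO's analytic propagation of Gaussian state beliefs).
    q = np.r_[x, u][None, :]
    out = np.empty(len(gps))
    for d, gp in enumerate(gps):
        m, s = gp.predict(q, return_std=True)
        out[d] = m[0] + s[0] * rng.standard_normal()
    return out

def expected_cost(gps, theta, n_rollouts=8):
    # Policy evaluation on the learned model, never on the real system.
    total = 0.0
    for _ in range(n_rollouts):
        x = np.zeros(2)
        for _ in range(HORIZON):
            x = sample_step(gps, x, policy(x, theta))
            total += np.sum((x - TARGET) ** 2)
    return total / n_rollouts

def improve_policy(gps, theta, lr=0.05, iters=15, eps=1e-2):
    # Finite-difference policy search (the paper uses analytic gradients).
    for _ in range(iters):
        grad = np.zeros_like(theta)
        for i in range(theta.size):
            e = np.zeros_like(theta)
            e[i] = eps
            grad[i] = (expected_cost(gps, theta + e)
                       - expected_cost(gps, theta - e)) / (2 * eps)
        theta = theta - lr * grad / (np.linalg.norm(grad) + 1e-8)
    return theta

# Main loop: act on the real system, refit the GP model, improve the policy.
theta, X, Y = 0.1 * rng.standard_normal(2), [], []
for episode in range(5):
    x = np.zeros(2)
    for _ in range(HORIZON):
        u = policy(x, theta) + 0.1 * rng.standard_normal()  # mild exploration
        x_next = true_dynamics(x, u)
        X.append(np.r_[x, u])
        Y.append(x_next)
        x = x_next
    gps = fit_model(np.array(X), np.array(Y))
    theta = improve_policy(gps, theta)
    print(f"episode {episode}: final state pos={x[0]:+.3f}, vel={x[1]:+.3f}")

In the actual framework, propagating full Gaussian state distributions analytically rather than by sampling is what makes the long-term predictions deterministic and the policy gradients available in closed form, which in turn accounts for PILCO's data efficiency.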
References
Deisenroth, M., Rasmussen, C.: PILCO: a model-based and data-efficient approach to policy search. In: Proceedings of the International Conference on Machine Learning (2011)
Shin, K., McKay, N.: A dynamic programming approach to trajectory planning of robotic manipulators. IEEE Trans. Autom. Control 31(6), 491–500 (1986)
Verscheure, D., Demeulenaere, B., Swevers, J., De Schutter, J., Diehl, M.: Time-optimal path tracking for robots: a convex optimization approach. IEEE Trans. Autom. Control 54(10), 2318–2327 (2009)
Shin, K., McKay, N.: Minimum-time control of robotic manipulators with geometric path constraints. IEEE Trans. Autom. Control 30(6), 531–541 (1985)
Lamiraux, F., Laumond, J.: From paths to trajectories for multibody mobile robots. In: Proceedings of the 5th International Symposium on Experimental Robotics, pp. 301–309 (1998)
Polydoros, A.S., Nalpantidis, L.: Survey of model-based reinforcement learning: applications on robotics. J. Intell. Rob. Syst. 86(2), 153–173 (2017)
Rasmussen, C., Kuss, M.: Gaussian processes in reinforcement learning. In: Advances in Neural Information Processing Systems, vol. 16, pp. 751–759 (2004)
Mayne, D.Q., Rawlings, J.B., Rao, C.V., Scokaert, P.O.: Constrained model predictive control: stability and optimality. Automatica 36, 789–814 (2000)
Acknowledgement
We thank Dr. Marc Peter Deisenroth for his kind help in implementing the PILCO algorithm in our project. His advice on handling system constraints was very useful to us.
Cite this paper
Liao, H.C., Liu, J.S. (2019). A Model-Based Reinforcement Learning Approach to Time-Optimal Control Problems. In: Wotawa, F., Friedrich, G., Pill, I., Koitz-Hristov, R., Ali, M. (eds.) Advances and Trends in Artificial Intelligence. From Theory to Practice. IEA/AIE 2019. Lecture Notes in Computer Science, vol. 11606. Springer, Cham. https://doi.org/10.1007/978-3-030-22999-3_56