Abstract
Recent advances in high-performance computing have made sampling-based motion planning methods applicable to practical robot control problems. In these methods, a graph representing the local connectivity among states is constructed from a mathematical model of the controlled system, and the motion is planned on this graph. However, an appropriate mathematical model is difficult to obtain in advance when the robot's behavior is affected by unanticipated factors. It is therefore crucial to be able to build a mathematical model from motion data gathered by monitoring the robot in operation. When these data are sparse, however, uncertainty is introduced into the model. To deal with this uncertainty, we propose a motion planning method that uses Gaussian process regression as the mathematical model. Experimental results show that satisfactory robot motion can be achieved using limited data.
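The core idea, using a Gaussian process as the learned transition model so that sparse data yields an explicit uncertainty estimate, can be illustrated with a minimal sketch. This is not the paper's implementation; the squared-exponential kernel, the 1-D toy data, and all hyperparameter values are assumptions for illustration.

```python
import numpy as np

def se_kernel(a, b, length=1.0, var=1.0):
    # Squared-exponential covariance between the row vectors of a and b.
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return var * np.exp(-0.5 * d2 / length ** 2)

def gp_predict(X, y, X_star, noise=1e-2):
    # GP posterior mean and per-point variance at test inputs X_star.
    K = se_kernel(X, X) + noise * np.eye(len(X))
    K_s = se_kernel(X_star, X)
    K_ss = se_kernel(X_star, X_star)
    mean = K_s @ np.linalg.solve(K, y)
    cov = K_ss - K_s @ np.linalg.solve(K, K_s.T)
    return mean, np.diag(cov)

# Sparse "motion data": the predictive variance grows away from the
# samples, giving a planner a confidence measure for each transition.
X = np.array([[0.0], [1.0], [2.0]])
y = np.sin(X).ravel()
mean, var = gp_predict(X, y, np.array([[1.0], [5.0]]))
```

Near a training point the posterior variance is small; far from the data (here at 5.0) it reverts toward the prior, which is exactly the signal a confidence-based roadmap can exploit.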
Notes
The accumulated cost becomes -1 for any control policy if \(\gamma =1\) under our assumption since the system reaches the goal with probability 1 after an infinite number of time steps. When the purpose is to obtain the optimal control policy that leads the system to reach the goal in the fewest time steps, the discount factor \(\gamma \) is usually set to less than 1.
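The role of the discount factor in this note can be sketched numerically. The cost structure below (a single cost of \(-1\) on reaching the goal, zero elsewhere) is an assumed shortest-path encoding consistent with the note, not taken from the paper.

```python
def accumulated_cost(steps_to_goal, gamma):
    # Discounted accumulated cost when a cost of -1 is received on
    # reaching the goal and all intermediate steps cost 0 (assumed
    # shortest-path encoding).
    return (gamma ** steps_to_goal) * -1.0

# With gamma = 1, every policy that eventually reaches the goal
# accumulates exactly -1, so policies cannot be distinguished.
# With gamma < 1, reaching the goal sooner gives a lower (better) cost.
```

For example, with `gamma = 0.9` a 5-step policy accumulates about \(-0.59\) while a 50-step policy accumulates about \(-0.005\), so minimization prefers the shorter path; with `gamma = 1` both accumulate \(-1\).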
This RMSE is calculated as \({\frac{1}{|\mathcal M^{p}|} \sum _{t=1}^{|\mathcal M^{p}|} \sqrt{({\varvec{s}}^{p}(t)-{\varvec{s}}(t))^{\top }({\varvec{s}}^{p}(t)-{\varvec{s}}(t))}}\).
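As defined, this quantity is the mean per-step Euclidean error between the predicted and observed state sequences. A direct transcription (array names are illustrative, not from the paper):

```python
import numpy as np

def trajectory_rmse(s_pred, s_true):
    # Mean per-step Euclidean error between a predicted and an observed
    # state sequence, each of shape (T, state_dim), matching the note's
    # definition: average over t of sqrt((s_p(t)-s(t))^T (s_p(t)-s(t))).
    diff = s_pred - s_true
    return np.sqrt((diff ** 2).sum(axis=1)).mean()
```

For instance, a two-step trajectory with per-step errors of 0 and 5 yields a value of 2.5.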
The control uncertainty is mentioned in Berg et al. (2011), and it corresponds to the uncertainty in the state transition model or inverse dynamics model.
References
Asmuth, J., & Littman, M. L. (2011). Learning is planning: Near Bayes-optimal reinforcement learning via Monte-Carlo tree search. In Uncertainty in Artificial Intelligence.
Berg, J. V. D., et al. (2011). LQG-MP: Optimized path planning for robots with motion uncertainty and imperfect state information. The International Journal of Robotics Research, 30(7), 895–913.
Bishop, C. M. (2006). Pattern recognition and machine learning. New York: Springer.
Bry, A., & Roy, N. (2011). Rapidly-exploring Random Belief Trees for motion planning under uncertainty. In 2011 IEEE international conference on robotics and automation (ICRA) (pp. 723–730).
Choset, H. M. (2005). Principles of robot motion: Theory, algorithms, and implementations. Cambridge: MIT Press.
Chrisman, L. (1992). Reinforcement learning with perceptual aliasing: The perceptual distinctions approach. In AAAI (pp. 183–188).
Dalibard, S., et al. (2013). Dynamic walking and whole-body motion planning for humanoid robots: An integrated approach. The International Journal of Robotics Research, 32(9–10), 1089–1103.
Dechter, R., & Pearl, J. (1985). Generalized best-first search strategies and the optimality of A*. Journal of the ACM (JACM), 32(3), 505–536.
Deisenroth, M. P., & Rasmussen, C. E. (2011). PILCO: A model-based and data-efficient approach to policy search. In Proceedings of the international conference on machine learning.
Dijkstra, E. W. (1959). A note on two problems in connexion with graphs. Numerische Mathematik, 1(1), 269–271.
Doya, K. (2000). Reinforcement learning in continuous time and space. Neural Computation, 12(1), 219–245.
Foster, L., et al. (2009). Stable and efficient Gaussian Process calculations. Journal of Machine Learning Research, 10, 857–882.
Grondman, I., et al. (2012). Efficient model learning methods for actor critic control. IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), 42(3), 591–602.
Hinton, G. E. (2002). Training products of experts by minimizing contrastive divergence. Neural Computation, 14(8), 1771–1800.
Ijspeert, A. J. (2008). Central pattern generators for locomotion control in animals and robots: A review. Neural Networks, 21(4), 642–653.
Indyk, P., & Motwani, R. (1998). Approximate nearest neighbors: Towards removing the curse of dimensionality. In Proceedings of the thirtieth annual ACM symposium on theory of computing, STOC ’98 (pp. 604–613). New York: ACM.
Kavraki, L. E., et al. (1996). Probabilistic roadmaps for path planning in high-dimensional configuration spaces. IEEE Transactions on Robotics and Automation, 12(4), 566–580.
Ko, J., & Fox, D. (2009). GP-BayesFilters: Bayesian filtering using Gaussian process prediction and observation models. Autonomous Robots, 27, 75–90.
Kuwata, Y., et al. (2009). Real-time motion planning with applications to autonomous urban driving. IEEE Transactions on Control Systems Technology, 17(5), 1105–1118.
LaValle, S. M. (2006). Planning algorithms. Cambridge: Cambridge University Press.
LaValle, S. M., & Kuffner, J. J. (2001). Randomized kinodynamic planning. The International Journal of Robotics Research, 20(5), 378–400.
Lawrence, N. D. (2004). Gaussian process latent variable models for visualisation of high dimensional data. Advances in Neural Information Processing Systems, 16, 329–336.
Marco, A., et al. (2016). Automatic LQR tuning based on Gaussian process global optimization. In 2016 IEEE international conference on robotics and automation (ICRA), (pp. 270–277).
Mukadam, M., et al. (2016). Gaussian process motion planning. In 2016 IEEE international conference on robotics and automation (ICRA) (pp. 9–15).
Okadome, Y., et al. (2013). Fast approximation method for Gaussian process regression using hash function for non-uniformly distributed data. In Artificial neural networks and machine learning (pp. 17–25).
Okadome, Y., et al. (2014). Confidence-based roadmap using Gaussian process regression for a robot control. In 2014 IEEE/RSJ international conference on intelligent robots and systems (IROS 2014) (pp. 661–666).
Peters, J., & Schaal, S. (2008). Reinforcement learning of motor skills with policy gradients. Neural Networks, 21(4), 682–697.
Prentice, S., & Roy, N. (2009). The belief roadmap: Efficient planning in belief space by factoring the covariance. The International Journal of Robotics Research, 28(11–12), 1448–1465.
Rasmussen, C. E., & Williams, C. K. I. (2005). Gaussian processes for machine learning. Cambridge: MIT Press.
Ross, S., et al. (2011). A Bayesian approach for learning and planning in partially observable Markov decision processes. The Journal of Machine Learning Research, 12, 1729–1770.
Spaan, M. T. J., & Vlassis, N. (2004). A point-based POMDP algorithm for robot planning. In IEEE international conference on robotics and automation (Vol. 3, pp. 2399–2404). IEEE.
Sutton, R. S., et al. (2000). Policy gradient methods for reinforcement learning with function approximation. In Advances in neural information processing systems (Vol. 12, pp. 1057–1063).
Theodorou, E., et al. (2010). Reinforcement learning of motor skills in high dimensions: A path integral approach. In 2010 IEEE international conference on robotics and automation (ICRA) (pp. 2397–2403).
Thrun, S. B. (1992). Efficient exploration in reinforcement learning. Technical report, Carnegie Mellon University, Pittsburgh.
Todorov, E., & Li, W. (2005). A generalized iterative LQG method for locally-optimal feedback control of constrained nonlinear stochastic systems. In Proceedings of the 2005 American control conference (Vol. 1, pp. 300–306).
Wang, J. M., et al. (2008). Gaussian process dynamical models for human motion. IEEE Transactions on Pattern Analysis and Machine Intelligence, 30(2), 283–298.
Acknowledgments
This work was partly supported by Grant-in-Aid for Young Scientists 14444719 and for JSPS Fellows A15J01499.
Cite this article
Okadome, Y., Nakamura, Y. & Ishiguro, H. A confidence-based roadmap using Gaussian process regression. Auton Robot 41, 1013–1026 (2017). https://doi.org/10.1007/s10514-016-9604-y