
A confidence-based roadmap using Gaussian process regression


Abstract

Recent advances in high-performance computing have allowed sampling-based motion planning methods to be applied successfully to practical robot control problems. In such methods, a graph representing the local connectivity among states is constructed from a mathematical model of the controlled target, and the motion is planned over this graph. However, it is difficult to obtain an appropriate mathematical model in advance when the behavior of the robot is affected by unanticipated factors. It is therefore crucial to be able to build the model from motion data gathered by monitoring the robot in operation; when these data are sparse, however, uncertainty is introduced into the model. To deal with this uncertainty, we propose a motion planning method that uses Gaussian process regression as the mathematical model. Experimental results show that satisfactory robot motion can be achieved using limited data.
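
The abstract describes learning a Gaussian process regression (GPR) model from motion data and using it as the transition model for planning. The following is a minimal, hypothetical sketch (not the authors' implementation) of the general idea: fit a GPR to observed transitions and query both the predicted next state and its predictive standard deviation, which can serve as a confidence measure when evaluating roadmap edges. The toy dynamics and all variable names are illustrative assumptions.

```python
# Minimal GPR sketch (illustrative only): predict a next state and its
# uncertainty from sparse (state, action) -> next-state data.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)

# Hypothetical 1-D toy dynamics: next state = s + 0.1 * a + small noise.
X = rng.uniform(-1.0, 1.0, size=(50, 2))          # columns: state s, action a
y = X[:, 0] + 0.1 * X[:, 1] + 0.01 * rng.standard_normal(50)

gpr = GaussianProcessRegressor(kernel=RBF() + WhiteKernel(), normalize_y=True)
gpr.fit(X, y)

# The predictive standard deviation grows where training data are sparse,
# i.e. where the model is less confident about the transition.
mean, std = gpr.predict(np.array([[0.2, -0.5]]), return_std=True)
print(f"predicted next state: {mean[0]:.3f}  (std: {std[0]:.3f})")
```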

Notes

  1. Under our assumption, the accumulated cost becomes \(-1\) for any control policy if \(\gamma =1\), since the system reaches the goal with probability 1 after an infinite number of time steps. When the purpose is to obtain the optimal control policy that leads the system to the goal in the fewest time steps, the discount factor \(\gamma \) is therefore usually set to less than 1 (a worked illustration follows these notes).

  2. This RMSE is calculated as \(\frac{1}{|\mathcal{M}^{p}|} \sum_{t=1}^{|\mathcal{M}^{p}|} \sqrt{({\boldsymbol{s}}^{p}(t)-{\boldsymbol{s}}(t))^{\top}({\boldsymbol{s}}^{p}(t)-{\boldsymbol{s}}(t))}\), i.e. the mean Euclidean distance between the predicted and measured state trajectories (a small computational sketch follows these notes).

  3. The control uncertainty is mentioned in Berg et al. (2011), and it corresponds to the uncertainty in the state transition model or the inverse dynamics model.
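
To make note 1 concrete: if we assume, purely for illustration, that a single cost of \(-1\) is incurred when the goal is reached at time step \(T\) (the paper's exact cost definition may differ), the discounted accumulated cost is \(\sum_{t=0}^{\infty} \gamma^{t} c(t) = -\gamma^{T}\). With \(\gamma = 1\) this equals \(-1\) for every policy that eventually reaches the goal, so policies cannot be distinguished; with \(\gamma < 1\) the cost \(-\gamma^{T}\) is lower for smaller \(T\), so minimizing it favors policies that reach the goal in the fewest time steps.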
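
As a small computational sketch of the error measure in note 2 (a plain restatement of the formula; function and variable names are illustrative, not the authors'):

```python
# Mean Euclidean distance between a predicted trajectory and the measured one,
# matching the RMSE expression in note 2.
import numpy as np

def trajectory_rmse(s_pred: np.ndarray, s_true: np.ndarray) -> float:
    """s_pred, s_true: arrays of shape (num_steps, state_dim)."""
    diffs = s_pred - s_true                      # per-step error vectors
    return float(np.mean(np.linalg.norm(diffs, axis=1)))

# Example usage with random stand-in trajectories.
rng = np.random.default_rng(1)
s_true = rng.standard_normal((100, 4))
s_pred = s_true + 0.05 * rng.standard_normal((100, 4))
print(trajectory_rmse(s_pred, s_true))
```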

References

  • Asmuth, J., & Littman, M. L. (2011). Learning is planning: Near Bayes-optimal reinforcement learning via Monte-Carlo tree search. In Uncertainty in Artificial Intelligence.

  • Berg, J. V. D., et al. (2011). LQG-MP: Optimized path planning for robots with motion uncertainty and imperfect state information. The International Journal of Robotics Research, 30(7), 895–913.

  • Bishop, C. M. (2006). Pattern recognition and machine learning (1st ed., corr. 2nd printing). New York: Springer.

  • Bry, A., & Roy, N. (2011). Rapidly-exploring Random Belief Trees for motion planning under uncertainty. In 2011 IEEE international conference on robotics and automation (ICRA) (pp. 723–730).

  • Choset, H. M. (2005). Principles of robot motion: Theory, algorithms, and implementations. Cambridge: MIT Press.

  • Chrisman, L. (1992). Reinforcement learning with perceptual aliasing: The perceptual distinctions approach. In AAAI (pp. 183–188).

  • Dalibard, S., et al. (2013). Dynamic walking and whole-body motion planning for humanoid robots: An integrated approach. The International Journal of Robotics Research, 32(9–10), 1089–1103.

  • Dechter, R., & Pearl, J. (1985). Generalized best-first search strategies and the optimality of A*. Journal of the ACM (JACM), 32(3), 505–536.

  • Deisenroth, M. P., & Rasmussen, C. E. (2011). PILCO: A model-based and data-efficient approach to policy search. In Proceedings of the International Conference on Machine Learning.

  • Dijkstra, E. W. (1959). A note on two problems in connexion with graphs. Numerische Mathematik, 1(1), 269–271.

  • Doya, K. (2000). Reinforcement learning in continuous time and space. Neural Computation, 12(1), 219–245.

  • Foster, L., et al. (2009). Stable and efficient Gaussian Process calculations. Journal of Machine Learning Research, 10, 857–882.

  • Grondman, I., et al. (2012). Efficient model learning methods for actor critic control. IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), 42(3), 591–602.

  • Hinton, G. E. (2002). Training products of experts by minimizing contrastive divergence. Neural Computation, 14, 1771–1800.

  • Ijspeert, A. J. (2008). Central pattern generators for locomotion control in animals and robots: A review. Neural Networks, 21(4), 642–653.

  • Indyk, P., & Motwani, R. (1998). Approximate nearest neighbors: Towards removing the curse of dimensionality. In Proceedings of the thirtieth annual ACM symposium on theory of computing, STOC ’98 (pp. 604–613). New York: ACM.

  • Kavraki, L. E., et al. (1996). Probabilistic roadmaps for path planning in high-dimensional configuration spaces. IEEE Transactions on Robotics and Automation, 12(4), 566–580.

  • Ko, J., & Fox, D. (2009). GP-BayesFilters: Bayesian filtering using Gaussian process prediction and observation models. Autonomous Robots, 27, 75–90.

  • Kuwata, Y., et al. (2009). Real-time motion planning with applications to autonomous urban driving. IEEE Transactions on Control Systems Technology, 17(5), 1105–1118.

  • LaValle, S. M. (2006). Planning algorithms. Cambridge: Cambridge University Press.

  • LaValle, S. M., & Kuffner, J. J. (2001). Randomized kinodynamic planning. The International Journal of Robotics Research, 20(5), 378–400.

  • Lawrence, N. D. (2004). Gaussian process latent variable models for visualisation of high dimensional data. Advances in Neural Information Processing Systems, 16, 329–336.

  • Marco, A., et al. (2016). Automatic LQR tuning based on Gaussian process global optimization. In 2016 IEEE international conference on robotics and automation (ICRA), (pp. 270–277).

  • Mukadam, M., et al. (2016). Gaussian process motion planning. In 2016 IEEE international conference on robotics and automation (ICRA) (pp. 9–15).

  • Okadome, Y., et al. (2013). Fast approximation method for Gaussian process regression using hash function for non-uniformly distributed data. In Artificial neural networks and machine learning (pp. 17–25).

  • Okadome, Y., et al. (2014). Confidence-based roadmap using Gaussian process regression for a robot control. In 2014 IEEE/RSJ international conference on intelligent robots and systems (IROS 2014) (pp. 661–666).

  • Peters, J., & Schaal, S. (2008). Reinforcement learning of motor skills with policy gradients. Neural Networks, 21(4), 682–697.

  • Prentice, S., & Roy, N. (2009). The belief roadmap: Efficient planning in belief space by factoring the covariance. The International Journal of Robotics Research, 28(11–12), 1448–1465.

  • Rasmussen, C. E., & Williams, C. K. I. (2005). Gaussian processes for machine learning. Cambridge: The MIT Press.

  • Ross, S., et al. (2011). A Bayesian approach for learning and planning in partially observable Markov decision processes. The Journal of Machine Learning Research, 12, 1729–1770.

  • Spaan, M. T. J., & Vlassis, N. (2004). A point-based POMDP algorithm for robot planning. In IEEE international conference on robotics and automation (Vol. 3, pp. 2399–2404). IEEE.

  • Sutton, R. S., et al. (2000). Policy gradient methods for reinforcement learning with function approximation. In Advances in neural information processing systems (Vol. 12, pp. 1057–1063).

  • Theodorou, E., et al. (2010). Reinforcement learning of motor skills in high dimensions: A path integral approach. In 2010 IEEE international conference on robotics and automation (ICRA) (pp. 2397–2403).

  • Thrun, S. B. (1992). Efficient exploration in reinforcement learning. Technical report, Carnegie Mellon University, Pittsburgh, PA.

  • Todorov, E., & Li, W. (2005). A generalized iterative LQG method for locally-optimal feedback control of constrained nonlinear stochastic systems. In Proceedings of the 2005 American Control Conference (Vol. 1, pp. 300–306).

  • Wang, J. M., et al. (2008). Gaussian process dynamical models for human motion. IEEE Transactions on Pattern Analysis and Machine Intelligence, 30(2), 283–298.


Acknowledgments

This work was partly supported by a Grant-in-Aid for Young Scientists (14444719) and a Grant-in-Aid for JSPS Fellows (A15J01499).

Author information

Corresponding author

Correspondence to Yutaka Nakamura.

Cite this article

Okadome, Y., Nakamura, Y. & Ishiguro, H. A confidence-based roadmap using Gaussian process regression. Auton Robot 41, 1013–1026 (2017). https://doi.org/10.1007/s10514-016-9604-y
