
An Online POMDP Solver for Uncertainty Planning in Dynamic Environment

Robotics Research

Part of the book series: Springer Tracts in Advanced Robotics ((STAR,volume 114))

Abstract

Motion planning under uncertainty is important for reliable robot operation in uncertain and dynamic environments. The Partially Observable Markov Decision Process (POMDP) is a general and systematic framework for motion planning under uncertainty. To cope well with a dynamic environment, we often need to modify the POMDP model at runtime. However, despite recent tremendous advances in POMDP planning, most solvers are not fast enough to generate a good solution when the POMDP model changes at runtime. Recent progress in online POMDP solvers has shown promising results. However, most online solvers are based on replanning: they recompute a solution from scratch at each step, discarding whatever solution has been computed so far and hence wasting valuable computational resources. In this paper, we propose a new online POMDP solver, called Adaptive Belief Tree (ABT), that can reuse and improve an existing solution, updating it as needed whenever the POMDP model changes. Given enough time, ABT converges in probability to the optimal solution of the current POMDP model. Preliminary results on three distinct robotics tasks in dynamic environments are promising. In all test scenarios, ABT generates similar or better solutions faster than the fastest online POMDP solver available today, using an average of less than 50 ms of computation time per step.
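To make the replanning-versus-reuse distinction concrete, here is a minimal Python sketch of the belief-tree bookkeeping an online solver needs in order to carry previously computed work across steps. All names (`BeliefNode`, `advance`, `transition`) are illustrative assumptions, not the authors' ABT implementation; the point is only that moving the root to the child reached by the executed action and received observation preserves the subtree and its value estimates, whereas a from-scratch replanner would discard them.

```python
import random

class BeliefNode:
    """A belief-tree node: a particle set approximating the belief,
    per-action value statistics, and children keyed by
    (action, observation) pairs. Illustrative only, not ABT's code."""
    def __init__(self, particles):
        self.particles = particles
        self.children = {}   # (action, observation) -> BeliefNode
        self.stats = {}      # action -> [visit_count, value_estimate]

def best_action(node, actions):
    """Return the action with the highest value estimate at this node,
    or a random action if nothing has been tried yet."""
    tried = {a: v for a, (n, v) in node.stats.items() if n > 0}
    return max(tried, key=tried.get) if tried else random.choice(actions)

def advance(root, action, observation, transition):
    """Move the tree root after executing `action` and observing
    `observation`. If a matching subtree exists, reuse it, keeping all
    value estimates computed so far; otherwise build a fresh node by
    propagating the particles, which is effectively what a replanning
    solver does at every step."""
    child = root.children.get((action, observation))
    if child is None:
        particles = [transition(s, action) for s in root.particles]
        child = BeliefNode(particles)
        root.children[(action, observation)] = child
    return child
```

Handling runtime changes to the POMDP model would additionally require revalidating the affected parts of the kept subtree; that is the capability the paper attributes to ABT and which this sketch does not attempt.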

Vinay Yadav—All work was done while the author was an internship student at the University of Queensland.



Author information


Corresponding author

Correspondence to Hanna Kurniawati.


Copyright information

© 2016 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Kurniawati, H., Yadav, V. (2016). An Online POMDP Solver for Uncertainty Planning in Dynamic Environment. In: Inaba, M., Corke, P. (eds) Robotics Research. Springer Tracts in Advanced Robotics, vol 114. Springer, Cham. https://doi.org/10.1007/978-3-319-28872-7_35


  • DOI: https://doi.org/10.1007/978-3-319-28872-7_35


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-28870-3

  • Online ISBN: 978-3-319-28872-7

  • eBook Packages: Engineering (R0)
