Importance Sampling for Online Planning under Uncertainty

Algorithmic Foundations of Robotics XII

Part of the book series: Springer Proceedings in Advanced Robotics (SPAR, volume 13)

Abstract

The partially observable Markov decision process (POMDP) provides a principled general framework for robot planning under uncertainty. Building on Monte Carlo sampling, recent POMDP planning algorithms have scaled up to various challenging robotic tasks, such as real-time online planning for autonomous vehicles. To further improve online planning performance, this paper presents IS-DESPOT, which introduces importance sampling to DESPOT, a state-of-the-art sampling-based POMDP algorithm for planning under uncertainty. Importance sampling improves planning performance when there are critical but rare events that are difficult to sample. We prove that IS-DESPOT retains the theoretical guarantee of DESPOT. We present a general method for learning the importance sampling distribution and demonstrate empirically that importance sampling significantly improves the performance of online POMDP planning for suitable tasks.
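The abstract's point about critical but rare events can be illustrated with a generic toy example. The sketch below is not IS-DESPOT and is not taken from the chapter; it only shows, under assumed settings, how sampling from a shifted proposal distribution and reweighting by the likelihood ratio estimates a rare-event probability that plain Monte Carlo sampling almost never observes. The function names, the Gaussian target, and the choice of proposal are all hypothetical.

```python
import math
import random


def naive_estimate(n, threshold, rng):
    """Plain Monte Carlo estimate of P(X > threshold) for X ~ N(0, 1)."""
    hits = sum(1 for _ in range(n) if rng.gauss(0.0, 1.0) > threshold)
    return hits / n


def importance_estimate(n, threshold, rng):
    """Importance sampling estimate of the same probability.

    Assumption for this sketch: the proposal q is N(threshold, 1), so
    about half of the samples land in the rare region.
    """
    total = 0.0
    for _ in range(n):
        x = rng.gauss(threshold, 1.0)  # sample from the proposal q
        if x > threshold:
            # Likelihood ratio p(x) / q(x) for p = N(0, 1), q = N(threshold, 1).
            total += math.exp(-threshold * x + 0.5 * threshold ** 2)
    return total / n


def exact_tail(threshold):
    """Closed-form P(X > threshold) for a standard normal, for comparison."""
    return 0.5 * math.erfc(threshold / math.sqrt(2.0))


if __name__ == "__main__":
    rng = random.Random(0)
    t, n = 4.0, 10000
    print("exact     :", exact_tail(t))
    print("naive     :", naive_estimate(n, t, rng))
    print("importance:", importance_estimate(n, t, rng))
```

With a threshold of 4, the event has probability of roughly 3 in 100,000, so the naive estimator with 10,000 samples will usually see no hits at all, while the reweighted estimator stays close to the closed-form value. How the chapter chooses or learns its sampling distribution is described in the full text, not here.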

Author information

Corresponding author

Correspondence to David Hsu.

Copyright information

© 2020 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Luo, Y., Bai, H., Hsu, D., Lee, W.S. (2020). Importance Sampling for Online Planning under Uncertainty. In: Goldberg, K., Abbeel, P., Bekris, K., Miller, L. (eds) Algorithmic Foundations of Robotics XII. Springer Proceedings in Advanced Robotics, vol 13. Springer, Cham. https://doi.org/10.1007/978-3-030-43089-4_16
