Abstract
The partially observable Markov decision process (POMDP) provides a principled general framework for robot planning under uncertainty. Leveraging the idea of Monte Carlo sampling, recent POMDP planning algorithms have scaled up to various challenging robotic tasks, including real-time online planning for autonomous vehicles. To further improve online planning performance, this paper presents IS-DESPOT, which introduces importance sampling to DESPOT, a state-of-the-art sampling-based POMDP algorithm for planning under uncertainty. Importance sampling improves planning performance when there are critical but rare events that are difficult to sample. We prove that IS-DESPOT retains the theoretical guarantee of DESPOT. We present a general method for learning the importance sampling distribution and demonstrate empirically that importance sampling significantly improves the performance of online POMDP planning for suitable tasks.
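The core idea the abstract refers to, estimating quantities driven by critical but rare events by sampling from a proposal distribution and reweighting, can be illustrated outside the POMDP setting. The sketch below is a generic importance-sampling example (not the IS-DESPOT algorithm itself): it estimates the rare-event probability P(X > 4) for a standard normal X, comparing plain Monte Carlo against sampling from a hypothetical proposal N(4, 1) centred on the rare region.

```python
import math
import random

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of a normal distribution at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def estimate_rare_event(n=100_000, seed=0):
    """Estimate P(X > 4) for X ~ N(0, 1) two ways; true value ~3.17e-5."""
    rng = random.Random(seed)

    # Plain Monte Carlo: the event is so rare that 100k samples from
    # N(0, 1) typically contain only a handful of hits (often zero),
    # so the estimate has enormous relative variance.
    plain = sum(rng.gauss(0.0, 1.0) > 4.0 for _ in range(n)) / n

    # Importance sampling: draw from a proposal q = N(4, 1) centred on
    # the rare region, and reweight each hit by the likelihood ratio
    # p(x) / q(x), which keeps the estimator unbiased.
    total = 0.0
    for _ in range(n):
        x = rng.gauss(4.0, 1.0)  # sample from the proposal q
        if x > 4.0:
            total += normal_pdf(x) / normal_pdf(x, mu=4.0)
    return plain, total / n
```

Because roughly half the proposal samples land in the rare region, the reweighted estimator concentrates tightly around the true probability, whereas the plain estimator is dominated by sampling noise. Choosing the proposal distribution well is the crux, which is why the paper learns it from data.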
Copyright information
© 2020 Springer Nature Switzerland AG
Cite this chapter
Luo, Y., Bai, H., Hsu, D., Lee, W.S. (2020). Importance Sampling for Online Planning under Uncertainty. In: Goldberg, K., Abbeel, P., Bekris, K., Miller, L. (eds) Algorithmic Foundations of Robotics XII. Springer Proceedings in Advanced Robotics, vol 13. Springer, Cham. https://doi.org/10.1007/978-3-030-43089-4_16
DOI: https://doi.org/10.1007/978-3-030-43089-4_16
Print ISBN: 978-3-030-43088-7
Online ISBN: 978-3-030-43089-4
eBook Packages: Intelligent Technologies and Robotics (R0)