Robust probabilistic planning with ILAO*

Moreira, Daniel A. M.; Delgado, Karina Valdivia; Nunes de Barros, Leliane

doi:10.1007/s10489-016-0780-4

Published: 15 April 2016

Volume 45, pages 662–672, (2016)

Applied Intelligence

Daniel A. M. Moreira¹,
Karina Valdivia Delgado ORCID: orcid.org/0000-0002-9120-8987² &
Leliane Nunes de Barros¹

Abstract

In probabilistic planning problems, which are usually modeled as Markov Decision Processes (MDPs), it is often difficult or impossible to obtain an accurate estimate of the state transition probabilities. This limitation can be overcome by modeling these problems as Markov Decision Processes with imprecise probabilities (MDP-IPs). Robust LAO* and Robust LRTDP are efficient algorithms for solving a special class of MDP-IPs in which the probabilities lie in a given interval, known as Bounded-Parameter Stochastic Shortest Path MDPs (BSSP-MDPs). However, they do not make clear what assumptions must be made to find a robust solution (the best policy under the worst model). In this paper, we propose a new efficient algorithm for BSSP-MDPs, called Robust ILAO*, which performs better than Robust LAO* and Robust LRTDP, considered the state of the art in robust probabilistic planning. We also define the assumptions required to ensure a robust solution and prove that the Robust ILAO* algorithm converges to optimal values if the initial value of all states is admissible.
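To make "the best policy under the worst model" concrete, the following is a minimal sketch of a single robust Bellman backup for one state when each transition probability is known only to lie in an interval. It is not the Robust ILAO* algorithm from the article, whose details are not shown on this page. It uses the standard greedy construction for interval models: start every successor at its lower bound, then give the remaining probability mass to the most costly successors first. All names (`worst_case_distribution`, `robust_backup`, the states and actions) are hypothetical, and the sketch assumes a cost-minimizing problem whose intervals are consistent (lower bounds sum to at most 1, upper bounds to at least 1).

```python
from typing import Dict, Tuple

# (lower bound, upper bound) on one transition probability; hypothetical type name
Interval = Tuple[float, float]


def worst_case_distribution(
    intervals: Dict[str, Interval], values: Dict[str, float]
) -> Dict[str, float]:
    """Choose the distribution within the intervals that maximizes expected cost-to-go."""
    # Every successor receives at least its lower bound.
    probs = {s: lo for s, (lo, _hi) in intervals.items()}
    slack = 1.0 - sum(probs.values())
    # Hand the remaining mass to the highest-cost successors first.
    for s in sorted(intervals, key=lambda t: values[t], reverse=True):
        lo, hi = intervals[s]
        extra = min(hi - lo, slack)
        probs[s] += extra
        slack -= extra
    return probs


def robust_backup(
    actions: Dict[str, Dict[str, Interval]],
    costs: Dict[str, float],
    values: Dict[str, float],
) -> Tuple[str, float]:
    """Return the action with the lowest worst-case Q-value, and that value."""
    best_action, best_q = "", float("inf")
    for action, intervals in actions.items():
        p = worst_case_distribution(intervals, values)
        q = costs[action] + sum(p[s] * values[s] for s in p)
        if q < best_q:
            best_action, best_q = action, q
    return best_action, best_q


# Illustrative data (assumed, not from the article): current value estimates of two successors.
values = {"goal": 0.0, "s1": 10.0}
actions = {
    "a": {"goal": (0.5, 0.75), "s1": (0.25, 0.5)},
    "b": {"goal": (0.75, 1.0), "s1": (0.0, 0.25)},
}
costs = {"a": 1.0, "b": 2.0}

print(robust_backup(actions, costs, values))
```

In this toy instance the adversary pushes as much mass as the intervals allow onto `s1`, so action `a` has a worst-case value of 6.0 and action `b` a worst-case value of 4.5. The backup therefore selects `b` even though it has the higher immediate cost.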

Acknowledgments

We thank the São Paulo Research Foundation for the financial support (FAPESP grant #2015/01587-0).

Author information

Authors and Affiliations

Institute of Mathematics and Statistics, University of São Paulo, R. do Matão 1010, Butantã, São Paulo, Brazil
Daniel A. M. Moreira & Leliane Nunes de Barros
School of Arts, Sciences and Humanities, University of São Paulo, Av. Arlindo Béttio 1000, Ermelino Matarazzo, São Paulo, Brazil
Karina Valdivia Delgado

Corresponding author

Correspondence to Karina Valdivia Delgado.

About this article

Cite this article

Moreira, D.A.M., Delgado, K.V. & Nunes de Barros, L. Robust probabilistic planning with ILAO*. Appl Intell 45, 662–672 (2016). https://doi.org/10.1007/s10489-016-0780-4

Issue Date: October 2016
DOI: https://doi.org/10.1007/s10489-016-0780-4
