Robust probabilistic planning with ILAO*

Applied Intelligence

Abstract

Probabilistic planning problems are usually modeled as Markov Decision Processes (MDPs), but it is often difficult, or even impossible, to obtain an accurate estimate of the state transition probabilities. This limitation can be overcome by modeling these problems as Markov Decision Processes with imprecise probabilities (MDP-IPs). Robust LAO* and Robust LRTDP are efficient algorithms for solving a special class of MDP-IPs in which each probability lies in a given interval, known as Bounded-Parameter Stochastic Shortest Path MDPs (BSSP-MDPs). However, they do not make clear what assumptions must be made to find a robust solution (the best policy under the worst model). In this paper, we propose a new efficient algorithm for BSSP-MDPs, called Robust ILAO*, which performs better than Robust LAO* and Robust LRTDP, considered the state of the art in robust probabilistic planning. We also define the assumptions required to ensure a robust solution and prove that the Robust ILAO* algorithm converges to optimal values if the initial value of all states is admissible.
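To make "the best policy under the worst model" concrete, the following is a minimal Python sketch of a single robust Bellman backup for a cost-minimizing problem with interval transition probabilities. It is not the Robust ILAO* algorithm from the paper. It only illustrates the inner worst-case step commonly used for bounded-parameter MDPs, where leftover probability mass is assigned greedily to the most costly successors. All names (`worst_case_distribution`, `robust_backup`, the toy states and numbers) are hypothetical, and the sketch assumes the intervals are consistent, meaning the lower bounds sum to at most 1 and the upper bounds sum to at least 1.

```python
from typing import Dict, List, Optional, Tuple

Interval = Tuple[float, float]  # (lower, upper) bound on one transition probability


def worst_case_distribution(intervals: Dict[str, Interval],
                            values: Dict[str, float]) -> Dict[str, float]:
    """Choose probabilities inside the intervals that maximize expected cost.

    Assumes sum(lower) <= 1 <= sum(upper); this is not checked here.
    """
    # Every successor receives at least its lower bound.
    probs = {s: lo for s, (lo, _hi) in intervals.items()}
    remaining = 1.0 - sum(probs.values())
    # Hand the leftover mass to the most expensive successors first.
    for s in sorted(intervals, key=lambda name: values[name], reverse=True):
        lo, hi = intervals[s]
        extra = min(hi - lo, remaining)
        probs[s] += extra
        remaining -= extra
    return probs


def robust_backup(state: str,
                  actions: Dict[str, List[str]],
                  cost: Dict[Tuple[str, str], float],
                  transitions: Dict[Tuple[str, str], Dict[str, Interval]],
                  values: Dict[str, float]) -> Tuple[Optional[str], float]:
    """Return the action minimizing the worst-case expected cost, and that cost."""
    best_action, best_value = None, float("inf")
    for a in actions[state]:
        probs = worst_case_distribution(transitions[(state, a)], values)
        q = cost[(state, a)] + sum(p * values[s2] for s2, p in probs.items())
        if q < best_value:
            best_action, best_value = a, q
    return best_action, best_value


# Hypothetical toy problem: from s0, two actions lead to "goal" or to "s1".
VALUES = {"goal": 0.0, "s1": 10.0}  # current cost-to-go estimates
ACTIONS = {"s0": ["safe", "risky"]}
COST = {("s0", "safe"): 2.0, ("s0", "risky"): 1.0}
TRANSITIONS = {
    ("s0", "safe"): {"goal": (0.6, 0.9), "s1": (0.1, 0.4)},
    ("s0", "risky"): {"goal": (0.3, 0.8), "s1": (0.2, 0.7)},
}

best_action, best_value = robust_backup("s0", ACTIONS, COST, TRANSITIONS, VALUES)
```

In this toy instance the adversarial model pushes as much probability as the intervals allow toward `s1`, so the cheaper `risky` action ends up with the higher worst-case cost and the backup prefers `safe`. A full planner would repeat such backups over the states reachable under the current best partial policy.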

Acknowledgments

We thank the São Paulo Research Foundation for the financial support (FAPESP grant #2015/01587-0).

Author information

Corresponding author

Correspondence to Karina Valdivia Delgado.

About this article

Cite this article

Moreira, D.A.M., Delgado, K.V. & Nunes de Barros, L. Robust probabilistic planning with ILAO*. Appl Intell 45, 662–672 (2016). https://doi.org/10.1007/s10489-016-0780-4

  • DOI: https://doi.org/10.1007/s10489-016-0780-4
