Abstract
We study the planning and verification problems for very large probabilistic systems, such as Markov decision processes (MDPs), from a complexity point of view. More precisely, we address the problem of designing an efficient approximation method to compute a near-optimal policy for the planning problem in discounted MDPs, together with the satisfaction probabilities of properties of interest, such as reachability or safety, over the Markov chain obtained by restricting the MDP to the near-optimal policy. We present two different approaches. The first is based on sparse sampling, while the second uses a variant of the multiplicative weights update algorithm. The complexity of the first approximation method is independent of the size of the state space, and the method requires only a probabilistic generator of the MDP. We give a complete analysis of this approach, whose main control parameter is the targeted quality of the approximation. The second approach is more prospective and differs in that the method can be controlled dynamically by observing its speed of convergence. Parts of this paper were presented in Lassaigne and Peyronnet (in Proceedings of the ACM Symposium on Applied Computing, SAC 2012, pp 1314–1319, ACM 2012), by the same authors.
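The first approach builds on the sparse sampling planner of Kearns, Mansour and Ng (2002): instead of enumerating the state space, the planner draws a fixed number of successor states per action from a generative model of the MDP and recurses to a bounded horizon, which is why its cost is independent of the number of states. The following is a minimal illustrative sketch only, not the paper's algorithm or analysis; the function names, parameter names, and the toy generative model are ours.

```python
def sparse_sample_value(generate, actions, state, depth, width, gamma):
    """Estimate the optimal discounted value of `state` by sparse sampling.

    `generate(s, a)` draws a pair (next_state, reward) from the MDP's
    generative model; only sampled states are ever touched, so the cost
    depends on `depth` and `width`, not on the size of the state space.
    """
    if depth == 0:
        return 0.0
    best = float("-inf")
    for a in actions:
        total = 0.0
        for _ in range(width):  # `width` samples per action
            s2, r = generate(state, a)
            total += r + gamma * sparse_sample_value(
                generate, actions, s2, depth - 1, width, gamma)
        best = max(best, total / width)
    return best

def sparse_sample_policy(generate, actions, state, depth, width, gamma):
    """Return the action that is greedy w.r.t. the sampled Q-values."""
    def q(a):
        total = 0.0
        for _ in range(width):
            s2, r = generate(state, a)
            total += r + gamma * sparse_sample_value(
                generate, actions, s2, depth - 1, width, gamma)
        return total / width
    return max(actions, key=q)
```

On a toy deterministic model where the action "stay" always yields reward 1 and "leave" yields 0, the sketch recovers the truncated discounted value exactly; with a stochastic generator, the number of samples per action controls the estimation error, and the total work is on the order of (|actions| · width)^depth calls to the generator.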
References
Alur, R., Henzinger, T.A.: Reactive modules. Formal Methods in System Design 15(1), 7–48 (1999)
Bianco, A., de Alfaro, L.: Model checking of probabilistic and nondeterministic systems. In: Proc. 15th Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS), volume 1026 of Lecture Notes in Computer Science, pp. 499–513. Springer, Berlin (1995)
Arora, S., Hazan, E., Kale, S.: The multiplicative weights update method: a meta-algorithm and applications. Theory Comput. 8(1), 121–164 (2012)
Auer, P., Cesa-Bianchi, N., Fischer, P.: Finite-time analysis of the multiarmed bandit problem. Mach. Learn. 47(2–3), 235–256 (2002)
Bertsekas, D.P., Castanon, D.A.: Rollout algorithms for stochastic scheduling problems. J. Heuristics 5(1), 89–108 (1999)
Chang, H.S., Fu, M.C., Hu, J., Marcus, S.I.: A survey of some simulation-based algorithms for Markov decision processes. Commun. Inf. Systems 7(1), 59–92 (2007)
Courcoubetis, C., Yannakakis, M.: The complexity of probabilistic verification. J. ACM (JACM) 42(4), 857–907 (1995)
Fearnley, J.: Exponential lower bounds for policy iteration. In: Automata, Languages and Programming, 37th International Colloquium, ICALP 2010, Bordeaux, France, July 6–10, 2010, Proceedings, Part II, volume 6199 of Lecture Notes in Computer Science, pp. 551–562. Springer, Berlin (2010)
Friedmann, O.: An exponential lower bound for the parity game strategy improvement algorithm as we know it. In: Proceedings of the 24th Annual IEEE Symposium on Logic in Computer Science, LICS 2009, 11–14 August 2009, Los Angeles, CA, USA, pp. 145–156. IEEE Computer Society (2009)
Guirado, G., Hérault, T., Lassaigne, R., Peyronnet, S.: Distribution, approximation and probabilistic model checking. Electr. Notes Theor. Comput. Sci. - Proc. of Parallel and Distributed Model Checking (PDMC) 135(2), 19–30 (2006)
Hamidouche, K., Borghi, A., Esterie, P., Falcou, J., Peyronnet, S.: Three high performance architectures in the parallel approximate probabilistic model checking boat. In: Proc. of Parallel and Distributed Model Checking (PDMC) (2010)
Hansen, T.D., Miltersen, P.B., Zwick, U.: Strategy iteration is strongly polynomial for 2-player turn-based stochastic games with a constant discount factor. In: Innovations in Computer Science - ICS 2010, Tsinghua University, Beijing, China, January 7–9, 2011. Proceedings, pp. 253–263. Tsinghua University Press (2011)
Henriques, D., Martins, J.G., Zuliani, P., Platzer, A., Clarke, E.M.: Statistical model checking for Markov decision processes. In: Ninth International Conference on Quantitative Evaluation of Systems (QEST 2012), pp. 84–93 (2012)
Hérault, T., Lassaigne, R., Magniette, F., Peyronnet, S.: Approximate probabilistic model checking. In: Proceedings of the 5th verification, model checking, and abstract interpretation (VMCAI), volume 2937 of Lecture Notes in Computer Science, pp. 73–84. Springer, Berlin (2004)
Hérault, T., Lassaigne, R., Peyronnet, S.: APMC 3.0: Approximate verification of discrete and continuous time Markov chains. In: Third International Conference on the Quantitative Evaluation of Systems (QEST), pp. 129–130. IEEE Computer Society (2006)
Hinton, A., Kwiatkowska, M.Z., Norman, G., Parker, D.: PRISM: A tool for automatic verification of probabilistic systems. In: Tools and algorithms for construction and analysis of systems (TACAS), volume 3920 of Lecture Notes in Computer Science, pp. 441–444. Springer, Berlin (2006)
Hoeffding, W.: Probability inequalities for sums of bounded random variables. J. Am. Stat. Assoc. 58(301), 13–30 (1963)
Howard, R.A.: Dynamic Programming and Markov Process. MIT Press, Cambridge (1960)
Karp, R.M., Luby, M.: Monte Carlo algorithms for enumeration and reliability problems. In: Proc. of the 24th Annual Symposium on Foundations of Computer Science (FOCS), pp. 56–64. IEEE (1983)
Kearns, M., Mansour, Y., Ng, A.Y.: A sparse sampling algorithm for near-optimal planning in large Markov decision processes. Mach. Learn. 49(2–3), 193–208 (2002)
Kwiatkowska, M., Norman, G., Parker, D.: PRISM 4.0: Verification of probabilistic real-time systems. In: Gopalakrishnan, G., Qadeer, S. (eds.), Proc. 23rd International Conference on Computer Aided Verification (CAV’11), volume 6806 of LNCS, pp. 585–591. Springer, Berlin (2011)
Lassaigne, R., Peyronnet, S.: Approximate planning and verification for large Markov decision processes. In: Proceedings of the ACM Symposium on Applied Computing, SAC 2012, pp. 1314–1319. ACM (2012)
Lassaigne, R., Peyronnet, S.: Probabilistic verification and approximation. Ann. Pure Appl. Logic 152(1–3), 122–131 (2008)
Legay, A., Sedwards, S.: Lightweight Monte Carlo algorithm for Markov decision processes. arXiv preprint, arXiv:1310.3609 (2013)
Puterman, M.L.: Markov Decision Processes: Discrete Stochastic Dynamic Programming. John Wiley and Sons, New York (1994)
Segala, R., Lynch, N.: Probabilistic simulations for probabilistic processes. Nordic J. Comput. 2(2), 250–273 (1995)
Tesauro, G., Galperin, G.R.: On-line policy improvement using Monte Carlo search. Adv. Neural Inf. Process. Systems, 1068–1074 (1997)
Vardi, M.Y.: Automatic verification of probabilistic concurrent finite state programs. In: Proc. of the 26th Annual Symposium on Foundations of Computer Science (FOCS), pp. 327–338 (1985)
Ye, Y.: A new complexity result on solving the Markov decision problem. Math. Oper. Res. 30(3), 733–749 (2005)
Cite this article
Lassaigne, R., Peyronnet, S. Approximate planning and verification for large Markov decision processes. Int J Softw Tools Technol Transfer 17, 457–467 (2015). https://doi.org/10.1007/s10009-014-0344-z