Skip to main content
Log in

Finite approximation of the first passage models for discrete-time Markov decision processes with varying discount factors

  • Published:
Discrete Event Dynamic Systems Aims and scope Submit manuscript

Abstract

This paper deals with the finite approximation of the first passage models for discrete-time Markov decision processes with varying discount factors. For a given control model \(\mathcal {M}\) with denumerable states and compact Borel action sets, and possibly unbounded reward functions, under reasonable conditions we prove that there exists a sequence of control models \(\mathcal {M}_{n}\) such that the first passage optimal rewards and policies of \(\mathcal {M}_{n}\) converge to those of \(\mathcal {M}\), respectively. Based on the convergence theorems, we propose a finite-state and finite-action truncation method for the given control model \(\mathcal {M}\), and show that the first passage optimal reward and policies of \(\mathcal {M}\) can be approximated by those of the solvable truncated finite control models. Finally, we give the corresponding value and policy iteration algorithms to solve the finite approximation models.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Similar content being viewed by others

References

  • Altman E (1994) Denumerable constrainedMarkov decision processes and finite approximations.MathMeth Operat Res 19:169–191

    Article  MathSciNet  MATH  Google Scholar 

  • Alvarez-Mena J, Hernández-Lerma O (2002) Convergence of the optimal values of constrained Markov control processes. Math Meth Oper Res 55:461–484

    Article  MathSciNet  MATH  Google Scholar 

  • Bäuerle N, Rieder U (2011) Markov decision processes with applications in finance. Springer, Heidelberg

    Book  MATH  Google Scholar 

  • Derman C (1970) Finite State Markovian Decision Processes, vol. 67 of Mathematics in Science and Engineering. Academic Press, New York

    Google Scholar 

  • González-Hernández J, López-Martínez RR, Minjárez-Sosa JA(2009) Approximation,estimation, and control of stochastic systems under a randomized discounted cost criterion. Kybernetika 45:737–754

    MathSciNet  MATH  Google Scholar 

  • Guo XP, Hernández-del-Valle A, Hernández-Lerma O (2012) First passage problems for nonstationary discrete-time stochastic control systems. Eur J Control 18:528–538

    Article  MathSciNet  MATH  Google Scholar 

  • Guo XP, Zhang WZ (2014) Convergence of controlled models and finite-state approximation for discounted continuous-timeMarkov decision processes with constraints. Eur J Oper Res 238:486–496

    Article  MATH  Google Scholar 

  • Guo XY, Song XP, Zhang Y (2014) First passage criteria for continuous-time Markov decision processes with varying discount factors and history-dependent policies. IEEE Trans Automat Control 59:163–174

    Article  MathSciNet  Google Scholar 

  • Hernández-Lerma O, Lasserre JB (1996) Discrete-Time Markov Control Processes. Springer-Verlag, New York

    Book  MATH  Google Scholar 

  • Hernández-Lerma O, Lasserre JB (1999) Further topics on discrete-timeMarkov control processes. Springer-Verlag, New York

    Book  MATH  Google Scholar 

  • Hinderer K, Waldmann K-H (2005) Algorithms for countable state Markov decision models with an absorbing set. SIAM J Control Optim 43:2109–2131

    Article  MathSciNet  MATH  Google Scholar 

  • Huang YH, Guo XP (2011) First passage models for denumerable semi-Markov decision processes with nonnegative discounted costs. Acta Math Appl Sinica 27:177–190

    Article  MathSciNet  MATH  Google Scholar 

  • Huang YH, Wei QD, Guo XP (2013) Constrained Markov decision processes with first passage criteria. Ann Oper Res 206:197–C219

    Article  MathSciNet  MATH  Google Scholar 

  • Kitaev MY, Rykov VV (1995) Controlled Queueing Systems. CRC Press

  • Liu JY, Huang SM (2001) Markov decision processes with distribution function criterion of first-passage time. Appl Math Optim 43:187–201

    Article  MathSciNet  MATH  Google Scholar 

  • Liu JY, Liu K (1992) Markov decision programming-the first passage model with denumerable state space. Systems Sci Math Sci 5:340–351

    MathSciNet  MATH  Google Scholar 

  • Meyn SP, Tweedie RL (2009) Markov Chains and Stochastic Stability. Cambridge University Press, New York

    Book  MATH  Google Scholar 

  • Prieto-Rumeau T, Hernández-Lerma O (2012) Discounted continuous-time controlled Markov chains: convergence of control models. J Appl Prob 49:1072–1090

    MathSciNet  MATH  Google Scholar 

  • Puterman ML (1994) Markov Decision Processes. Wiley, New York

    Book  MATH  Google Scholar 

  • Schäl M (2005) Control of ruin probabilities by discrete-time investments. Math Methods Oper Res 62:141–158

    Article  MathSciNet  MATH  Google Scholar 

  • Schmidli H (2008) Stochastic Contol in Insurance. Springer-Verlag, London

    MATH  Google Scholar 

  • Wei QD, Guo XP (2011) Markov decision processes with state-dependent discount factors and unbounded rewards/costs. Oper Res Lett 39:369–374

    MathSciNet  MATH  Google Scholar 

  • Wu X, Guo XP (2015) First passage optimality and variance minimization of Markov decision processes with varying discount factors. J Appl Prob 52, no.2 (to be publishes)

  • Yu SX, Lin YL, Yan PF (1998) Optimization models for the first arrival target distribution function in discrete time. J Math Anal Appl 225:193–223

    Article  MathSciNet  MATH  Google Scholar 

Download references

Acknowledgments

Research of the first coauthor was partially supported by National Natural Science Foundation of China (NSFC no. 41271076). Research of the second coauthor was partially supported by NSFC (no. 61004037), Research Fund for the Doctoral Program of Higher Education of China, the Fundamental Research Funds for the Central Universities, and by Guangdong Province Key Laboratory of Computational Science.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Junyu Zhang.

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Wu, X., Zhang, J. Finite approximation of the first passage models for discrete-time Markov decision processes with varying discount factors. Discrete Event Dyn Syst 26, 669–683 (2016). https://doi.org/10.1007/s10626-014-0209-3

Download citation

  • Received:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s10626-014-0209-3

Keywords

Navigation