New sufficient conditions for average optimality in continuous-time Markov decision processes

  • Original Article
  • Published in: Mathematical Methods of Operations Research

Abstract

This paper studies continuous-time Markov decision processes with general state and action spaces under the long-run expected average reward criterion. The transition rates of the underlying continuous-time Markov processes are allowed to be unbounded, and the reward rates may have neither upper nor lower bounds. We provide new sufficient conditions for the existence of average optimal policies. Moreover, these sufficient conditions are imposed on the primitive data of the controlled process and are therefore directly verifiable. Finally, we apply our results to two new examples.
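For readers unfamiliar with the criterion, the long-run expected average reward can be sketched as follows (the notation here is ours for illustration, not necessarily the paper's):

```latex
% Long-run expected average reward of a policy \pi from initial state x,
% where r is the reward rate and (x(t), a(t)) the state-action process under \pi:
\[
  J(x,\pi) \;:=\; \liminf_{T \to \infty} \frac{1}{T}\,
  \mathbb{E}^{\pi}_{x}\!\left[ \int_{0}^{T} r\bigl(x(t), a(t)\bigr)\, dt \right].
\]
% A policy \pi^* is average optimal if
% J(x,\pi^*) \ge J(x,\pi) for every policy \pi and every initial state x.
```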
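As a concrete illustration of the criterion (a minimal sketch, not taken from the paper: it assumes a finite state space, bounded rates, and a fixed stationary policy, precisely the restrictions the paper removes): under a stationary policy the CTMDP reduces to a continuous-time Markov chain with generator `Q`, and if that chain is ergodic the average reward equals `mu @ r`, where `mu` is the stationary distribution.

```python
import numpy as np

def average_reward(Q, r):
    """Long-run average reward mu @ r, where mu is the stationary
    distribution of the ergodic generator Q (mu Q = 0, sum(mu) = 1)."""
    n = Q.shape[0]
    # Stack the balance equations Q^T mu = 0 with the normalization
    # sum(mu) = 1 and solve the (overdetermined but consistent) system.
    A = np.vstack([Q.T, np.ones(n)])
    b = np.zeros(n + 1)
    b[-1] = 1.0
    mu, *_ = np.linalg.lstsq(A, b, rcond=None)
    return float(mu @ r)

# Two-state example: transition rates 0 -> 1 at rate 2, 1 -> 0 at rate 1.
Q = np.array([[-2.0,  2.0],
              [ 1.0, -1.0]])
r = np.array([0.0, 3.0])  # reward rate per unit time in each state
# Analytically mu = (1/3, 2/3), so the average reward is 2.
print(average_reward(Q, r))
```

The paper's contribution is sufficient conditions under which an average optimal policy exists even when no such finite reduction is available, i.e. with general Polish state spaces, unbounded transition rates, and unbounded reward rates.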



Author information


Corresponding author

Correspondence to Xianping Guo.

Additional information

Research supported by NSFC.


About this article

Cite this article

Ye, L., Guo, X. New sufficient conditions for average optimality in continuous-time Markov decision processes. Math Meth Oper Res 72, 75–94 (2010). https://doi.org/10.1007/s00186-010-0307-4

