Performance Optimization of Semi-Markov Decision Processes with Discounted-cost Criteria

https://doi.org/10.3166/ejc.14.213-222

We study discounted-cost performance optimization for a class of semi-Markov decision processes (SMDPs). We define a matrix that can serve as the infinitesimal generator of a Markov process and use it to formulate the discounted Poisson equation for an SMDP, from which the α-potential is defined. We give the optimality equation satisfied by the optimal stationary policy and discuss the relation between the discounted model and the average model. We propose two iterative algorithms for finding ε-optimal policies and prove their convergence. A numerical example illustrates the application of the algorithms.
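The abstract does not reproduce the matrix, the equations, or the two algorithms, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes a finite state space, a generator-like matrix A^a for each action a, a cost-rate vector f^a, and a discount rate α > 0, and it takes the discounted Poisson equation to have the standard continuous-time form (αI − A)g = f. The array layout, the function names, and the use of plain policy iteration are illustrative choices, not the paper's algorithms.

```python
import numpy as np

def alpha_potential(A, f, alpha):
    """Solve the assumed discounted Poisson equation (alpha*I - A) g = f."""
    n = A.shape[0]
    return np.linalg.solve(alpha * np.eye(n) - A, f)

def policy_iteration(A_by_action, f_by_action, alpha, max_iter=100):
    """Policy iteration sketch for a finite model with discount rate alpha.

    A_by_action[a] is the (hypothetical) generator under action a,
    f_by_action[a] the matching cost-rate vector.
    """
    n_actions, n_states, _ = A_by_action.shape
    states = np.arange(n_states)
    policy = np.zeros(n_states, dtype=int)
    g = None
    for _ in range(max_iter):
        # Assemble the generator and cost rate row by row for the current policy.
        A = A_by_action[policy, states, :]
        f = f_by_action[policy, states]
        g = alpha_potential(A, f, alpha)
        # Improvement step: minimize f(i, a) + sum_j A^a(i, j) g(j) over actions.
        q = f_by_action + A_by_action @ g  # shape (n_actions, n_states)
        candidate = np.argmin(q, axis=0)
        # Keep the current action on ties so the iteration cannot cycle.
        tie = np.isclose(q[policy, states], q[candidate, states])
        candidate = np.where(tie, policy, candidate)
        if np.array_equal(candidate, policy):
            break
        policy = candidate
    return policy, g

# Made-up two-state, two-action data, for illustration only.
A_demo = np.array([[[-1.0, 1.0], [2.0, -2.0]],
                   [[-3.0, 3.0], [1.0, -1.0]]])
f_demo = np.array([[1.0, 3.0],
                   [2.0, 1.0]])
policy_demo, g_demo = policy_iteration(A_demo, f_demo, alpha=0.5)
```

On convergence, the returned vector satisfies the assumed optimality equation αg(i) = min_a { f(i, a) + Σ_j A^a(i, j) g(j) }, which can be checked numerically on the demo data.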

