On strong average optimality of Markov decision processes with unbounded costs

https://doi.org/10.1016/0167-6377(92)90040-A

Abstract

We consider average-cost Markov decision processes with a countable state space, a compact action space, and unbounded costs. Under a condition that penalizes the cost of unstable behavior, we establish the existence of a stable stationary strategy that is strong average optimal. A sketch of the optimality criterion is given below.
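For context, one common formulation of strong average optimality, following Flynn's finite-horizon criteria, can be sketched as follows; the notation $J_n$ is assumed here for illustration and the paper should be consulted for its precise statement. Writing $J_n(\pi, x)$ for the expected total cost incurred over the first $n$ stages under a strategy $\pi$ starting from state $x$, a strategy $\pi^*$ is strong average optimal if, for every strategy $\pi$ and every initial state $x$,

\[
\limsup_{n \to \infty} \frac{1}{n}\bigl( J_n(\pi^*, x) - J_n(\pi, x) \bigr) \le 0 .
\]

This requirement is stronger than ordinary average optimality, which only compares the long-run average costs $\limsup_{n \to \infty} J_n(\pi, x)/n$ of the two strategies.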


    This research was supported in part by the Texas Advanced Research Program (Advanced Technology Program) under Grant No. 003658-093, in part by the Air Force Office of Scientific Research under Grants AFOSR-86-0029 and AFOSR-91-0033, in part by the National Science Foundation under Grant ECS-8617860, and in part by the Air Force Office of Scientific Research (AFSC) under Contract F49620-89-C-0044.

    ∗∗ Systems Research Center, University of Maryland, College Park, MD 20742, USA.
