Average cost Markov decision processes under the hypothesis of Doeblin

  • Borel State Space
Annals of Operations Research

Abstract

Average cost Markov decision processes (MDPs) with compact state and action spaces and bounded lower semicontinuous cost functions are considered. Kurano [7] treated the general case, in which the Markov process induced by any randomized stationary policy may have several ergodic classes and a transient set, under the hypothesis of Doeblin, and showed the existence of a minimum pair of state and policy. This paper considers the same setting as Kurano [7] and proves new results that establish the existence of an optimal stationary policy under some reasonable conditions.
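
For orientation, the two notions contrasted in the abstract can be written out as follows. This is a sketch in standard average-cost notation, not the paper's own: the symbols c (one-step cost), S (state space), Π (set of all policies), ψ (long-run average cost) and f* are assumptions introduced here for illustration.

```latex
% Assumed notation: c = one-step cost, (x_t, a_t) = state-action process,
% S = state space, \Pi = set of all policies.

% Long-run expected average cost of policy \pi from initial state x:
\psi(x,\pi) \;=\; \limsup_{T\to\infty} \frac{1}{T}\,
  \mathbb{E}_x^{\pi}\!\left[\sum_{t=0}^{T-1} c(x_t,a_t)\right]

% Minimum pair (x^*,\pi^*): optimal jointly over initial states and policies.
\psi(x^*,\pi^*) \;=\; \inf_{x\in S,\ \pi\in\Pi} \psi(x,\pi)

% Optimal stationary policy f^*: optimal from every initial state.
\psi(x,f^*) \;\le\; \psi(x,\pi) \quad \text{for all } x\in S,\ \pi\in\Pi
```

Under these assumed definitions a minimum pair only guarantees optimality from one particular initial state, which is why the existence of an optimal stationary policy is the stronger statement.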

References

  1. D.P. Bertsekas and S.D. Shreve, Stochastic Optimal Control: The Discrete Time Case (Academic Press, 1978).

  2. V.S. Borkar, Controlled Markov chains and stochastic networks, SIAM J. Control Optim. 21 (1983) 652–666.

  3. V.S. Borkar, On minimum cost per unit time control of Markov chains, SIAM J. Control Optim. 22 (1984) 965–978.

  4. J.L. Doob, Stochastic Processes (Wiley, New York, 1953).

  5. S.N. Ethier and T.G. Kurtz, Markov Processes, Characterization and Convergence (Wiley, New York, 1986).

  6. M. Kurano, Markov decision processes with a Borel measurable cost function — the average case, Math. Oper. Res. 11 (1986) 309–320.

  7. M. Kurano, The existence of a minimum pair of state and policy for Markov decision processes under the hypothesis of Doeblin, SIAM J. Control Optim. 27 (1989) 296–307.

  8. M. Loève, Probability Theory, 2nd ed. (Van Nostrand, Princeton, NJ, 1960).

  9. S.M. Ross, Arbitrary state Markovian decision processes, Ann. Math. Statist. 39 (1968) 2118–2122.

  10. R.E. Strauch, Negative dynamic programming, Ann. Math. Statist. 37 (1966) 871–890.

  11. H.C. Tijms, On dynamic programming with arbitrary state space, compact action space and the average return as criterion, Report BW 55/75, Math. Centrum, Amsterdam (1975).

  12. J. Wijngaard, Stationary Markovian decision problems and perturbation theory of quasi-compact linear operators, Math. Oper. Res. 2 (1977) 91–102.


Cite this article

Kurano, M. Average cost Markov decision processes under the hypothesis of Doeblin. Ann Oper Res 29, 375–385 (1991). https://doi.org/10.1007/BF02283606

  • DOI: https://doi.org/10.1007/BF02283606
