
Value iteration and approximately optimal stationary policies in finite-state average Markov decision chains

Published in: Mathematical Methods of Operations Research

Abstract.

This work concerns finite-state Markov decision chains endowed with the long-run average reward criterion. Assuming that the optimality equation has a solution, it is shown that a nearly optimal stationary policy, as well as an approximation to the optimal average reward within a specified error, can be obtained in a finite number of steps of the value iteration method. These results extend others already available in the literature, which were established under more stringent restrictions on the ergodic structure of the decision process.
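The procedure the abstract refers to can be illustrated by a standard relative value iteration with a span-based stopping rule: iterate the dynamic-programming operator, and stop once the span of successive differences falls below a tolerance, at which point a greedy policy is nearly optimal and the midpoint of the differences approximates the optimal average reward. The following is a minimal sketch under common textbook assumptions (not the paper's exact conditions); the MDP encoding as arrays `P` and `r` and the tolerance `eps` are illustrative choices, not taken from the paper.

```python
import numpy as np

def relative_value_iteration(P, r, eps=1e-6, max_iter=10_000):
    """Value iteration for an average-reward MDP (illustrative sketch).

    P: (A, S, S) array, P[a, s, s'] = transition probability under action a.
    r: (A, S) array, r[a, s] = one-step reward for action a in state s.

    Stops when span(V_{n+1} - V_n) < eps; the greedy policy is then
    nearly optimal, and the midpoint of the differences approximates
    the optimal average reward within the tolerance.
    """
    A, S, _ = P.shape
    V = np.zeros(S)
    for _ in range(max_iter):
        Q = r + P @ V                 # Q[a, s] = r(a, s) + E[V(next state)]
        V_new = Q.max(axis=0)         # dynamic-programming operator
        diff = V_new - V
        span = diff.max() - diff.min()
        V = V_new - V_new.min()       # renormalize to keep iterates bounded
        if span < eps:
            break
    gain = 0.5 * (diff.max() + diff.min())  # approximate optimal average reward
    policy = Q.argmax(axis=0)               # greedy (nearly optimal) policy
    return gain, policy
```

For example, in a two-state chain where action 0 stays put (rewards 1 and 2 in states 0 and 1) and action 1 switches states (reward 0), the iteration finds the policy that moves to state 1 and stays there, with average reward 2.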


Additional information

Manuscript received: October 2001/Final version received: February 2002

* The support of the PSF Organization under Grant No. 010/300/01-1 is deeply acknowledged.


About this article

Cite this article

Cavazos-Cadena, R. Value iteration and approximately optimal stationary policies in finite-state average Markov decision chains. Mathematical Methods of OR 56, 181–196 (2002). https://doi.org/10.1007/s001860200205
