Algorithms for aggregated limiting average Markov decision problems

Published in: Mathematical Methods of Operations Research

Abstract.

We consider a discrete-time Markov Decision Process (MDP) with finite state and action spaces under the average reward optimality criterion. The decomposition theory of Ross and Varadarajan [11] leads to a natural partition of the state space into strongly communicating classes and a set of states that are transient under all stationary strategies. An optimal pure strategy can then be obtained from an optimal strategy for a smaller, aggregated MDP, which yields an efficient method for solving large-scale MDPs. In this paper, we consider deterministic MDPs and construct a simple algorithm, based on graph theory, to determine an aggregated optimal policy. For MDPs without cycles, we propose an algorithm for computing aggregated optimal strategies; for the general case, we propose new improvement algorithms for computing them.


Additional information

Manuscript received: September 2000/Final version received: December 2000


Cite this article

Abbad, M., Daoui, C. Algorithms for aggregated limiting average Markov decision problems. Mathematical Methods of OR 53, 451–463 (2001). https://doi.org/10.1007/s001860100117
