Abstract
In this paper we present a new algorithm for multichain finite-state Markov decision processes that finds an average optimal policy by decomposing the state space into communicating classes and a transient class. For each communicating class a relatively optimal policy is computed, and these policies are then combined into an average optimal policy by the value iteration algorithm. The decomposition itself is carried out efficiently by means of a pattern matrix that determines the behaviour pattern of the decision process, so the proposed algorithm simplifies the structured algorithm of Leizarowitz (Math Oper Res 28:553–586, 2003). A numerical example illustrates the algorithm.
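To make the decomposition step concrete, the following sketch splits the state space of a finite MDP into closed communicating classes and a transient class. It uses a plain reachability computation (an edge s → t exists whenever some action moves s to t with positive probability) rather than the paper's pattern-matrix construction; the dictionary format for `transitions` and the function name are illustrative assumptions, not the authors' notation.

```python
def decompose_states(transitions):
    """Split the state space of a finite MDP into closed communicating
    classes and a transient class.

    transitions[s] maps each action to a dict {next_state: probability}.
    NOTE: this is a generic reachability-based sketch, not the
    pattern-matrix method of the paper.
    """
    states = list(transitions)
    # One-step edges: s -> t if some action reaches t with positive prob.
    edge = {s: {t for probs in transitions[s].values()
                for t, p in probs.items() if p > 0}
            for s in states}
    # Transitive closure: reach[s] = all states reachable from s.
    reach = {s: {s} | edge[s] for s in states}
    changed = True
    while changed:
        changed = False
        for s in states:
            new = reach[s].union(*(reach[t] for t in reach[s]))
            if new != reach[s]:
                reach[s] = new
                changed = True
    # s and t communicate iff each reaches the other.
    classes, assigned = [], set()
    for s in states:
        if s in assigned:
            continue
        comp = {t for t in states if t in reach[s] and s in reach[t]}
        classes.append(comp)
        assigned |= comp
    # A communicating class is closed if no action can leave it.
    closed = [c for c in classes
              if all(t in c for s in c for t in edge[s])]
    transient = set(states) - set().union(*closed) if closed else set(states)
    return closed, transient
```

For example, with states {1, 2, 3, 4} where state 1 can move to state 2, state 2 is absorbing, and states 3 and 4 cycle between each other, the function returns the closed classes {2} and {3, 4} and the transient class {1}. A relatively optimal policy would then be computed on each closed class before the value iteration step described in the abstract.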
References
Bather J (1973) Optimal decision procedures for finite Markov chains. II. Communicating systems. Adv Appl Probab 5:521–540
Bellman R (1957) Dynamic programming. Princeton University Press, Princeton, NJ
Denardo EV (1967) Contraction mappings in the theory underlying dynamic programming. SIAM Rev 9:165–177
Denardo EV (1982) Dynamic programming: models and applications. Prentice-Hall Inc., Englewood Cliffs, NJ
Federgruen A, Schweitzer PJ (1978) Discounted and undiscounted value-iteration in Markov decision problems: a survey. In: Dynamic programming and its applications. Proceedings of the conference, University of British Columbia, Vancouver, BC, 1977. Academic, New York, pp 23–52
Hordijk A, Kallenberg LCM (1979) Linear programming and Markov decision chains. Manage Sci 25(4):352–362
Hordijk A, Puterman ML (1987) On the convergence of policy iteration in finite state undiscounted Markov decision processes: the unichain case. Math Oper Res 12(1):163–176
Howard RA (1960) Dynamic programming and Markov processes. The Technology Press of MIT, Cambridge
Kemeny JG, Snell JL (1960) Finite Markov chains. In: The University series in undergraduate mathematics. D. Van Nostrand Co. Inc., Princeton-Toronto-London-New York
Leizarowitz A (2003) An algorithm to identify and compute average optimal policies in multichain Markov decision processes. Math Oper Res 28(3):553–586
Puterman ML (1994) Markov decision processes: discrete stochastic dynamic programming. Wiley, New York (A Wiley-Interscience Publication)
Schweitzer PJ (1971) Iterative solution of the functional equations of undiscounted Markov renewal programming. J Math Anal Appl 34:495–501
White DJ (1963) Dynamic programming, Markov chains, and the method of successive approximations. J Math Anal Appl 6:373–376
Cite this article
Iki, T., Horiguchi, M. & Kurano, M. A structured pattern matrix algorithm for multichain Markov decision processes. Math Meth Oper Res 66, 545–555 (2007). https://doi.org/10.1007/s00186-006-0138-5