Conditions for the uniqueness of optimal policies of discounted Markov decision processes

Cruz-Suárez, Daniel; Montes-de-Oca, Raúl; Salem-Silva, Francisco

doi:10.1007/s001860400372

Published: December 2004

Volume 60, pages 415–436, (2004)

Mathematical Methods of Operations Research

Daniel Cruz-Suárez¹,
Raúl Montes-de-Oca² &
Francisco Salem-Silva³

Abstract.

This paper presents three conditions, each of which guarantees the uniqueness of optimal policies of discounted Markov decision processes. The conditions impose hypotheses on the state space X, the action space A, the admissible action sets A(x), x∈X, the transition probability Q, and the cost function c. Two of the conditions rely mainly on convexity assumptions. The third requires no convexity; instead, it needs certain stochastic order relations in Q and requires the cost function c to attain its minimum with respect to the actions at exactly one action. The conditions are illustrated with several examples, including discrete models, the linear regulator problem, and a model of an inventory control system.
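
As a rough illustration of what uniqueness means in the discrete case, the sketch below runs value iteration on a small finite discounted-cost model and checks whether the minimizing action in the optimality equation is strict in every state. It does not implement the paper's three conditions, which are not reproduced here. The function names, the tolerance values, and the two-state numerical data are all hypothetical and chosen only for the example.

```python
import numpy as np

def value_iteration(c, Q, alpha, tol=1e-10, max_iter=10_000):
    """Approximate the optimal discounted cost of a finite MDP.

    c[x, a]    : one-stage cost of action a in state x
    Q[x, a, y] : probability of moving from x to y under action a
    alpha      : discount factor in (0, 1)
    Returns (V, G), where G[x, a] = c(x, a) + alpha * sum_y Q(y | x, a) V(y).
    """
    V = np.zeros(c.shape[0])
    for _ in range(max_iter):
        # Apply the dynamic programming operator once.
        V_new = (c + alpha * (Q @ V)).min(axis=1)
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break
    G = c + alpha * (Q @ V)
    return V, G

def unique_minimizers(G, gap=1e-8):
    """Return the minimizing action per state and whether it is strict.

    A state counts as having a unique minimizer when the second-smallest
    value in its row of G exceeds the smallest by more than `gap`.
    Assumes at least two actions per state.
    """
    policy = G.argmin(axis=1)
    sorted_G = np.sort(G, axis=1)
    is_unique = (sorted_G[:, 1] - sorted_G[:, 0]) > gap
    return policy, is_unique

# Hypothetical two-state, two-action model (illustrative data only).
c = np.array([[1.0, 2.0],
              [0.5, 3.0]])
Q = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.5, 0.5], [0.1, 0.9]]])
V, G = value_iteration(c, Q, alpha=0.9)
policy, is_unique = unique_minimizers(G)

# A degenerate model in which both actions are identical, so ties occur.
c_tie = np.ones((2, 2))
Q_tie = np.array([[[0.5, 0.5], [0.5, 0.5]],
                  [[0.5, 0.5], [0.5, 0.5]]])
_, G_tie = value_iteration(c_tie, Q_tie, alpha=0.9)
_, is_unique_tie = unique_minimizers(G_tie)
```

In the first model the minimizer is strict in both states, so the stationary optimal policy is unique. In the second, every action attains the minimum, so optimal policies are not unique. The `gap` threshold is a numerical convenience and not part of any formal criterion.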

Author information

Authors and Affiliations

División Académica de Ciencias Básicas, Universidad Juárez Autónoma de Tabasco, Apdo. Postal 5, Cunduacán, Tab. 86690, México
Daniel Cruz-Suárez
Departamento de Matemáticas, Universidad Autónoma Metropolitana-Iztapalapa, Av. San Rafael Atlixco 186, Col. Vicentina, México D.F, 09340, México
Raúl Montes-de-Oca
Facultad de Ciencias Físico Matemáticas, Benemérita Universidad Autónoma de Puebla, Av. San Claudio y Río Verde, Col. San Manuel, Ciudad Universitaria, Puebla, Pue. 09340, México
Francisco Salem-Silva

Corresponding author

Correspondence to Raúl Montes-de-Oca.

Additional information

Manuscript received: May 2003 / Final version received: January 2004

About this article

Cite this article

Cruz-Suárez, D., Montes-de-Oca, R. & Salem-Silva, F. Conditions for the uniqueness of optimal policies of discounted Markov decision processes. Math Meth Oper Res 60, 415–436 (2004). https://doi.org/10.1007/s001860400372

Issue Date: December 2004
DOI: https://doi.org/10.1007/s001860400372
