New Average Optimality Conditions for Semi-Markov Decision Processes in Borel Spaces

Wei, Qingda; Guo, Xianping

doi:10.1007/s10957-012-9986-8

Published: 26 January 2012

Journal of Optimization Theory and Applications, Volume 153, pages 709–732 (2012)

Abstract

This paper deals with semi-Markov decision processes under the average expected criterion. The state and action spaces are Borel spaces, and the cost/reward function is allowed to be unbounded from above and from below. We give another set of conditions, under which the existence of an optimal (deterministic) stationary policy is proven by a new technique of two average optimality inequalities. Our conditions are slightly weaker than those in the existing literature, and some new sufficient conditions for the verifications of our assumptions are imposed on the primitive data of the model. Finally, we illustrate our results with three examples.

Acknowledgements

The research is supported by NSFC, GDUPS, and GPK-LCS. The authors are greatly indebted to the anonymous referees for many valuable comments and suggestions that have improved the presentation.

Author information

Authors and Affiliations

School of Mathematics and Computational Science, Sun Yat-Sen University, Guangzhou, 510275, P.R. China
Qingda Wei & Xianping Guo

Corresponding author

Correspondence to Xianping Guo.

Additional information

Communicated by Weibo Gong.

About this article

Cite this article

Wei, Q., Guo, X. New Average Optimality Conditions for Semi-Markov Decision Processes in Borel Spaces. J Optim Theory Appl 153, 709–732 (2012). https://doi.org/10.1007/s10957-012-9986-8

Received: 19 August 2011
Accepted: 04 January 2012
Published: 26 January 2012
Issue Date: June 2012
DOI: https://doi.org/10.1007/s10957-012-9986-8
