Abstract
This paper deals with semi-Markov decision processes under the average expected criterion. The state and action spaces are Borel spaces, and the cost/reward function is allowed to be unbounded from above and from below. We give another set of conditions under which the existence of an optimal (deterministic) stationary policy is proven by a new technique based on two average optimality inequalities. Our conditions are slightly weaker than those in the existing literature, and we provide new sufficient conditions, imposed on the primitive data of the model, for verifying our assumptions. Finally, we illustrate our results with three examples.
Acknowledgements
The research is supported by NSFC, GDUPS, and GPK-LCS. The authors are greatly indebted to the anonymous referees for many valuable comments and suggestions that have improved the presentation.
Additional information
Communicated by Weibo Gong.
Cite this article
Wei, Q., Guo, X. New Average Optimality Conditions for Semi-Markov Decision Processes in Borel Spaces. J Optim Theory Appl 153, 709–732 (2012). https://doi.org/10.1007/s10957-012-9986-8
DOI: https://doi.org/10.1007/s10957-012-9986-8