
Sensitivity of constrained Markov decision processes

Published in: Annals of Operations Research

Abstract

We consider the optimization of finite-state, finite-action Markov decision processes under constraints. Costs and constraints are of the discounted or average type, and possibly finite-horizon. We investigate the sensitivity of the optimal cost and optimal policy to changes in various parameters. We relate several optimization problems to a generic linear program, through which we investigate sensitivity issues. We establish conditions for the continuity of the optimal value in the discount factor. In particular, the optimal value and optimal policy for the expected average cost are obtained as limits of the discounted case as the discount factor goes to one. This generalizes a well-known result for the unconstrained case. We also establish the continuity in the discount factor for certain non-stationary policies. We then discuss the sensitivity of optimal policies and optimal values to small changes in the transition matrix and in the instantaneous cost functions. The importance of the last two results is related to the performance of adaptive policies for constrained MDPs under various cost criteria [3,5]. Finally, we establish the convergence of the optimal value for the discounted constrained finite-horizon problem to the optimal value of the corresponding infinite-horizon problem.
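The "generic linear program" the abstract refers to can be illustrated by the standard occupation-measure LP for a constrained discounted MDP: minimize the expected discounted cost over occupation measures x(s,a) subject to the flow-balance equations and a bound on a second (constraint) cost. The sketch below is not the authors' exact formulation, and all instance data (transition matrix, costs, bound V) are hypothetical numbers chosen for illustration.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical instance: 2 states, 2 actions.
nS, nA = 2, 2
beta = 0.9                       # discount factor
alpha = np.array([0.5, 0.5])     # initial state distribution

# P[s, a, s2] = probability of moving from s to s2 under action a
P = np.array([[[0.8, 0.2], [0.3, 0.7]],
              [[0.5, 0.5], [0.1, 0.9]]])
c = np.array([[1.0, 2.0], [0.5, 3.0]])   # cost to be minimized
d = np.array([[0.0, 1.0], [1.0, 0.0]])   # constraint cost
V = 2.0                                   # constraint bound

# Variables: occupation measure x(s, a), flattened to a vector of length nS*nA.
# Flow balance: for each state s,
#   sum_a x(s,a) - beta * sum_{s2,a} P[s2,a,s] * x(s2,a) = alpha[s]
A_eq = np.zeros((nS, nS * nA))
for s in range(nS):
    for s2 in range(nS):
        for a in range(nA):
            j = s2 * nA + a
            if s2 == s:
                A_eq[s, j] += 1.0
            A_eq[s, j] -= beta * P[s2, a, s]
b_eq = alpha

# Constraint cost must stay below the bound: sum_{s,a} d(s,a) x(s,a) <= V
A_ub = d.reshape(1, -1)
b_ub = [V]

res = linprog(c.reshape(-1), A_ub=A_ub, b_ub=b_ub,
              A_eq=A_eq, b_eq=b_eq,
              bounds=[(0, None)] * (nS * nA))
x = res.x.reshape(nS, nA)

# A (possibly randomized) optimal stationary policy: normalize x row-wise.
policy = x / x.sum(axis=1, keepdims=True)
print("optimal discounted cost:", res.fun)
print("policy:\n", policy)
```

The flow-balance rows force the total occupation mass to equal 1/(1-beta); when the constraint is active, the optimal x typically randomizes in at least one state, which is why constrained MDPs, unlike unconstrained ones, generally require randomized stationary policies.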


References

  1. E. Altman and A. Shwartz, Non-stationary policies for controlled Markov chains, EE Pub. 633, Technion (June 1987).

  2. E. Altman and A. Shwartz, Markov decision problems and state-action frequencies, EE Pub. 692, Technion, November 1988; SIAM J. Control Optim. 29, No. 4 (1991).

  3. E. Altman and A. Shwartz, Adaptive control of constrained Markov chains, EE Pub. 717, Technion, March 1989; IEEE Trans. Autom. Control AC-36 (1991) 454–462.

  4. E. Altman and A. Shwartz, Adaptive control of constrained Markov chains, Trans. 14th Symp. on Operations Research, Ulm, Germany (1989).

  5. E. Altman and A. Shwartz, Adaptive control of constrained Markov chains: Criteria and policies, Ann. Oper. Res. 28 (1991) 101–134.

  6. V.S. Borkar, A convex analytic approach to Markov decision processes, Prob. Theor. Rel. Fields 78 (1988) 583–602.

  7. V.S. Borkar, Controlled Markov chains with constraints, preprint (revised) (1989).

  8. G.B. Dantzig, J. Folkman and N. Shapiro, On the continuity of the minimum set of a continuous function, J. Math. Anal. Appl. 17 (1967) 519–548.

  9. R. Dekker, Denumerable Markov decision chains: Optimal policies for small interest rates, Thesis, Institute for Applied Mathematics and Computer Science, University of Leiden (1984).

  10. C. Derman, Finite State Markovian Decision Processes (Academic Press, 1970).

  11. C. Derman and M. Klein, Some remarks on finite horizon Markovian decision models, Oper. Res. 13 (1965) 272–278.

  12. W.-R. Heilmann, Solving stochastic dynamic programming problems by linear programming — an annotated bibliography, Zeit. Oper. Res. 22 (1978) 43–53.

  13. O. Hernández-Lerma, Adaptive Control of Markov Processes (Springer, 1989).

  14. A. Hordijk and L.C.M. Kallenberg, Constrained undiscounted stochastic dynamic programming, Math. Oper. Res. 9, No. 2 (1984) 276–289.

  15. L.C.M. Kallenberg, Linear Programming and Finite Markovian Control Problems, Mathematical Centre Tracts 148, Amsterdam (1983).

  16. A.S. Manne, Linear programming and sequential decisions, Manag. Sci. 6 (1960) 259–267.

  17. P. Nain and K.W. Ross, Optimal priority assignment with hard constraint, IEEE Trans. Autom. Control AC-31, No. 10 (1986) 883–888.

  18. K.W. Ross, Randomized and past-dependent policies for Markov decision processes with multiple constraints, Oper. Res. 37, No. 3 (1989).

  19. K.W. Ross and B. Chen, Optimal scheduling of interactive and non-interactive traffic in telecommunication systems, IEEE Trans. Autom. Control AC-33, No. 3 (1988) 261–267.

  20. M. Schäl, Estimation and control in discounted dynamic programming, Stochastics 20 (1987) 51–71.

  21. A. Shwartz and A.M. Makowski, An optimal adaptive scheme for two competing queues with constraints, in: Analysis and Optimization of Systems, ed. A. Bensoussan and J.L. Lions, Lecture Notes in Control and Information Sciences (Springer, 1986), pp. 515–532.


Cite this article

Altman, E., Shwartz, A. Sensitivity of constrained Markov decision processes. Ann Oper Res 32, 1–22 (1991). https://doi.org/10.1007/BF02204825
