Optimality of monotonic policies for two-action Markovian decision processes, with applications to control of queues with delayed information

Published in: Queueing Systems 21, 267–291 (1995)

Abstract

We consider a discrete-time Markov decision process with a partially ordered state space and two feasible control actions in each state. Our goal is to find general conditions, which are satisfied in a broad class of applications to control of queues, under which an optimal control policy is monotonic. An advantage of our approach is that it easily extends to problems with both information and action delays, which are common in applications to high-speed communication networks, among others. We assume that the transition probabilities are stochastically monotone and that the one-stage reward is submodular. We further assume that transitions from different states are coupled, in the sense that the state after a transition is distributed as a deterministic function of the current state and two random variables, one of which is controllable and the other uncontrollable. In addition, we make a monotonicity assumption about the sample-path effect of a pairwise switch of the actions in consecutive stages. Using induction on the horizon length, we demonstrate that optimal policies for the finite- and infinite-horizon discounted problems are monotonic. We apply these results to a single queueing facility with control of arrivals and/or services, under very general conditions. In this case, our results imply that an optimal control policy has threshold form. Finally, we show how monotonicity of an optimal policy extends in a natural way to problems with information and/or action delay, including delays of more than one time unit. Specifically, we show that, if a problem without delay satisfies our sufficient conditions for monotonicity of an optimal policy, then the same problem with information and/or action delay also has monotonic (e.g., threshold) optimal policies.
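
For intuition about the threshold result, the following is a minimal numerical sketch, not the authors' general model or proof technique: finite-horizon value iteration for a hypothetical single discrete-time queue with admission control, Bernoulli arrivals and service completions, a per-admission reward, and a linear holding cost. All parameter values and the simple transition model are illustrative assumptions; under them the computed admission policy admits exactly below a threshold on the queue length, consistent with the monotone (threshold) structure described above.

```python
# A minimal numerical sketch (illustrative assumptions, not the paper's
# general model): finite-horizon value iteration for a single discrete-time
# queue with admission control, showing a threshold-form optimal policy.
# Assumed primitives: Bernoulli arrivals (prob. p), Bernoulli service
# completions (prob. q), reward R per admitted arrival, linear holding
# cost c per customer per period, discount factor beta, buffer size N.

import numpy as np

p, q = 0.4, 0.5              # arrival / service-completion probabilities
R, c, beta = 5.0, 1.0, 0.95  # admission reward, holding cost, discount
N, horizon = 20, 60          # buffer size, number of stages

def stage_value(V, x, admit):
    """Expected one-stage reward plus discounted continuation value when
    the queue length is x and the admission decision is `admit`."""
    reward = -c * x + (p * R if admit else 0.0)  # reward earned only if an arrival occurs
    value = 0.0
    for arrives, pa in ((1, p), (0, 1.0 - p)):
        for serves, ps in ((1, q), (0, 1.0 - q)):
            nxt = min(max(x + (arrives if admit else 0) - serves, 0), N)
            value += pa * ps * V[nxt]
    return reward + beta * value

V = np.zeros(N + 1)                  # terminal value function
policy = np.zeros(N + 1, dtype=int)
for _ in range(horizon):
    newV = np.empty_like(V)
    for x in range(N + 1):
        v_reject = stage_value(V, x, admit=False)
        # admission is infeasible when the buffer is full
        v_admit = stage_value(V, x, admit=True) if x < N else -np.inf
        policy[x] = int(v_admit > v_reject)
        newV[x] = max(v_admit, v_reject)
    V = newV

# Under these assumptions the computed policy is monotone in the queue
# length: admit below some threshold, reject at and above it.
print("admit decisions by queue length:", policy.tolist())
print("apparent threshold:", int(policy.sum()))
```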

Cite this article

Altman, E., Stidham, S. Optimality of monotonic policies for two-action Markovian decision processes, with applications to control of queues with delayed information. Queueing Syst 21, 267–291 (1995). https://doi.org/10.1007/BF01149165
