Abstract
Howard’s algorithm is a fifty-year old generally applicable algorithm for sequential decision making in face of uncertainty. It is routinely used in practice in numerous application areas that are so important that they usually go by their acronyms, e.g., OR, AI, and CAV. While Howard’s algorithm is generally recognized as fast in practice, until recently, its worst case time complexity was poorly understood. However, a surge of results since 2009 has led us to a much more satisfactory understanding of the worst case time complexity of the algorithm in the various settings in which it applies. In this talk, we shall survey these recent results and the open problems that remains.
This is a preview of subscription content, log in via an institution.
Buying options
Tax calculation will be finalised at checkout
Purchases are for personal use only
Learn about institutional subscriptionsPreview
Unable to display preview. Download preview PDF.
References
Blackwell, D.: Discrete dynamic programming. Ann. Math. Stat. 33, 719–726 (1962)
Chatterjee, K., de Alfaro, L., Henzinger, T.A.: Strategy improvement for concurrent reachability games. In: Third International Conference on the Quantitative Evaluation of Systems, QEST 2006, pp. 291–300. IEEE Computer Society (2006)
Condon, A.: The complexity of stochastic games. Information and Computation 96, 203–224 (1992)
Etessami, K., Yannakakis, M.: Recursive Concurrent Stochastic Games. In: Bugliesi, M., Preneel, B., Sassone, V., Wegener, I. (eds.) ICALP 2006, Part II. LNCS, vol. 4052, pp. 324–335. Springer, Heidelberg (2006)
Fearnley, J.: Exponential Lower Bounds for Policy Iteration. In: Abramsky, S., Gavoille, C., Kirchner, C., Meyer auf der Heide, F., Spirakis, P.G. (eds.) ICALP 2010, Part II. LNCS, vol. 6199, pp. 551–562. Springer, Heidelberg (2010)
Friedmann, O.: An exponential lower bound for the parity game strategy improvement algorithm as we know it. In: Proceedings of the 24th Annual IEEE Symposium on Logic in Computer Science, LICS 2009, Los Angeles, CA, USA, August 11-14, pp. 145–156 (2009)
Hansen, K.A., Ibsen-Jensen, R., Miltersen, P.B.: The Complexity of Solving Reachability Games Using Value and Strategy Iteration. In: Kulikov, A., Vereshchagin, N. (eds.) CSR 2011. LNCS, vol. 6651, pp. 77–90. Springer, Heidelberg (2011)
Hansen, K.A., Koucký, M., Lauritzen, N., Miltersen, P.B., Tsigaridas, E.P.: Exact algorithms for solving stochastic games: extended abstract. In: Proceedings of the 43rd ACM Symposium on Theory of Computing, STOC 2011, San Jose, CA, USA, June 6-8, pp. 205–214. ACM (2011)
Hansen, K.A., Koucky, M., Miltersen, P.B.: Winning concurrent reachability games requires doubly exponential patience. In: 24th Annual IEEE Symposium on Logic in Computer Science (LICS 2009), pp. 332–341. IEEE (2009)
Hansen, T.D., Miltersen, P.B., Zwick, U.: Strategy iteration is strongly polynomial for 2-player turn-based stochastic games with a constant discount factor. In: Innovations in Computer Science - ICS 2010, January 7-9, pp. 253–263. Tsinghua University Press, Beijing (2011)
Hansen, T.D., Zwick, U.: Lower Bounds for Howard’s Algorithm for Finding Minimum Mean-Cost Cycles. In: Cheong, O., Chwa, K.-Y., Park, K. (eds.) ISAAC 2010, Part I. LNCS, vol. 6506, pp. 415–426. Springer, Heidelberg (2010)
Hoffman, A.J., Karp, R.M.: On nonterminating stochastic games. Management Science, 359–370 (1966)
Howard, R.A.: Dynamic Programming and Markov Processes. MIT Press, Cambridge (1960)
Ibsen-Jensen, R., Miltersen, P.B.: Solving Simple Stochastic Games with Few Coin Toss Positions. In: Epstein, L., Ferragina, P. (eds.) ESA 2012. LNCS, vol. 7501, pp. 636–647. Springer, Heidelberg (2012)
Puterman, M.L.: Markov Decision Processes: Discrete Stochastic Dynamic Programming. John Wiley & Sons, Inc., New York (1994)
Rao, S.S., Chandrasekaran, R., Nair, K.P.K.: Algorithms for discounted games. Journal of Optimization Theory and Applications, 627–637 (1973)
Vöge, J., Jurdziński, M.: A Discrete Strategy Improvement Algorithm for Solving Parity Games. In: Emerson, E.A., Sistla, A.P. (eds.) CAV 2000. LNCS, vol. 1855, pp. 202–215. Springer, Heidelberg (2000)
Ye,Y.: The simplex and policy-iteration methods are strongly polynomial for the markov decision problem with a fixed discount rate (2010), www.stanford.edu/~yyye/SimplexMDP4.pdf
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Miltersen, P.B. (2013). Recent Results on Howard’s Algorithm. In: Kučera, A., Henzinger, T.A., Nešetřil, J., Vojnar, T., Antoš, D. (eds) Mathematical and Engineering Methods in Computer Science. MEMICS 2012. Lecture Notes in Computer Science, vol 7721. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-36046-6_6
Download citation
DOI: https://doi.org/10.1007/978-3-642-36046-6_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-36044-2
Online ISBN: 978-3-642-36046-6
eBook Packages: Computer ScienceComputer Science (R0)