Lower Bounds for Howard’s Algorithm for Finding Minimum Mean-Cost Cycles

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 6506)

Abstract

Howard’s policy iteration algorithm is one of the most widely used algorithms for finding optimal policies for controlling Markov Decision Processes (MDPs). When applied to weighted directed graphs, which may be viewed as Deterministic MDPs (DMDPs), Howard’s algorithm can be used to find Minimum Mean-Cost cycles (MMCC). Experimental studies suggest that Howard’s algorithm works extremely well in this context. The theoretical complexity of Howard’s algorithm for finding MMCCs is a mystery. No polynomial time bound is known on its running time. Prior to this work, there were only linear lower bounds on the number of iterations performed by Howard’s algorithm. We provide the first weighted graphs on which Howard’s algorithm performs Ω(n²) iterations, where n is the number of vertices in the graph.
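To make the object of study concrete, the following is a minimal sketch of Howard's policy iteration on a weighted directed graph. It is not taken from the paper: it assumes one common formulation, in which every vertex keeps one outgoing edge (the policy), each vertex is valued by the mean cost of the cycle its policy path reaches (the gain) and a potential relative to the lowest-numbered vertex of that cycle, and all vertices switch simultaneously to any edge that is strictly better in lexicographic (gain, potential) order. The names howard_min_mean_cycle and evaluate, the edge-list input format, and the choice of the cheapest edge as initial policy are illustrative assumptions. Every vertex is assumed to have at least one outgoing edge, and exact rational arithmetic is used so that no tolerance is needed.

```python
from fractions import Fraction


def evaluate(n, policy):
    """Return (gain, pot) for a policy, where policy[u] = (successor, cost)."""
    gain = [None] * n
    pot = [None] * n
    for s in range(n):
        if gain[s] is not None:
            continue
        # Follow the policy from s until an evaluated vertex or a new cycle.
        path, pos = [], {}
        u = s
        while gain[u] is None and u not in pos:
            pos[u] = len(path)
            path.append(u)
            u = policy[u][0]
        if gain[u] is None:
            # A new cycle was closed; rotate it to start at its smallest vertex.
            cycle = path[pos[u]:]
            i = cycle.index(min(cycle))
            cycle = cycle[i:] + cycle[:i]
            g = sum(policy[w][1] for w in cycle) / len(cycle)
            gain[cycle[0]], pot[cycle[0]] = g, Fraction(0)
            for w in reversed(cycle[1:]):
                nxt, c = policy[w]
                gain[w], pot[w] = g, c - g + pot[nxt]
            path = path[:pos[u]]
        # Vertices on the tail leading into the cycle inherit its gain.
        for w in reversed(path):
            nxt, c = policy[w]
            gain[w] = gain[nxt]
            pot[w] = c - gain[w] + pot[nxt]
    return gain, pot


def howard_min_mean_cycle(n, edges):
    """edges: list of (u, v, cost). Returns (minimum mean cost, iterations)."""
    out = [[] for _ in range(n)]
    for u, v, c in edges:
        out[u].append((v, Fraction(c)))
    # Assumed initial policy: the cheapest outgoing edge of each vertex.
    policy = [min(out[u], key=lambda e: e[1]) for u in range(n)]
    iterations = 0
    while True:
        iterations += 1
        gain, pot = evaluate(n, policy)
        new_policy, changed = list(policy), False
        for u in range(n):
            v0, c0 = policy[u]
            best = (gain[v0], c0 - gain[v0] + pot[v0])
            for v, c in out[u]:
                cand = (gain[v], c - gain[v] + pot[v])
                if cand < best:  # strict lexicographic improvement
                    best, new_policy[u], changed = cand, (v, c), True
        if not changed:
            return min(gain), iterations
        policy = new_policy


# Cycle 0->1->0 has mean 5, cycle 0->1->2->0 has mean 4.
mean, iters = howard_min_mean_cycle(3, [(0, 1, 10), (1, 0, 0), (1, 2, 1), (2, 0, 1)])
print(mean, iters)  # prints "4 2"
```

The iteration counter returned here is the quantity the lower bound concerns: each pass of the outer loop is one policy evaluation followed by one round of simultaneous switches.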




Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Hansen, T.D., Zwick, U. (2010). Lower Bounds for Howard’s Algorithm for Finding Minimum Mean-Cost Cycles. In: Cheong, O., Chwa, KY., Park, K. (eds) Algorithms and Computation. ISAAC 2010. Lecture Notes in Computer Science, vol 6506. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-17517-6_37

  • DOI: https://doi.org/10.1007/978-3-642-17517-6_37

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-17516-9

  • Online ISBN: 978-3-642-17517-6

  • eBook Packages: Computer Science, Computer Science (R0)
