Parameter-Free Sampled Fictitious Play for Solving Deterministic Dynamic Programming Problems

Journal of Optimization Theory and Applications

Abstract

In this paper, we present a parameter-free variation of the Sampled Fictitious Play algorithm that facilitates fast solution of deterministic dynamic programming problems. Its random tie-breaking procedure imparts a natural randomness to the algorithm, which prevents it from “getting stuck” at a locally optimal solution and allows the discovery of an optimal path in a finite number of iterations. Furthermore, we illustrate through an application to maritime navigation that, in practice, the parameter-free Sampled Fictitious Play algorithm finds a high-quality solution after only a few iterations, in contrast with traditional methods.
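The idea summarized in the abstract can be illustrated with a rough sketch. The code below is not the authors' algorithm. It is a minimal illustration under stated assumptions: the problem is a layered shortest-path instance in which `cost[t][s][a]` (a hypothetical layout) is the cost of moving from state `s` at stage `t` to state `a` at stage `t + 1`; each (stage, state) pair acts as a player; one past policy is drawn uniformly at random from the history at each iteration; and every player then picks a cost-minimizing reply against it, with ties broken uniformly at random. All function names are made up for this example, and `iterations` is only a stopping budget, not a tuning parameter of the sampling scheme.

```python
import random


def policy_values(cost, policy):
    """Cost-to-go V[t][s] of following `policy` from state s at stage t."""
    T, n = len(cost), len(cost[0])
    V = [[0.0] * n for _ in range(T + 1)]  # terminal cost assumed to be zero
    for t in range(T - 1, -1, -1):
        for s in range(n):
            a = policy[t][s]
            V[t][s] = cost[t][s][a] + V[t + 1][a]
    return V


def best_reply(cost, V, rng):
    """Each (stage, state) player minimizes its cost against V; ties are random."""
    T, n = len(cost), len(cost[0])
    reply = []
    for t in range(T):
        row = []
        for s in range(n):
            q = [cost[t][s][a] + V[t + 1][a] for a in range(n)]
            m = min(q)
            row.append(rng.choice([a for a in range(n) if q[a] == m]))
        reply.append(row)
    return reply


def sampled_fictitious_play(cost, start=0, iterations=50, seed=0):
    """Return (best cost found, corresponding path) from `start` at stage 0."""
    rng = random.Random(seed)
    T, n = len(cost), len(cost[0])
    # Assumption: the history starts with one arbitrary (random) policy.
    history = [[[rng.randrange(n) for _ in range(n)] for _ in range(T)]]
    best_policy = history[0]
    best_cost = policy_values(cost, best_policy)[0][start]
    for _ in range(iterations):
        sampled = rng.choice(history)            # one sample from the history
        V = policy_values(cost, sampled)
        reply = best_reply(cost, V, rng)         # joint best reply to the sample
        history.append(reply)
        c = policy_values(cost, reply)[0][start]
        if c < best_cost:                        # keep the incumbent solution
            best_cost, best_policy = c, reply
    path, s = [start], start
    for t in range(T):
        s = best_policy[t][s]
        path.append(s)
    return best_cost, path


def backward_induction(cost, start=0):
    """Exact optimal cost, used here only as a reference value."""
    T, n = len(cost), len(cost[0])
    V = [0.0] * n
    for t in range(T - 1, -1, -1):
        V = [min(cost[t][s][a] + V[a] for a in range(n)) for s in range(n)]
    return V[start]


if __name__ == "__main__":
    gen = random.Random(1)
    demo = [[[gen.randint(1, 9) for _ in range(4)] for _ in range(4)] for _ in range(5)]
    print(sampled_fictitious_play(demo), backward_induction(demo))
```

In this sketch the random tie-breaking lives entirely in `best_reply`, and the incumbent cost can never fall below the value returned by `backward_induction`; how quickly the two meet depends on the instance and on the random draws.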



Acknowledgments

The authors would like to thank Okey Nwogu and Fernando Tavares for their assistance with implementation and numerical results. This work was supported in part by the Office of Naval Research through the Multidisciplinary University Research Initiative (MURI) Optimum Vessel Performance in Evolving Nonlinear Wave Fields Grant (N00014-05-1-0537).

Author information

Corresponding author

Correspondence to Irina S. Dolinskaya.

Additional information

Communicated by Kyriakos G. Vamvoudakis.

About this article

Cite this article

Dolinskaya, I.S., Epelman, M.A., Şişikoğlu Sir, E. et al. Parameter-Free Sampled Fictitious Play for Solving Deterministic Dynamic Programming Problems. J Optim Theory Appl 169, 631–655 (2016). https://doi.org/10.1007/s10957-015-0798-5

  • DOI: https://doi.org/10.1007/s10957-015-0798-5
