Parameter-Free Sampled Fictitious Play for Solving Deterministic Dynamic Programming Problems

Dolinskaya, Irina S.; Epelman, Marina A.; Şişikoğlu Sir, Esra; Smith, Robert L.

doi:10.1007/s10957-015-0798-5

Published: 25 August 2015

Volume 169, pages 631–655 (2016)

Journal of Optimization Theory and Applications



Abstract

In this paper, we present a parameter-free variation of the Sampled Fictitious Play algorithm that facilitates fast solution of deterministic dynamic programming problems. Its random tie-breaking procedure imparts a natural randomness to the algorithm which prevents it from “getting stuck” at a local optimal solution and allows the discovery of an optimal path in a finite number of iterations. Furthermore, we illustrate through an application to maritime navigation that, in practice, a parameter-free Sampled Fictitious Play algorithm finds a high-quality solution after only a few iterations, in contrast with traditional methods.
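The idea of best replies with random tie-breaking can be illustrated on a small deterministic shortest-path problem. The sketch below is not the authors' algorithm, whose details are not reproduced on this page. It is a simplified illustration under stated assumptions: each state acts as a player, one past joint policy is sampled uniformly from the history at every iteration, every state best-responds to that sample, and ties are broken uniformly at random. The function name `sampled_best_reply_search`, the graph encoding `succ`, and the example arc costs are all hypothetical.

```python
import random


def sampled_best_reply_search(succ, source, terminal, iterations=50, seed=0):
    """Illustrative sampled best-reply search on an acyclic graph.

    succ[s] maps each successor of state s to the arc cost (assumed encoding).
    Every path is assumed to end at `terminal`, which has no entry in succ.
    """
    rng = random.Random(seed)
    states = list(succ)

    def cost_to_go(policy, s, memo):
        # Cost of following `policy` from s until the terminal state.
        if s == terminal:
            return 0.0
        if s not in memo:
            nxt = policy[s]
            memo[s] = succ[s][nxt] + cost_to_go(policy, nxt, memo)
        return memo[s]

    def path_from(policy):
        path = [source]
        while path[-1] != terminal:
            path.append(policy[path[-1]])
        return path

    # Initial joint play: every state picks a successor uniformly at random.
    policy = {s: rng.choice(sorted(succ[s])) for s in states}
    history = [policy]
    best_cost = cost_to_go(policy, source, {})
    best_path = path_from(policy)

    for _ in range(iterations):
        sampled = rng.choice(history)  # a single sample from past play
        memo = {}
        reply = {}
        for s in states:
            # Value of each action if all other states follow the sample.
            q = {n: c + cost_to_go(sampled, n, memo) for n, c in succ[s].items()}
            q_min = min(q.values())
            ties = sorted(n for n, v in q.items() if v == q_min)
            reply[s] = rng.choice(ties)  # random tie-breaking
        history.append(reply)
        cost = cost_to_go(reply, source, {})
        if cost < best_cost:
            best_cost, best_path = cost, path_from(reply)
    return best_cost, best_path


# Hypothetical example: two distinct paths share the minimum cost.
example = {
    "s": {"a": 1, "b": 4},
    "a": {"c": 5, "d": 2},
    "b": {"c": 1, "d": 1},
    "c": {"t": 1},
    "d": {"t": 3},
}
cost, path = sampled_best_reply_search(example, "s", "t")
print(cost, path)
```

In this example the paths s, a, d, t and s, b, c, t tie at the minimum cost, so the random tie-break at the source decides which one is reported. No claim is made here about the convergence guarantees stated in the abstract; those apply to the authors' method, not to this sketch.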


Acknowledgments

The authors would like to thank Okey Nwogu and Fernando Tavares for their assistance with implementation and numerical results. This work was supported in part by the Office of Naval Research through the Multidisciplinary University Research Initiative (MURI) Optimum Vessel Performance in Evolving Nonlinear Wave Fields Grant (N00014-05-1-0537).

Author information

Authors and Affiliations

Department of Industrial Engineering and Management Sciences, Northwestern University, Evanston, IL, 60208, USA
Irina S. Dolinskaya
Department of Industrial and Operations Engineering, University of Michigan, Ann Arbor, MI, 48109, USA
Marina A. Epelman & Robert L. Smith
Office of Access Management, Mayo Clinic, Rochester, MN, 55905, USA
Esra Şişikoğlu Sir

Corresponding author

Correspondence to Irina S. Dolinskaya.

Additional information

Communicated by Kyriakos G. Vamvoudakis.

About this article

Cite this article

Dolinskaya, I.S., Epelman, M.A., Şişikoğlu Sir, E. et al. Parameter-Free Sampled Fictitious Play for Solving Deterministic Dynamic Programming Problems. J Optim Theory Appl 169, 631–655 (2016). https://doi.org/10.1007/s10957-015-0798-5

Received: 02 February 2015
Accepted: 14 August 2015
Published: 25 August 2015
Issue Date: May 2016
DOI: https://doi.org/10.1007/s10957-015-0798-5
