Optimal Control of Point-to-Point Navigation in Turbulent Time Dependent Flows Using Reinforcement Learning

  • Conference paper
AIxIA 2020 – Advances in Artificial Intelligence (AIxIA 2020)

Abstract

We present theoretical and numerical results on the problem of finding the path that minimizes the time needed to navigate between two given points in a complex fluid under realistic navigation constraints. We contrast deterministic Optimal Navigation (ON) control with stochastic policies obtained by Reinforcement Learning (RL) algorithms. We show that Actor-Critic RL algorithms are able to find quasi-optimal solutions in the presence of either time-independent or chaotically evolving flow configurations. For our application, ON solutions develop unstable behavior within the typical duration of the navigation process and are therefore not useful in practice. We first explore navigation in a turbulent flow using a constant propulsion speed. Based on a discretized phase space, the propulsion direction is adjusted with the aim of minimizing the time spent to reach the target. We then explore a case where additional control is obtained by allowing the engine to power off. Exploiting advection by the underlying flow allows the target to be reached with less energy consumption. In this case, we optimize a linear combination of the total navigation time and the total time the engine is switched off. Our approach can be generalized to other setups, for example, navigation under imperfect environmental forecasts or with different models for the moving vessel.
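
To make the setup in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes the standard Zermelo-type kinematics dx/dt = u(x, t) + V_s n(theta), where u is the flow velocity, V_s the constant propulsion speed, and n(theta) the unit steering direction. The names toy_flow, euler_step, and combined_cost, the solid-body vortex used as the flow, the explicit Euler integrator, and the weights a and b are all hypothetical choices made for illustration; the abstract specifies none of them.

```python
import math


def toy_flow(x, y, t):
    """Hypothetical stand-in for the flow field: a steady solid-body vortex."""
    return -y, x


def euler_step(x, y, t, theta, engine_on, v_s=1.0, dt=0.01, flow=toy_flow):
    """Advance the vessel by one explicit Euler step.

    Assumed kinematics: velocity = flow velocity + thrust * (cos theta, sin theta),
    where thrust is v_s when the engine is on and 0 when it is off
    (pure advection by the flow).
    """
    ux, uy = flow(x, y, t)
    thrust = v_s if engine_on else 0.0
    x_new = x + dt * (ux + thrust * math.cos(theta))
    y_new = y + dt * (uy + thrust * math.sin(theta))
    return x_new, y_new, t + dt


def combined_cost(total_time, engine_off_time, a=1.0, b=0.5):
    """Linear combination of total navigation time and engine-off time.

    The weights a and b (including their signs) are illustrative assumptions.
    """
    return a * total_time + b * engine_off_time
```

In an RL setting, a policy would choose theta (and, in the second scenario, engine_on) at each step from a discretized description of the vessel state, and a quantity like combined_cost would enter the reward; how the authors discretize the state and shape the reward is not stated in the abstract.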

Acknowledgments

This project has received partial funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No. 882340). K.G. acknowledges funding from the Knut and Alice Wallenberg Foundation, Grant No. KAW 2014.0048, and Vetenskapsrådet, Grant No. 2018-03974. F.B. acknowledges funding from the European Research Council under the European Union’s Horizon 2020 Framework Programme (No. FP/2014–2020) ERC Grant Agreement No. 739964 (COPMAT).

Author information

Corresponding author

Correspondence to Luca Biferale.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Buzzicotti, M., Biferale, L., Bonaccorso, F., Clark di Leoni, P., Gustavsson, K. (2021). Optimal Control of Point-to-Point Navigation in Turbulent Time Dependent Flows Using Reinforcement Learning. In: Baldoni, M., Bandini, S. (eds) AIxIA 2020 – Advances in Artificial Intelligence. AIxIA 2020. Lecture Notes in Computer Science, vol 12414. Springer, Cham. https://doi.org/10.1007/978-3-030-77091-4_14

  • DOI: https://doi.org/10.1007/978-3-030-77091-4_14

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-77090-7

  • Online ISBN: 978-3-030-77091-4

  • eBook Packages: Computer Science, Computer Science (R0)