Optimal Control of Point-to-Point Navigation in Turbulent Time Dependent Flows Using Reinforcement Learning

Buzzicotti, Michele; Biferale, Luca; Bonaccorso, Fabio; Clark di Leoni, Patricio; Gustavsson, Kristian

doi:10.1007/978-3-030-77091-4_14

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12414)

Included in the following conference series:

International Conference of the Italian Association for Artificial Intelligence

Abstract

We present theoretical and numerical results on the problem of finding the path that minimizes the time to navigate between two given points in a complex fluid under realistic navigation constraints. We contrast deterministic Optimal Navigation (ON) control with stochastic policies obtained by Reinforcement Learning (RL) algorithms. We show that Actor-Critic RL algorithms are able to find quasi-optimal solutions in the presence of either time-independent or chaotically evolving flow configurations. For our application, ON solutions develop unstable behavior within the typical duration of the navigation process, and are therefore not useful in practice. We first explore navigation in a turbulent flow using a constant propulsion speed. Based on a discretized phase space, the propulsion direction is adjusted with the aim of minimizing the time needed to reach the target. Further, we explore a case where additional control is obtained by allowing the engine to power off. Exploiting advection by the underlying flow allows the target to be reached with less energy consumption. In this case, we optimize a linear combination of the total navigation time and the total time the engine is switched on. Our approach can be generalized to other setups, for example, navigation under imperfect environmental forecasts or with different models for the moving vessel.
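
To make the setup in the abstract concrete, here is a minimal sketch of the navigation problem. It is not the authors' code and does not implement RL or ON control. Everything specific in it is an assumption for illustration: the steady shear flow `flow`, the naive "point at the target" heading policy, the optional `engine_on` switch, and the weight `lam` in a cost that adds engine usage time to the total navigation time (penalizing engine-on time is assumed here because it matches the stated goal of lower energy consumption).

```python
import math

def flow(x, y, t):
    # Hypothetical steady shear flow (an assumption for illustration),
    # with speed at most 0.3, i.e. weaker than the propulsion speed below.
    return 0.3 * math.sin(y), 0.0

def navigate(start, target, v_s=1.0, lam=0.5, dt=0.01,
             radius=0.1, max_steps=100_000, engine_on=None):
    """Integrate dx/dt = u(x, t) + v_s * n(theta) with explicit Euler steps.

    The heading always points at the target (a naive baseline policy).
    `engine_on(x, y, t)` is an optional hypothetical switch; default: always on.
    Returns time, engine-on time, and cost = time + lam * time_on.
    """
    x, y = start
    t = 0.0
    time_on = 0.0
    reached = False
    for _ in range(max_steps):
        dx, dy = target[0] - x, target[1] - y
        if math.hypot(dx, dy) < radius:
            reached = True
            break
        theta = math.atan2(dy, dx)            # propulsion direction
        on = True if engine_on is None else bool(engine_on(x, y, t))
        speed = v_s if on else 0.0            # engine off: pure advection
        ux, uy = flow(x, y, t)
        x += (ux + speed * math.cos(theta)) * dt
        y += (uy + speed * math.sin(theta)) * dt
        t += dt
        if on:
            time_on += dt
    return {"reached": reached, "time": t, "time_on": time_on,
            "cost": t + lam * time_on}

result = navigate(start=(0.0, 0.0), target=(5.0, 2.0))
print(result)
```

A learned policy would replace the fixed heading rule and the `engine_on` switch with choices that depend on the discretized state, and would be trained to reduce `cost`. The baseline above only shows which quantities enter such an objective.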

Acknowledgments

This project has received partial funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No 882340). K.G. acknowledges funding from the Knut and Alice Wallenberg Foundation, Grant No. KAW 2014.0048, and Vetenskapsrådet, Grant No. 2018-03974. F.B. acknowledges funding from the European Research Council under the European Union’s Horizon 2020 Framework Programme (No. FP/2014–2020) ERC Grant Agreement No. 739964 (COPMAT).

Author information

Authors and Affiliations

Department of Physics and INFN, University of Rome Tor Vergata, Via della Ricerca Scientifica 1, 00133, Rome, Italy
Michele Buzzicotti, Luca Biferale & Fabio Bonaccorso
Center for Life Nano Science@La Sapienza, Istituto Italiano di Tecnologia, 00161, Rome, Italy
Fabio Bonaccorso
Department of Mechanical Engineering, Johns Hopkins University, Baltimore, MD, 21218, USA
Patricio Clark di Leoni
Department of Physics, University of Gothenburg, 41296, Gothenburg, Sweden
Kristian Gustavsson

Corresponding author

Correspondence to Luca Biferale.

Editor information

Editors and Affiliations

Università degli Studi di Torino, Turin, Italy
Matteo Baldoni
Department of Informatics, Systems and C, University of Milano-Bicocca, Milan, Italy
Stefania Bandini

About this paper

Cite this paper

Buzzicotti, M., Biferale, L., Bonaccorso, F., Clark di Leoni, P., Gustavsson, K. (2021). Optimal Control of Point-to-Point Navigation in Turbulent Time Dependent Flows Using Reinforcement Learning. In: Baldoni, M., Bandini, S. (eds) AIxIA 2020 – Advances in Artificial Intelligence. AIxIA 2020. Lecture Notes in Computer Science, vol 12414. Springer, Cham. https://doi.org/10.1007/978-3-030-77091-4_14

DOI: https://doi.org/10.1007/978-3-030-77091-4_14
Published: 22 May 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-77090-7
Online ISBN: 978-3-030-77091-4
eBook Packages: Computer Science, Computer Science (R0)
