Abstract
Deep Reinforcement Learning algorithms have gained attention lately due to their ability to solve complex decision problems with a model-free and derivative-free approach. In multi-agent problems, these algorithms can help find efficient cooperative policies in a feasible amount of time. In this chapter, we present the Informative Patrolling Problem, a commonplace task in the conservation of water resources. The approach is presented here as a convenient methodology for the synthesis of cooperative policies that can address the simultaneous objectives present in the unmanned monitoring of lakes and rivers: maximizing the information collected about water parameters and achieving collision-free routing with multiple surface vehicles. For this mixed objective, a Deep Q-Learning scheme with a convolutional network as a shared fleet policy is proposed. To solve the credit assignment problem, an effective multi-agent decomposition of the informative reward is proposed, together with a discussion of several other state-of-the-art topics in Reinforcement Learning: noisy networks for enhanced exploration of the state-action domain, the use of visual states, and the shaping of the reward function. As is quantitatively demonstrated, this methodology allows a significant improvement in water resource monitoring compared to other heuristics.
Notes
- 1.
- 2.
As demonstrated in [21], and because \(\varSigma \) is a positive semi-definite matrix, \(|\varSigma | = \prod _{i=0}^{dim(X)} \lambda _i\).
- 3.
In a fully deterministic environment, this probability is assumed to be 1.
- 4.
\(\alpha = 0\) corresponds to fully uniform sampling, while larger values of \(\alpha\) increase the degree of prioritization.
- 5.
From this point on, the decoupled reward is used because it performs better.
- 6.
See https://deap.readthedocs.io/en/master/api/benchmarks.html for the complete definition.
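Notes 2 and 4 can be checked numerically. The following is a minimal sketch, assuming NumPy is available. The function names are hypothetical and not taken from the chapter. Reading \(\alpha\) as the prioritization exponent of prioritized experience replay, with sampling probability proportional to \(p_i^{\alpha}\), is an assumption based on the reference list and is not stated in the note itself.

```python
import numpy as np


def det_from_eigenvalues(sigma):
    """Determinant of a symmetric matrix as the product of its eigenvalues."""
    # eigvalsh assumes a symmetric (Hermitian) input, as a covariance matrix is.
    eigenvalues = np.linalg.eigvalsh(np.asarray(sigma, dtype=float))
    return float(np.prod(eigenvalues))


def sampling_probabilities(priorities, alpha):
    """Replay sampling probabilities p_i**alpha / sum_k p_k**alpha (assumed form)."""
    scaled = np.asarray(priorities, dtype=float) ** alpha
    return scaled / scaled.sum()


if __name__ == "__main__":
    sigma = np.array([[2.0, 1.0], [1.0, 2.0]])  # illustrative covariance matrix
    # Both values should agree up to floating-point error.
    print(det_from_eigenvalues(sigma), np.linalg.det(sigma))
    # alpha = 0 ignores the priorities and yields uniform sampling.
    print(sampling_probabilities([1.0, 2.0, 5.0], alpha=0.0))
    # alpha = 1 samples in direct proportion to the priorities.
    print(sampling_probabilities([1.0, 2.0, 5.0], alpha=1.0))
```

With `alpha=0.0` every exponentiated priority equals 1, so all experiences are equally likely, which matches the uniform-sampling case described in note 4.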
References
Arzamendia M, Gregor D, Gutierrez-Reina D, Toral S (2019) An evolutionary approach to constrained path planning of an autonomous surface vehicle for maximizing the covered area of Ypacarai lake. Soft Comput 23(5):1723–1734
Arzamendia M, Gutierrez D, Toral S, Gregor D, Asimakopoulou E, Bessis N (2019) Intelligent online learning strategy for an autonomous surface vehicle in lake environments using evolutionary computation. IEEE Intell Transp Syst Mag 11(4):110–125
Bellman RE (2003) Dynamic Programming. Dover Publications Inc, USA
Coley K (2015) Unmanned surface vehicles: the future of data-collection. Ocean Chall 21:14–15
Cover TM, Thomas JA (2006) Elements of information theory. Wiley Series in telecommunications and signal processing. Wiley-Interscience, USA
Ferreira H, Almeida C, Martins A, Almeida J, Dias N, Dias A, Silva E (2009) Autonomous bathymetry for risk assessment with ROAZ robotic surface vehicle. In: OCEANS 2009-EUROPE, pp 1–6. https://doi.org/10.1109/OCEANSE.2009.5278235
Fortunato M, Azar MG, Piot B, Menick J, Osband I, Graves A, Mnih V, Munos R, Hassabis D, Pietquin O, Blundell C, Legg S (2017) Noisy networks for exploration. CoRR arXiv:1706.10295
van Hasselt H, Guez A, Silver D (2015) Deep reinforcement learning with double Q-learning. CoRR arXiv:1509.06461
Hoen PJ, Tuyls K, Panait L, Luke S, La Poutré JA (2006) An overview of cooperative and competitive multiagent learning. In: Tuyls K, Hoen PJ, Verbeeck K, Sen S (eds) Learning and adaption in multi-agent systems. Springer, Berlin, Heidelberg, pp 1–46
Julian KD, Kochenderfer MJ (2018) Distributed wildfire surveillance with autonomous aircraft using deep reinforcement learning. CoRR arXiv:1810.04244
Kathen MJT, Flores IJ, Reina DG (2021) An informative path planner for a swarm of ASVs based on an enhanced PSO with Gaussian surrogate model components intended for water monitoring applications. Electronics 10(13):1605
Krishna Lakshmanan A, Elara Mohan R, Ramalingam B, Vu Le A, Veerajagadeshwar P, Tiwari K, Ilyas M (2020) Complete coverage path planning using reinforcement learning for tetromino based cleaning and maintenance robot. Autom Constr 112:103078. https://doi.org/10.1016/j.autcon.2020.103078
Lillicrap TP, Hunt JJ, Pritzel A, Heess N, Erez T, Tassa Y, Silver D, Wierstra D (2016) Continuous control with deep reinforcement learning. In: Bengio Y, LeCun Y (eds) ICLR, http://dblp.uni-trier.de/db/conf/iclr/iclr2016.html#LillicrapHPHETS15
Lowe R, Wu Y, Tamar A, Harb J, Abbeel P, Mordatch I (2017) Multi-agent actor-critic for mixed cooperative-competitive environments. NIPS’17, Curran Associates Inc., Red Hook, NY, USA
Mnih V, Kavukcuoglu K, Silver D et al (2015) Human-level control through deep reinforcement learning. Nature 518(7540):529–533. https://doi.org/10.1038/nature14236
Murphy RR, Steimle E, Griffin C, Cullins C, Hall M, Pratt K (2008) Cooperative use of unmanned sea surface and micro aerial vehicles at hurricane Wilma. J Field Robot 25(3):164–180. https://doi.org/10.1002/rob.20235
Peralta F, Reina DG, Toral S, Arzamendia M, Gregor D (2021) A Bayesian optimization approach for multi-function estimation for environmental monitoring using an autonomous surface vehicle: Ypacarai lake case study. Electronics 10(8):963
Peralta Samaniego F, Reina DG, Toral Marín SL, Gregor DO, Arzamendia M (2021) A Bayesian optimization approach for water resources monitoring through an autonomous surface vehicle: the Ypacarai lake case study. IEEE Access 9(1):9163–9179. https://doi.org/10.1109/ACCESS.2021.3050934
Piciarelli C, Foresti GL (2019) Drone patrolling with reinforcement learning. ACM Int Conf Proc Ser 1:1–6. https://doi.org/10.1145/3349801.3349805
Popović M, Vidal-Calleja T, Hitz G (2020) An informative path planning framework for UAV-based terrain monitoring. Auton Robot 44:889–911. https://doi.org/10.1007/s10514-020-09903-2
Rasmussen C, Williams C (2006) Gaussian processes for machine learning. Adaptive computation and machine learning. MIT Press, Cambridge, MA, USA. https://doi.org/10.7551/mitpress/3206.003.0001
Sánchez-García J, García-Campos J, Arzamendia M, Reina D, Toral S, Gregor D (2018) A survey on unmanned aerial and aquatic vehicle multi-hop networks: Wireless communications, evaluation tools and applications. Comput Commun 119:43–65. https://doi.org/10.1016/j.comcom.2018.02.002
Schaul T, Quan J, Antonoglou I, Silver D (2015) Prioritized experience replay. arXiv:1511.05952
Sim R, Roy N (2005) Global A-optimal robot exploration in SLAM. pp 661–666. https://doi.org/10.1109/ROBOT.2005.1570193
Sutton RS, Barto AG (2018) Reinforcement learning: an introduction. A Bradford Book, Cambridge, MA, USA
Ten Kathen MJ, Flores IJ, Reina DG (2021) A comparison of PSO-based informative path planners for autonomous surface vehicles for water resource monitoring. In: 7th international conference on machine learning technologies (ICMLT 2022). ACM
Ten Kathen MJ, Reina DG, Flores IJ (2021) A comparison of PSO-based informative path planners for detecting pollution peaks of the Ypacarai lake with autonomous surface vehicles. In: International conference on optimization and learning (OLA’2022)
Theile M, Bayerlein H, Nai R, Gesbert D, Caccamo M (2020) UAV coverage path planning under varying power constraints using deep reinforcement learning. In: 2020 IEEE/RSJ international conference on intelligent robots and systems (IROS). IEEE, pp 1444–1449
Viseras A, Garcia R (2019) DeepIG: multi-robot information gathering with deep reinforcement learning. IEEE Robot Autom Lett 4(3):3059–3066. https://doi.org/10.1109/LRA.2019.2924839
Viseras A, Meißner M, Marchal J (2021) Wildfire front monitoring with multiple UAVs using deep Q-learning. IEEE Access 1–1. https://doi.org/10.1109/ACCESS.2021.3055651
Wang Z, de Freitas N, Lanctot M (2015) Dueling network architectures for deep reinforcement learning. CoRR arXiv:1511.06581
Woo J, Kim N (2020) Collision avoidance for an unmanned surface vehicle using deep reinforcement learning. Ocean Eng 199:107001. https://doi.org/10.1016/j.oceaneng.2020.107001. www.sciencedirect.com/science/article/pii/S0029801820300792
Yanes Luis S, Reina DG, Toral Marín SL (2020) A deep reinforcement learning approach for the patrolling problem of water resources through autonomous surface vehicles: the Ypacarai lake case. IEEE Access 6(1):1–1. https://doi.org/10.1109/ACCESS.2020.3036938
Yanes Luis S, Reina DG, Marín SLT (2021) A multiagent deep reinforcement learning approach for path planning in autonomous surface vehicles: the Ypacaraí lake patrolling case. IEEE Access 9:17084–17099
Yanes Luis S, Gutiérrez-Reina D, Toral Marin S (2021) A dimensional comparison between evolutionary algorithm and deep reinforcement learning methodologies for autonomous surface vehicles with water quality sensors. Sensors 21(8). https://doi.org/10.3390/s21082862. https://www.mdpi.com/1424-8220/21/8/2862
Yanes Luis S, Peralta F, Tapia Córdoba A, Rodríguez del Nozal Á, Toral Marín S, Gutiérrez Reina D (2022) An evolutionary multi-objective path planning of a fleet of ASVs for patrolling water resources. Eng Appl Artif Intell 112:104852. www.sciencedirect.com/science/article/pii/S0952197622001051
Zhang Q, Lin J, Sha Q, He B, Li G (2020) Deep interactive reinforcement learning for path following of autonomous underwater vehicle. CoRR arXiv:2001.03359
Acknowledgements
This work has been funded by the Spanish “Ministerio de Ciencia, Innovación y Universidades” under the PhD grant FPU-2020 (Formación del Profesorado Universitario) of Samuel Yanes Luis.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this chapter
Cite this chapter
Yanes Luis, S., Perales Esteve, M., Gutiérrez Reina, D., Toral Marín, S. (2023). Deep Reinforcement Learning Applied to Multi-agent Informative Path Planning in Environmental Missions. In: Azar, A.T., Kasim Ibraheem, I., Jaleel Humaidi, A. (eds) Mobile Robot: Motion Control and Path Planning. Studies in Computational Intelligence, vol 1090. Springer, Cham. https://doi.org/10.1007/978-3-031-26564-8_2
DOI: https://doi.org/10.1007/978-3-031-26564-8_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-26563-1
Online ISBN: 978-3-031-26564-8
eBook Packages: Intelligent Technologies and Robotics (R0)