
Reinforcement Learning for Multiple HAPS/UAV Coordination: Impact of Exploration–Exploitation Dilemma on Convergence

Conference paper in Soft Computing for Problem Solving 2019

Abstract

This work analyses the application of reinforcement learning (RL) to coordinating multiple unmanned high-altitude platform stations (HAPS) or unmanned aerial vehicles (UAVs) that provide communications area coverage to a community of fixed and mobile users. Multi-agent coordination techniques are essential for developing autonomous capabilities for multi-UAV/HAPS control and management. This paper examines the impact of the exploration–exploitation dilemma on the application of RL to coordinating multiple UAVs/HAPS. It is observed that RL convergence is a challenge, as the algorithm struggles to find optimal positions for maximum user coverage. The paper attempts to establish the source of this convergence issue for this specific application scenario, suggests methods to mitigate its impact, and offers insights on applying RL techniques to multi-agent coordination for communications area coverage.
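The abstract does not specify the authors' algorithm or exploration strategy, so the following is only a minimal illustrative sketch of the exploration–exploitation dilemma it refers to, assuming ε-greedy action-value learning for a single agent choosing a hover position. The candidate cells, coverage values, and parameters are all hypothetical, not the paper's implementation.

```python
import random

# Hypothetical toy setting: a UAV picks one of N candidate grid cells to
# hover over; each cell covers a fixed number of users. This reduces the
# positioning problem to a single-state (bandit) decision for clarity.
USERS_COVERED = [3, 7, 2, 9, 5]  # assumed users covered from each cell
N_ACTIONS = len(USERS_COVERED)

def run(epsilon, episodes=2000, alpha=0.1, seed=0):
    """Epsilon-greedy value learning; returns the final greedy cell and Q."""
    rng = random.Random(seed)
    q = [0.0] * N_ACTIONS
    for _ in range(episodes):
        # The exploration-exploitation dilemma: with probability epsilon
        # try a random cell (explore), otherwise hover over the cell with
        # the highest estimated coverage (exploit).
        if rng.random() < epsilon:
            a = rng.randrange(N_ACTIONS)
        else:
            a = max(range(N_ACTIONS), key=lambda i: q[i])
        reward = USERS_COVERED[a] + rng.gauss(0, 1.0)  # noisy coverage reward
        q[a] += alpha * (reward - q[a])                # incremental update
    return max(range(N_ACTIONS), key=lambda i: q[i]), q

for eps in (0.0, 0.1, 0.5):
    best, q = run(eps)
    print(f"epsilon={eps}: greedy cell={best}, Q={[round(v, 2) for v in q]}")
```

Varying epsilon reproduces the convergence trade-off the paper studies: with epsilon = 0 the agent can lock onto whichever cell it sampled first and never discover the best position, while a large epsilon yields accurate estimates but keeps the agent moving away from the best position it has found.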



Author information

Correspondence to Ogbonnaya Anicho.


Copyright information

© 2020 Springer Nature Singapore Pte Ltd.

About this paper


Cite this paper

Anicho, O., Charlesworth, P.B., Baicher, G.S., Nagar, A.K. (2020). Reinforcement Learning for Multiple HAPS/UAV Coordination: Impact of Exploration–Exploitation Dilemma on Convergence. In: Nagar, A., Deep, K., Bansal, J., Das, K. (eds) Soft Computing for Problem Solving 2019. Advances in Intelligent Systems and Computing, vol 1138. Springer, Singapore. https://doi.org/10.1007/978-981-15-3290-0_12
