Abstract
In recent years, the autonomous landing of unmanned aerial vehicles (UAVs) has attracted extensive attention owing to the widespread applications of UAVs. With rapid advances in machine learning and artificial intelligence, recent research has begun to explore deep reinforcement learning (DRL) to learn landing policies directly from raw observations. However, current DRL-based solutions tend to suffer from poor generalization to unseen environments. To address this issue, we formulate landing as a two-stage DRL problem and bootstrap the DRL procedure by augmenting the regular DRL loss with an auxiliary localization task. The auxiliary localization task provides dense supervision signals that aid landing-relevant representation learning. In particular, two marker localization approaches are carefully designed based on deep classification and regression models, and the differences between the two configurations are explored, aiming to answer the fundamental question of how to better exploit localization for representation learning. Furthermore, we propose a novel and flexible sampling strategy, Dynamic Partitioned Experience Replay, to stabilize and accelerate training. Experimental results show that the auxiliary localization tasks, combined with the improved sampling strategy, help the trained model generalize to unseen environments. In addition, the trained model can be seamlessly transferred to real-world quadrotors, where it achieves outstanding landing performance.
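The abstract does not specify how Dynamic Partitioned Experience Replay is implemented; the sketch below illustrates only the general idea of a partitioned replay buffer, hypothetically split by episode outcome, whose per-partition sampling shares can be re-weighted as training progresses. All class, partition, and method names are assumptions, not the authors' API.

```python
import random
from collections import deque

class DynamicPartitionedReplayBuffer:
    """Minimal sketch: transitions are stored in separate partitions
    (here, hypothetically, success vs. failure episodes), and each
    minibatch is drawn from the partitions with adjustable shares."""

    def __init__(self, capacity_per_partition=10_000,
                 partitions=("success", "failure")):
        # One bounded FIFO buffer per partition.
        self.buffers = {p: deque(maxlen=capacity_per_partition)
                        for p in partitions}
        # Relative sampling shares; updated dynamically during training.
        self.weights = {p: 1.0 for p in partitions}

    def add(self, partition, transition):
        self.buffers[partition].append(transition)

    def set_weight(self, partition, weight):
        # E.g. shrink the failure share once landings start succeeding.
        self.weights[partition] = weight

    def sample(self, batch_size):
        # Draw only from partitions that currently hold data.
        live = [p for p in self.buffers if self.buffers[p]]
        picks = random.choices(live,
                               weights=[self.weights[p] for p in live],
                               k=batch_size)
        return [random.choice(self.buffers[p]) for p in picks]
```

One design point worth noting: by sampling partitions first and transitions second, rare but informative experiences (e.g. successful landings early in training) keep a fixed minibatch share instead of being drowned out by the majority partition.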
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix
To test the trained model's generalization capability, the ground background texture is replaced at test time with textures not seen during training. As shown in Fig. 16, there are 6 texture types, each containing 13 distinct instances.
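The held-out evaluation pool described above (6 texture types with 13 instances each) can be mirrored by a small sampling helper; a minimal sketch follows, in which the family names, file paths, and helper function are hypothetical placeholders rather than the paper's actual assets.

```python
import random

# Hypothetical held-out texture pool mirroring the appendix protocol:
# 6 texture families, 13 instances each, none used during training.
TEST_TEXTURES = {
    f"family_{k}": [f"family_{k}/tex_{i}.png" for i in range(13)]
    for k in range(6)
}

def sample_episode_texture(rng=random):
    """Pick one unseen ground texture for a single evaluation episode."""
    family = rng.choice(sorted(TEST_TEXTURES))
    return rng.choice(TEST_TEXTURES[family])
```

Sampling a fresh texture per episode, rather than fixing one per evaluation run, averages the success rate over all 78 unseen instances.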
Cite this article
Wang, J., Wang, T., He, Z. et al. Towards better generalization in quadrotor landing using deep reinforcement learning. Appl Intell 53, 6195–6213 (2023). https://doi.org/10.1007/s10489-022-03503-6