Abstract
In recent years, the autonomous landing of unmanned aerial vehicles (UAVs) has attracted extensive attention owing to the widespread applications of UAVs. With rapid advances in machine learning and artificial intelligence, recent research has begun to explore deep reinforcement learning (DRL) to learn landing policies directly from raw observations. However, current DRL-based solutions tend to suffer from poor generalization to unseen environments. To address this issue, we formulate landing as a two-stage DRL problem and bootstrap the DRL procedure by augmenting the regular DRL loss with an auxiliary localization task. The auxiliary localization task provides dense supervision signals that aid landing-relevant representation learning. In particular, two marker localization approaches are carefully designed based on deep classification and regression models, and the differences between the two configurations are explored, aiming to answer the fundamental question of how to better exploit localization for representation learning. Furthermore, we propose a novel and flexible sampling strategy, Dynamic Partitioned Experience Replay, to stabilize and accelerate training. Experimental results show that the auxiliary localization tasks, combined with the improved sampling strategy, help the trained model generalize to unseen environments. In addition, the trained model can be seamlessly transferred to real-world quadrotors, where it achieves outstanding landing performance.
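The abstract does not specify how Dynamic Partitioned Experience Replay is implemented; the sketch below illustrates only the general idea of a partitioned replay buffer, hypothetically split by episode outcome, whose per-partition sampling shares can be re-weighted as training progresses. All class, partition, and method names are assumptions, not the authors' API.

```python
import random
from collections import deque

class DynamicPartitionedReplayBuffer:
    """Minimal sketch: transitions are stored in separate partitions
    (here, hypothetically, success vs. failure episodes), and each
    minibatch is drawn from the partitions with adjustable shares."""

    def __init__(self, capacity_per_partition=10_000,
                 partitions=("success", "failure")):
        # One bounded FIFO buffer per partition.
        self.buffers = {p: deque(maxlen=capacity_per_partition)
                        for p in partitions}
        # Relative sampling shares; updated dynamically during training.
        self.weights = {p: 1.0 for p in partitions}

    def add(self, partition, transition):
        self.buffers[partition].append(transition)

    def set_weight(self, partition, weight):
        # E.g. shrink the failure share once landings start succeeding.
        self.weights[partition] = weight

    def sample(self, batch_size):
        # Draw only from partitions that currently hold data.
        live = [p for p in self.buffers if self.buffers[p]]
        picks = random.choices(live,
                               weights=[self.weights[p] for p in live],
                               k=batch_size)
        return [random.choice(self.buffers[p]) for p in picks]
```

One design point worth noting: by sampling partitions first and transitions second, rare but informative experiences (e.g. successful landings early in training) keep a fixed minibatch share instead of being drowned out by the majority partition.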
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix
To test the trained model's generalization capability, the ground background texture is replaced at test time with textures not seen during training. As shown in Fig. 16, there are 6 texture types, each containing 13 distinct instances.
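The held-out evaluation pool described above (6 texture types with 13 instances each) can be mirrored by a small sampling helper; a minimal sketch follows, in which the family names, file paths, and helper function are hypothetical placeholders rather than the paper's actual assets.

```python
import random

# Hypothetical held-out texture pool mirroring the appendix protocol:
# 6 texture families, 13 instances each, none used during training.
TEST_TEXTURES = {
    f"family_{k}": [f"family_{k}/tex_{i}.png" for i in range(13)]
    for k in range(6)
}

def sample_episode_texture(rng=random):
    """Pick one unseen ground texture for a single evaluation episode."""
    family = rng.choice(sorted(TEST_TEXTURES))
    return rng.choice(TEST_TEXTURES[family])
```

Sampling a fresh texture per episode, rather than fixing one per evaluation run, averages the success rate over all 78 unseen instances.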
Cite this article
Wang, J., Wang, T., He, Z. et al. Towards better generalization in quadrotor landing using deep reinforcement learning. Appl Intell 53, 6195–6213 (2023). https://doi.org/10.1007/s10489-022-03503-6