Abstract
Developing an intelligent agent that can navigate a 3D environment in an end-to-end manner using only visual input is a long-standing challenge. In this paper, we introduce a goal-conditioned reinforcement learning framework for vision-based UAV navigation, and develop a memory-enhanced DRL agent with a dynamic relative goal, an extra action penalty, and a non-sparse reward to tackle the UAV navigation problem. These components enable the agent to escape the objective-obstacle dilemma. Through experimental evaluations in high-fidelity visual environments simulated by AirSim, we show that our proposed memory-enhanced model achieves a higher success rate with fewer training steps than DRL agents without memory.
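The abstract names three reward-design ingredients — a dynamic relative goal, an action penalty, and a non-sparse reward — without giving the paper's exact formulation. As a rough illustration only, a minimal sketch of how such a shaped reward is commonly assembled might look like the following; the function names, the `action_cost` parameter, and the terminal bonus/penalty values are all hypothetical, not taken from the paper:

```python
import numpy as np

def relative_goal(uav_pos, goal_pos):
    # Dynamic relative goal: the goal expressed relative to the UAV's
    # current position, recomputed at every step as the UAV moves.
    return np.asarray(goal_pos, dtype=float) - np.asarray(uav_pos, dtype=float)

def shaped_reward(prev_pos, curr_pos, goal_pos,
                  collided, reached, action_cost=0.01):
    # Non-sparse reward: per-step progress toward the goal, minus a small
    # per-action penalty to discourage wandering, plus terminal terms.
    prev_dist = np.linalg.norm(relative_goal(prev_pos, goal_pos))
    curr_dist = np.linalg.norm(relative_goal(curr_pos, goal_pos))
    reward = (prev_dist - curr_dist) - action_cost
    if collided:
        reward -= 10.0   # hypothetical terminal penalty on crashing
    if reached:
        reward += 10.0   # hypothetical terminal bonus on reaching the goal
    return reward
```

Because the agent is rewarded for distance progress at every step rather than only at the goal, the gradient signal is dense, which is what allows learning with fewer training steps than a sparse-reward baseline.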
References
Walvekar A, Goel Y, Jain A, Chakrabarty S, Kumar A (2019) Vision-based autonomous navigation of quadcopter using reinforcement learning. In: Proceedings of the 2019 IEEE 2nd international conference on automation, electronics and electrical engineering (AUTEEE), pp 160–165. IEEE
Mnih V, Kavukcuoglu K, Silver D, Rusu AA, Veness J, Bellemare MG, Graves A, Riedmiller M, Fidjeland AK, Ostrovski G et al (2015) Human-level control through deep reinforcement learning. Nature 518(7540):529–533
Zhu Y, Mottaghi R, Kolve E, Lim JJ, Gupta A, Fei-Fei L, Farhadi A (2017) Target-driven visual navigation in indoor scenes using deep reinforcement learning. In: Proceedings of the 2017 IEEE international conference on robotics and automation (ICRA), pp 3357–3364. IEEE
Nasiriany S, Pong VH, Lin S, Levine S (2019) Planning with goal-conditioned policies. arXiv preprint arXiv:1911.08453
Shah S, Dey D, Lovett C, Kapoor A (2018) AirSim: high-fidelity visual and physical simulation for autonomous vehicles. In: Field and service robotics, pp 621–635. Springer
Kulkarni TD, Saeedi A, Gautam S, Gershman SJ (2016) Deep successor reinforcement learning. arXiv preprint arXiv:1606.02396
Oh J, Chockalingam V, Lee H, et al (2016) Control of memory, active perception, and action in Minecraft. In: International conference on machine learning, pp 2790–2799. PMLR
Tessler C, Givony S, Zahavy T, Mankowitz D, Mannor S (2017) A deep hierarchical approach to lifelong learning in Minecraft. In: Proceedings of the AAAI conference on artificial intelligence, vol 31
Frazier S, Riedl M (2019) Improving deep reinforcement learning in Minecraft with action advice. In: Proceedings of the AAAI conference on artificial intelligence and interactive digital entertainment, vol 15, pp 146–152
Eski İ, Kuş ZA (2019) Control of unmanned agricultural vehicles using neural network-based control system. Neural Comput Appl 31(1):583–595
Ulus Ş, Eski İ (2021) Neural network and fuzzy logic-based hybrid attitude controller designs of a fixed-wing UAV. Neural Comput Appl 33(14):8821–8843
Abedin SF, Munir MS, Tran NH, Han Z, Hong CS (2021) Data freshness and energy-efficient UAV navigation optimization: a deep reinforcement learning approach. IEEE Trans Intell Transp Syst 22:5994–6006
MahmoudZadeh S, Yazdani A, Elmi A, Abbasi A, Ghanooni P (2021) Exploiting a fleet of UAVs for monitoring and data acquisition of a distributed sensor network. Neural Comput Appl 1–14
Fu C, Carrio A, Olivares-Méndez MA, Suarez-Fernandez R, Cervera PC (2014) Robust real-time vision-based aircraft tracking from unmanned aerial vehicles. In: Proceedings of the 2014 IEEE international conference on robotics and automation (ICRA), pp 5441–5446
Li T, Ding F, Yang W (2021) UAV object tracking by background cues and aberrances response suppression mechanism. Neural Comput Appl 33(8):3347–3361
Polvara R, Patacchiola M, Sharma S, Wan J, Manning A, Sutton R, Cangelosi A (2018) Toward end-to-end control for UAV autonomous landing via deep reinforcement learning. In: Proceedings of the 2018 international conference on unmanned aircraft systems (ICUAS), pp 115–123. IEEE
Ross S, Melik-Barkhudarov N, Shankar KS, Wendel A, Dey D, Bagnell JA, Hebert M (2013) Learning monocular reactive UAV control in cluttered natural environments. In: Proceedings of the 2013 IEEE international conference on robotics and automation, pp 1765–1772. IEEE
Tai L, Liu M (2016) A robot exploration strategy based on Q-learning network. In: Proceedings of the 2016 IEEE international conference on real-time computing and robotics (RCAR), pp 57–62. IEEE
Loquercio A, Maqueda AI, Del-Blanco CR, Scaramuzza D (2018) DroNet: learning to fly by driving. IEEE Robot Autom Lett 3(2):1088–1095. https://doi.org/10.1109/LRA.2018.2795643
Alpdemir MN (2022) Tactical UAV path optimization under radar threat using deep reinforcement learning. Neural Comput Appl 1–16
Kersandt K, Muñoz G, Barrado C (2018) Self-training by reinforcement learning for full-autonomous drones of the future. In: Proceedings of the 2018 IEEE/AIAA 37th digital avionics systems conference (DASC), pp 1–10. IEEE
Acknowledgements
This work is supported in part by the National Key Research and Development Program of China under Grant No. 2019YFB2102200, the Natural Science Foundation of China under Grants No. 61902062, 61972086, and 62102082, the Jiangsu Provincial Natural Science Foundation of China under Grants No. BK20190332 and BK20210203, and Zhejiang Lab (No. 2019NB0AB05).
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Cite this article
Fu, C., Xu, X., Zhang, Y. et al. Memory-enhanced deep reinforcement learning for UAV navigation in 3D environment. Neural Comput & Applic 34, 14599–14607 (2022). https://doi.org/10.1007/s00521-022-07244-y