
Memory-enhanced deep reinforcement learning for UAV navigation in 3D environment

  • Original Article
  • Published in Neural Computing and Applications

Abstract

Developing an intelligent agent that can navigate a 3D environment in an end-to-end manner using only visual input is a long-standing challenge. In this paper, we introduce a goal-conditioned reinforcement learning framework for vision-based UAV navigation, and then develop a memory-enhanced DRL agent with a dynamic relative goal, an extra action penalty, and a non-sparse reward to tackle the UAV navigation problem. This enables the agent to escape the objective-obstacle dilemma. Through experimental evaluations in high-fidelity visual environments simulated by AirSim, we show that our proposed memory-enhanced model achieves a higher success rate with fewer training steps than DRL agents without memory.
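The reward design described above (a goal recomputed relative to the UAV's current pose, a small per-step action penalty, and dense distance-based shaping instead of a sparse terminal-only signal) can be illustrated with a minimal sketch. This is not the paper's exact formulation; the terminal bonus/penalty values, the arrival threshold `reached_eps`, and the penalty coefficient `action_penalty` are all hypothetical placeholders chosen for illustration.

```python
import numpy as np

def relative_goal(uav_pos, goal_pos):
    # Dynamic relative goal: the vector from the UAV's current
    # position to the target, recomputed at every step so the
    # goal representation stays egocentric.
    return np.asarray(goal_pos, dtype=float) - np.asarray(uav_pos, dtype=float)

def shaped_reward(prev_pos, curr_pos, goal_pos,
                  collided=False, reached_eps=1.0,
                  action_penalty=0.01):
    """Non-sparse reward sketch (illustrative values):
    progress toward the goal minus a small per-step action
    penalty, with terminal bonus/penalty for success/collision."""
    d_prev = np.linalg.norm(relative_goal(prev_pos, goal_pos))
    d_curr = np.linalg.norm(relative_goal(curr_pos, goal_pos))
    if collided:
        return -10.0          # terminal penalty for hitting an obstacle
    if d_curr < reached_eps:
        return 10.0           # terminal bonus for reaching the goal
    # Dense shaping: positive when the step reduced distance to goal,
    # always taxed by the action penalty to discourage wasted motion.
    return (d_prev - d_curr) - action_penalty
```

With dense shaping of this kind, every step carries gradient information, which is what lets the agent trade off obstacle avoidance against goal progress instead of stalling in the objective-obstacle dilemma.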



References

  1. Walvekar A, Goel Y, Jain A, Chakrabarty S, Kumar A (2019) Vision-based autonomous navigation of quadcopter using reinforcement learning. In: Proceedings of the 2019 IEEE 2nd international conference on automation, electronics and electrical engineering (AUTEEE), pp 160–165. IEEE

  2. Mnih V, Kavukcuoglu K, Silver D, Rusu AA, Veness J, Bellemare MG, Graves A, Riedmiller M, Fidjeland AK, Ostrovski G et al (2015) Human-level control through deep reinforcement learning. Nature 518(7540):529–533

  3. Zhu Y, Mottaghi R, Kolve E, Lim JJ, Gupta A, Fei-Fei L, Farhadi A (2017) Target-driven visual navigation in indoor scenes using deep reinforcement learning. In: Proceedings of the 2017 IEEE international conference on robotics and automation (ICRA), pp 3357–3364. IEEE

  4. Nasiriany S, Pong VH, Lin S, Levine S (2019) Planning with goal-conditioned policies. arXiv preprint arXiv:1911.08453

  5. Shah S, Dey D, Lovett C, Kapoor A (2018) AirSim: high-fidelity visual and physical simulation for autonomous vehicles. In: Field and service robotics, pp 621–635. Springer

  6. Kulkarni TD, Saeedi A, Gautam S, Gershman SJ (2016) Deep successor reinforcement learning. arXiv preprint arXiv:1606.02396

  7. Oh J, Chockalingam V, Lee H et al (2016) Control of memory, active perception, and action in Minecraft. In: International conference on machine learning, pp 2790–2799. PMLR

  8. Tessler C, Givony S, Zahavy T, Mankowitz D, Mannor S (2017) A deep hierarchical approach to lifelong learning in Minecraft. In: Proceedings of the AAAI conference on artificial intelligence, vol 31

  9. Frazier S, Riedl M (2019) Improving deep reinforcement learning in Minecraft with action advice. In: Proceedings of the AAAI conference on artificial intelligence and interactive digital entertainment, vol 15, pp 146–152

  10. Eski İ, Kuş ZA (2019) Control of unmanned agricultural vehicles using neural network-based control system. Neural Comput Appl 31(1):583–595

  11. Ulus Ş, Eski İ (2021) Neural network and fuzzy logic-based hybrid attitude controller designs of a fixed-wing UAV. Neural Comput Appl 33(14):8821–8843

  12. Abedin SF, Munir MS, Tran NH, Han Z, Hong CS (2021) Data freshness and energy-efficient UAV navigation optimization: a deep reinforcement learning approach. IEEE Trans Intell Transp Syst 22:5994–6006

  13. MahmoudZadeh S, Yazdani A, Elmi A, Abbasi A, Ghanooni P (2021) Exploiting a fleet of UAVs for monitoring and data acquisition of a distributed sensor network. Neural Comput Appl 1–14

  14. Fu C, Carrio A, Olivares-Méndez MA, Suarez-Fernandez R, Cervera PC (2014) Robust real-time vision-based aircraft tracking from unmanned aerial vehicles. In: Proceedings of the 2014 IEEE international conference on robotics and automation (ICRA), pp 5441–5446

  15. Li T, Ding F, Yang W (2021) UAV object tracking by background cues and aberrances response suppression mechanism. Neural Comput Appl 33(8):3347–3361

  16. Polvara R, Patacchiola M, Sharma S, Wan J, Manning A, Sutton R, Cangelosi A (2018) Toward end-to-end control for UAV autonomous landing via deep reinforcement learning. In: Proceedings of the 2018 international conference on unmanned aircraft systems (ICUAS), pp 115–123. IEEE

  17. Ross S, Melik-Barkhudarov N, Shankar KS, Wendel A, Dey D, Bagnell JA, Hebert M (2013) Learning monocular reactive UAV control in cluttered natural environments. In: Proceedings of the 2013 IEEE international conference on robotics and automation, pp 1765–1772. IEEE

  18. Tai L, Liu M (2016) A robot exploration strategy based on Q-learning network. In: Proceedings of the 2016 IEEE international conference on real-time computing and robotics (RCAR), pp 57–62. IEEE

  19. Loquercio A, Maqueda AI, Del-Blanco CR, Scaramuzza D (2018) DroNet: learning to fly by driving. IEEE Robot Autom Lett 3(2):1088–1095. https://doi.org/10.1109/LRA.2018.2795643

  20. Alpdemir MN (2022) Tactical UAV path optimization under radar threat using deep reinforcement learning. Neural Comput Appl 1–16

  21. Kersandt K, Muñoz G, Barrado C (2018) Self-training by reinforcement learning for full-autonomous drones of the future. In: Proceedings of the 2018 IEEE/AIAA 37th digital avionics systems conference (DASC), pp 1–10. IEEE


Acknowledgements

This work was supported in part by the National Key Research and Development Program of China under Grant No. 2019YFB2102200, the Natural Science Foundation of China under Grant Nos. 61902062, 61972086, and 62102082, the Jiangsu Provincial Natural Science Foundation of China under Grant Nos. BK20190332 and BK20210203, and Zhejiang Lab (No. 2019NB0AB05).

Author information


Corresponding authors

Correspondence to Yan Lyu or Weiwei Wu.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Fu, C., Xu, X., Zhang, Y. et al. Memory-enhanced deep reinforcement learning for UAV navigation in 3D environment. Neural Comput & Applic 34, 14599–14607 (2022). https://doi.org/10.1007/s00521-022-07244-y


Keywords

Navigation