
A reinforcement learning approach for UAV target searching and tracking

Multimedia Tools and Applications

Abstract

Owing to the advantages of Unmanned Aerial Vehicles (UAVs), such as extendibility, maneuverability and stability, multiple UAVs are finding more and more applications in security surveillance. Target searching and trajectory planning thus become key issues for uninterrupted patrol. We propose an online distributed algorithm for searching and tracking that also accounts for energy refueling. A quantum probability model is proposed to describe the partially observable target positions. Moreover, an upper confidence tree algorithm is derived to find the best route, assisted by a teammate learning model that handles the nonstationarity of distributed reinforcement learning. Experiments and analysis of different scenarios show that the proposed scheme performs favorably.
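
The route planner named in the abstract is an upper confidence tree (UCT) search. For reference only, below is a minimal, generic UCT/UCB1 sketch in Python; GridSearchEnv, uct_plan and all parameter values are illustrative assumptions, not the paper's actual planner, which combines the quantum probability belief model and teammate learning described above.

import math
import random

class GridSearchEnv:
    """Hypothetical toy stand-in for the search area: a UAV moves on a grid
    and is rewarded when it reaches the cell where the target is believed to be."""
    ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]  # four grid moves

    def __init__(self, size=5, target=(4, 4)):
        self.size, self.target = size, target

    def step(self, state, action):
        x, y = state
        dx, dy = action
        nx = min(max(x + dx, 0), self.size - 1)
        ny = min(max(y + dy, 0), self.size - 1)
        reward = 1.0 if (nx, ny) == self.target else 0.0
        return (nx, ny), reward

class Node:
    def __init__(self, state, parent=None, action=None):
        self.state, self.parent, self.action = state, parent, action
        self.children = []
        self.visits = 0
        self.value = 0.0  # running mean of rollout returns through this node

def ucb1(node, c=1.4):
    # UCB1 score: mean value (exploitation) plus a bonus for rarely visited children (exploration).
    return node.value + c * math.sqrt(math.log(node.parent.visits) / node.visits)

def uct_plan(env, root_state, iterations=500, horizon=10, gamma=0.95):
    root = Node(root_state)
    for _ in range(iterations):
        node, depth = root, 0
        # 1. Selection: follow the highest-UCB1 child while the node is fully expanded.
        while depth < horizon and len(node.children) == len(env.ACTIONS):
            node = max(node.children, key=ucb1)
            depth += 1
        # 2. Expansion: try one action not explored yet from this node.
        if depth < horizon:
            tried = {child.action for child in node.children}
            action = random.choice([a for a in env.ACTIONS if a not in tried])
            next_state, _ = env.step(node.state, action)
            node.children.append(Node(next_state, parent=node, action=action))
            node = node.children[-1]
            depth += 1
        # 3. Rollout: random policy until the horizon, accumulating a discounted return.
        state, ret, discount = node.state, 0.0, 1.0
        for _ in range(horizon - depth):
            state, r = env.step(state, random.choice(env.ACTIONS))
            ret += discount * r
            discount *= gamma
        # 4. Backpropagation: update visit counts and running means up to the root.
        while node is not None:
            node.visits += 1
            node.value += (ret - node.value) / node.visits
            node = node.parent
    # The most visited child of the root gives the next move to execute.
    return max(root.children, key=lambda n: n.visits).action

env = GridSearchEnv()
print(uct_plan(env, root_state=(0, 0)))  # e.g. (1, 0) or (0, 1): head toward the target

In a full planner, the random rollout policy and the toy reward above would be replaced by sampling from the belief over target positions and by the teammates' predicted actions.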



Acknowledgments

This work is partially supported by the National Natural Science Foundation of China (61503017, U1435220), the Aeronautical Science Foundation of China (2016ZC51022), the SURECAP CPER project and the Platform CAPSEC funded by Région Champagne-Ardenne and FEDER, the Fundamental Research Funds for the Central Universities (YWF-14-RSC-102). Also, this research was supported by Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT and Future Planning (2015R1C1A1A02037515).

Author information


Corresponding author

Correspondence to Chang Choi.


About this article


Cite this article

Wang, T., Qin, R., Chen, Y. et al. A reinforcement learning approach for UAV target searching and tracking. Multimed Tools Appl 78, 4347–4364 (2019). https://doi.org/10.1007/s11042-018-5739-5

