
Scene Adaptive Persistent Target Tracking and Attack Method Based on Deep Reinforcement Learning

  • Conference paper
  • First Online:
Computer Supported Cooperative Work and Social Computing (ChineseCSCW 2022)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1682)


Abstract

As intelligent devices integrating a range of advanced technologies, mobile robots are widely used in defense and military applications because of their high autonomy and flexibility: they can independently track and attack dynamic targets. However, traditional tracking-and-attack algorithms are sensitive to changes in the external environment and lack portability and extensibility, whereas deep reinforcement learning can adapt to different environments thanks to its strong learning and exploration ability. To pursue targets accurately and robustly, this paper proposes a solution based on deep reinforcement learning. Addressing the low accuracy and low robustness of traditional dynamic-target pursuit, this paper models the dynamic target tracking and attack problem of mobile robots as a Partially Observable Markov Decision Process (POMDP) and proposes a general-purpose, end-to-end deep reinforcement learning framework based on dual agents that tracks and attacks targets accurately in different scenarios. Because it is difficult for mobile robots to track targets accurately while evading obstacles, this paper uses a partial zero-sum game to improve the reward function, providing implicit guidance for the attacker to pursue the target, and uses the asynchronous advantage actor-critic (A3C) algorithm to train models in parallel. Experiments show that the model can be transferred to different scenarios and generalizes well. Compared with the baseline method, the attacker's time to successfully destroy the target is reduced by up to 44.7% in the maze scene and up to 40.5% in the block scene, verifying the effectiveness of the proposed method. In addition, ablation experiments analyze the contribution of each component of the model, illustrating the effectiveness and necessity of each module and providing a basis for subsequent research.
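The partial zero-sum reward structure mentioned in the abstract can be illustrated with a minimal sketch. The paper's exact reward terms, constants, and observable-zone definition are not given here, so the functions, parameter names, and thresholds below (`attacker_reward`, `target_reward`, `d_max`, `mu`, `penalty`) are all illustrative assumptions, not the authors' formulation:

```python
import math

def attacker_reward(dist, angle, d_max=5.0, a_max=math.pi):
    """Shaped reward for the pursuer: highest when the target is close
    (small dist) and centred in view (small |angle|). Illustrative only."""
    return 1.0 - dist / d_max - abs(angle) / a_max

def target_reward(dist, angle, d_max=5.0, mu=0.5, penalty=-1.0):
    """Partial zero-sum sketch: inside the attacker's observable zone
    (dist <= d_max) the target's reward is a scaled negative of the
    attacker's, making the interaction adversarial; outside the zone the
    target receives a flat penalty, which implicitly discourages the
    agents from disengaging during parallel (e.g. A3C) training."""
    if dist <= d_max:
        return -mu * attacker_reward(dist, angle, d_max)
    return penalty
```

The key design point is the asymmetry: the game is zero-sum only within the attacker's perceptual range, so the target is rewarded for evading locally but not for simply fleeing the map, which keeps the training signal informative for both agents.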



Acknowledgements

This project was supported by the National Outstanding Young Scientists Foundation of China (62025205), the National Key Research and Development Program of China (2019QY0600), and the National Natural Science Foundation of China (61960206008, 61725205).

Author information


Corresponding author

Correspondence to Bin Guo.


Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper


Cite this paper

Hao, Z., Guo, B., Li, M., Wu, L., Yu, Z. (2023). Scene Adaptive Persistent Target Tracking and Attack Method Based on Deep Reinforcement Learning. In: Sun, Y., et al. Computer Supported Cooperative Work and Social Computing. ChineseCSCW 2022. Communications in Computer and Information Science, vol 1682. Springer, Singapore. https://doi.org/10.1007/978-981-99-2385-4_10

Download citation

  • DOI: https://doi.org/10.1007/978-981-99-2385-4_10

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-2384-7

  • Online ISBN: 978-981-99-2385-4

  • eBook Packages: Computer Science, Computer Science (R0)
