DOI: 10.1145/3640824.3640843

research-article

Multi-Robot Cooperative Pursuit-Evasion Control: A Deep Reinforcement Learning Approach based on Prioritized Experience Replay

Published: 08 March 2024

ABSTRACT

Cooperative pursuit systems built on traditional model-based control rules adapt poorly and lack robustness in complex dynamic environments. In this paper, we study cooperative pursuit-evasion with collision avoidance in multi-robot systems. We first adopt the Multi-Agent Twin Delayed Deep Deterministic policy gradient (MATD3) algorithm and design a cooperative pursuit framework that uses the information of multiple robots during learning to predict the robots' actions more accurately. We then propose a Prioritized Experience Replay based MATD3 (PER-MATD3) algorithm, which mitigates the sparse-reward problem in multi-robot cooperative pursuit by preferentially sampling higher-priority experience data when updating the networks. Simulation results show that the proposed PER-MATD3 algorithm reduces collisions among robots, collisions between robots and obstacles, and capture time by 60.97%, 68.42%, and 30.37%, respectively, compared with the baseline algorithms. Moreover, PER-MATD3 improves the capture success rate by 25.71% and converges faster in continuous decision-making than the baseline algorithms.
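The prioritized sampling idea described above can be sketched as follows. This is a minimal, illustrative proportional prioritized-replay buffer in Python, not the authors' implementation; the class name, hyperparameters (`alpha`, `beta`, `eps`), and transition format are assumptions. Transitions with larger TD errors receive higher priority and are sampled more often, with importance-sampling weights correcting the resulting bias.

```python
import random


class PrioritizedReplayBuffer:
    """Minimal proportional prioritized experience replay (illustrative
    sketch only). Priority of a transition is (|TD error| + eps)**alpha,
    and sampling probability is proportional to that priority."""

    def __init__(self, capacity, alpha=0.6, eps=1e-5):
        self.capacity = capacity
        self.alpha = alpha
        self.eps = eps
        self.data = []        # stored transitions
        self.priorities = []  # one priority per stored transition
        self.pos = 0          # ring-buffer write index

    def add(self, transition, td_error=1.0):
        priority = (abs(td_error) + self.eps) ** self.alpha
        if len(self.data) < self.capacity:
            self.data.append(transition)
            self.priorities.append(priority)
        else:
            # Overwrite the oldest transition once the buffer is full.
            self.data[self.pos] = transition
            self.priorities[self.pos] = priority
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size, beta=0.4):
        total = sum(self.priorities)
        probs = [p / total for p in self.priorities]
        idxs = random.choices(range(len(self.data)), weights=probs,
                              k=batch_size)
        # Importance-sampling weights correct the non-uniform sampling
        # bias; normalized by the max weight for stability.
        n = len(self.data)
        weights = [(n * probs[i]) ** (-beta) for i in idxs]
        max_w = max(weights)
        weights = [w / max_w for w in weights]
        batch = [self.data[i] for i in idxs]
        return batch, idxs, weights

    def update_priorities(self, idxs, td_errors):
        # Called after a learning step with the transitions' new TD errors.
        for i, err in zip(idxs, td_errors):
            self.priorities[i] = (abs(err) + self.eps) ** self.alpha
```

In a MATD3-style training loop, the critics' TD errors from each update would be fed back via `update_priorities`, so that poorly predicted (and typically rare, reward-bearing) transitions are replayed more frequently.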


Published in
          CCEAI '24: Proceedings of the 2024 8th International Conference on Control Engineering and Artificial Intelligence
          January 2024
          297 pages
          ISBN:9798400707971
          DOI:10.1145/3640824

          Copyright © 2024 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

          Publisher

          Association for Computing Machinery

          New York, NY, United States
