research-article

Multi-Robot Cooperative Pursuit-Evasion Control: A Deep Reinforcement Learning Approach based on Prioritized Experience Replay

Authors:
Wei Li, School of Artificial Intelligence, Henan University, China (ORCID 0000-0002-1074-3241)
Wenhao Yan, School of Artificial Intelligence, Henan University, China (ORCID 0009-0005-4407-7856)
Huaguang Shi, School of Artificial Intelligence, Henan University, China (ORCID 0000-0002-5984-4588)
Si Li, School of Artificial Intelligence, Henan University, China (ORCID 0009-0004-5350-7283)
Yi Zhou, School of Artificial Intelligence, Henan University, China (ORCID 0000-0001-7657-6100)

CCEAI '24: Proceedings of the 2024 8th International Conference on Control Engineering and Artificial Intelligence, January 2024, Pages 120–127. https://doi.org/10.1145/3640824.3640843

Published: 08 March 2024

ABSTRACT

Cooperative pursuit systems based on traditional model control rules are less adaptable and less robust in complex dynamic environments. In this paper, we study cooperative pursuit-evasion with collision avoidance in multi-robot systems. We first adopt the Multi-Agent Twin Delayed Deep Deterministic policy gradient (MATD3) algorithm and design a cooperative pursuit framework that uses the information of multiple robots during learning to predict more accurately the actions that robots will take. We then propose a Prioritized Experience Replay based MATD3 (PER-MATD3) algorithm, which addresses the sparse-reward problem in multi-robot cooperative pursuit by preferentially sampling higher-priority experience data to update the networks. Simulation results show that, compared with the baseline algorithms, the proposed PER-MATD3 algorithm reduces collisions among robots, collisions between robots and obstacles, and capture time by 60.97%, 68.42%, and 30.37%, respectively. Moreover, PER-MATD3 improves the capture success rate by 25.71% and converges faster in continuous decision-making than the baseline algorithms.
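
The abstract does not specify which prioritization scheme PER-MATD3 uses. As a rough illustration of the general idea only, the sketch below assumes the commonly used proportional variant of prioritized experience replay: each stored transition has priority |TD error| + eps, is sampled with probability proportional to priority^alpha, and receives an importance-sampling weight (N * P(i))^(-beta), normalized by the batch maximum. The class name, method names, and default values of alpha, beta, and eps are hypothetical and are not taken from the paper.

```python
import random


class PrioritizedReplayBuffer:
    """Proportional prioritized replay (illustrative sketch, not the paper's code)."""

    def __init__(self, capacity, alpha=0.6, eps=1e-6):
        self.capacity = capacity
        self.alpha = alpha    # 0 gives uniform sampling; larger values skew toward high priority
        self.eps = eps        # keeps every priority strictly positive
        self.data = []        # stored transitions
        self.priorities = []  # one priority per stored transition
        self.pos = 0          # next slot to overwrite once the buffer is full

    def add(self, transition):
        # New transitions get the current maximum priority so they are
        # likely to be sampled soon after being stored.
        p = max(self.priorities, default=1.0)
        if len(self.data) < self.capacity:
            self.data.append(transition)
            self.priorities.append(p)
        else:
            self.data[self.pos] = transition
            self.priorities[self.pos] = p
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size, beta=0.4):
        # Sampling probability P(i) = p_i^alpha / sum_k p_k^alpha
        scaled = [p ** self.alpha for p in self.priorities]
        total = sum(scaled)
        probs = [s / total for s in scaled]
        idxs = random.choices(range(len(self.data)), weights=probs, k=batch_size)
        # Importance-sampling weights offset the bias of non-uniform sampling
        n = len(self.data)
        weights = [(n * probs[i]) ** (-beta) for i in idxs]
        max_w = max(weights)
        weights = [w / max_w for w in weights]
        batch = [self.data[i] for i in idxs]
        return batch, idxs, weights

    def update_priorities(self, idxs, td_errors):
        # After a critic update, refresh priorities with the new TD errors
        for i, delta in zip(idxs, td_errors):
            self.priorities[i] = abs(delta) + self.eps
```

In a training loop one would call `add` for every environment step, draw a batch with `sample`, scale each sample's critic loss by the returned weight, and then pass the resulting TD errors to `update_priorities`. With sparse rewards, the rare transitions that carry a large TD error (for example, a capture or a collision) are replayed far more often than under uniform sampling.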

Index Terms

1. Computing methodologies
  1. Artificial intelligence
    1. Control methods
      1. Robotic planning
  2. Machine learning
    1. Machine learning approaches
      1. Partially-observable Markov decision processes
2. Theory of computation
  1. Theory and algorithms for application domains
    1. Machine learning theory
      1. Reinforcement learning
        1. Multi-agent reinforcement learning

Published in
CCEAI '24: Proceedings of the 2024 8th International Conference on Control Engineering and Artificial Intelligence
January 2024
297 pages
ISBN: 9798400707971
DOI: 10.1145/3640824
Editors: Wenqiang Zhang, Yong Yue, Marek Ogiela
Copyright © 2024 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Multi-robot cooperative pursuit
deep reinforcement learning
prioritized experience replay
Qualifiers
- research-article
- Research
- Refereed limited
