ABSTRACT
Cooperative pursuit systems built on traditional model-based control rules adapt poorly to complex dynamic environments and lack robustness in them. In this paper, we study cooperative pursuit-evasion with collision avoidance in multi-robot systems. We first adopt the Multi-Agent Twin Delayed Deep Deterministic policy gradient (MATD3) algorithm and design a cooperative pursuit framework that uses information from multiple robots during learning to predict more accurately the actions the robots will take. We then propose a Prioritized Experience Replay based MATD3 (PER-MATD3) algorithm, which addresses the sparse-reward problem in multi-robot cooperative pursuit by preferentially sampling higher-priority experience data when updating the networks. Simulation results show that, compared with the baseline algorithms, PER-MATD3 reduces collisions among robots, collisions between robots and obstacles, and capture time by 60.97%, 68.42%, and 30.37%, respectively. Moreover, PER-MATD3 improves the capture success rate by 25.71% and converges faster in continuous decision-making than the baseline algorithms.
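The prioritized sampling idea mentioned in the abstract can be illustrated with a small sketch. The abstract does not say which prioritization scheme PER-MATD3 uses, so the code below assumes the common proportional variant, in which a transition's sampling probability grows with the magnitude of its temporal-difference error. The class name, the method names, and the values of `alpha`, `beta`, and `eps` are hypothetical choices made for illustration and are not taken from the paper.

```python
import random


class PrioritizedReplayBuffer:
    """Proportional prioritized replay (illustrative sketch, not the paper's code)."""

    def __init__(self, capacity, alpha=0.6, eps=1e-6):
        self.capacity = capacity
        self.alpha = alpha   # assumed value: how strongly priorities skew sampling
        self.eps = eps       # assumed value: keeps every priority strictly positive
        self.data = []
        self.priorities = []
        self.pos = 0         # next slot to overwrite once the buffer is full

    def add(self, transition):
        # New transitions receive the current maximum priority so that
        # each one has a good chance of being replayed at least once.
        p = max(self.priorities, default=1.0)
        if len(self.data) < self.capacity:
            self.data.append(transition)
            self.priorities.append(p)
        else:
            self.data[self.pos] = transition
            self.priorities[self.pos] = p
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size, beta=0.4):
        # Sampling probability is proportional to priority ** alpha.
        scaled = [p ** self.alpha for p in self.priorities]
        total = sum(scaled)
        probs = [s / total for s in scaled]
        idxs = random.choices(range(len(self.data)), weights=probs, k=batch_size)
        # Importance-sampling weights correct the bias of non-uniform sampling;
        # they are normalized by the largest weight in the batch.
        n = len(self.data)
        weights = [(n * probs[i]) ** (-beta) for i in idxs]
        max_w = max(weights)
        weights = [w / max_w for w in weights]
        batch = [self.data[i] for i in idxs]
        return idxs, batch, weights

    def update_priorities(self, idxs, td_errors):
        # After a critic update, refresh priorities with the new |TD error|.
        for i, err in zip(idxs, td_errors):
            self.priorities[i] = abs(err) + self.eps
```

In a multi-agent setting, each stored transition would typically hold the joint observations, actions, and rewards of all robots. Under sparse rewards, the rare transitions that carry a reward signal tend to produce large TD errors, so a buffer of this kind replays them more often than uniform sampling would.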
Index Terms
- Multi-Robot Cooperative Pursuit-Evasion Control: A Deep Reinforcement Learning Approach based on Prioritized Experience Replay