
Learning controlled and targeted communication with the centralized critic for the multi-agent system

Published in Applied Intelligence 53, 14819–14837 (2023)

Abstract

Multi-agent deep reinforcement learning (MDRL) has attracted attention for solving complex cooperative tasks. Two of its main challenges are non-stationarity and partial observability from the perspective of individual agents, both of which hinder the learning of cooperative policies. In this study, Controlled and Targeted Communication with the Centralized Critic (COTAC) is proposed, establishing a paradigm of centralized learning and decentralized execution with partial communication. COTAC decouples how the multi-agent system obtains environmental information during training from how it does so during execution: the environment faced by the agents is kept stationary in the training phase, while learned partial communication overcomes the limitation of partial observability in the execution phase. On this basis, decentralized actors learn controlled and targeted communication together with policies that are optimized by centralized critics during training. As a result, agents learn both when to communicate as senders and how to aggregate information in a targeted manner as receivers. COTAC is evaluated on two multi-agent scenarios with continuous spaces. Experimental results demonstrate that only agents holding important information choose to send messages, and receivers aggregate what they receive in a targeted way by identifying the relevant important information, yielding better cooperation performance while reducing the communication traffic of the system.
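
To make the approach concrete, the sketch below illustrates the two communication ideas described in the abstract: a learned gate that decides when an agent sends a message (controlled communication) and attention-based aggregation over received messages that decides which messages matter (targeted communication). This is a minimal, hypothetical sketch, not the authors' implementation; the class name CommActor, the layer sizes, the straight-through gate, and the scaled dot-product attention are all illustrative assumptions.

```python
# Hypothetical sketch of controlled (gated sending) and targeted (attention-based
# receiving) communication for decentralized actors. Names and sizes are illustrative;
# this is not the COTAC authors' code.
import torch
import torch.nn as nn


class CommActor(nn.Module):
    def __init__(self, obs_dim: int, msg_dim: int, act_dim: int):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU())
        self.msg_head = nn.Linear(64, msg_dim)   # message content
        self.gate_head = nn.Linear(64, 1)        # send / don't send
        self.query = nn.Linear(64, msg_dim)      # attention query for receiving
        self.policy = nn.Linear(64 + msg_dim, act_dim)

    def send(self, obs: torch.Tensor):
        """Encode a local observation and decide whether to broadcast a message."""
        h = self.encoder(obs)
        msg = self.msg_head(h)
        p = torch.sigmoid(self.gate_head(h))
        # Hard 0/1 gate with a straight-through estimator so it stays trainable.
        gate = (p > 0.5).float() + p - p.detach()
        return h, gate * msg, gate

    def act(self, h: torch.Tensor, msgs: torch.Tensor, gates: torch.Tensor):
        """Attend over the messages of agents that chose to send, then act."""
        q = self.query(h)                                  # (msg_dim,)
        scores = msgs @ q / msgs.shape[-1] ** 0.5          # (n_senders,)
        scores = scores.masked_fill(gates.squeeze(-1) < 0.5, float("-inf"))
        attn = torch.softmax(scores, dim=-1)
        attn = torch.nan_to_num(attn)                      # all gates closed -> zero message
        aggregated = attn @ msgs                           # (msg_dim,)
        return self.policy(torch.cat([h, aggregated], dim=-1))


# Toy usage with three agents sharing one actor (parameter sharing is an assumption).
actor = CommActor(obs_dim=8, msg_dim=16, act_dim=4)
observations = torch.randn(3, 8)
hs, msgs, gates = zip(*(actor.send(o) for o in observations))
actions = [actor.act(h, torch.stack(msgs), torch.stack(gates)) for h in hs]
```

During training, such actors would be optimized together with centralized critics that see joint information, matching the centralized-learning, decentralized-execution paradigm the abstract describes; only the decentralized execution path is sketched here.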


Data Availability

The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.


Acknowledgments

This work was supported in part by the National Natural Science Foundation of China (Nos. 61876151 and 62032018) and the Fundamental Research Funds for the Central Universities (No. 3102019DX1005).

Author information

Corresponding author

Correspondence to Yuan Yao.

Ethics declarations

Conflict of Interest

The authors declare that they have no conflict of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Sun, Q., Yao, Y., Yi, P. et al. Learning controlled and targeted communication with the centralized critic for the multi-agent system. Appl Intell 53, 14819–14837 (2023). https://doi.org/10.1007/s10489-022-04225-5

