Abstract
Multi-agent reinforcement learning is effective for tasks that require cooperation among different individuals, and communication plays an important role in enhancing cooperation among agents in scalable and unstable environments. However, challenges remain because some communicated information may fail to facilitate cooperation or may even have a negative effect. Thus, how to extract information that is useful for cooperation among agents is a critical issue. In this paper, we propose a multi-agent reinforcement learning algorithm with cognition differences and consistent representation (CDCR). Criteria for cognition differences are formulated to explore the information possessed by different agents, helping each agent better understand the others. We further train a cognition encoding network to obtain a globally consistent cognition representation for each agent, which is then used to realize cognitive consistency of the agents with respect to the environment. To validate the effectiveness of CDCR, we carry out experiments in the Predator-Prey and StarCraft II environments. The results in Predator-Prey demonstrate that the proposed cognition differences achieve effective communication among agents; the results in StarCraft II demonstrate that considering both cognition differences and consistent representation increases the test win rate of the baseline algorithm by 29% in the best case, and ablation studies further demonstrate the positive roles played by the proposed strategies.
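The paper defines the precise criteria and network architecture; purely as an illustration of the two ideas in the abstract, the sketch below measures pairwise "cognition differences" between agents' latent cognition vectors and averages the cognitions of sufficiently different agents into a consistent representation. The function names, the L2 distance, and the thresholding rule are all assumptions for this sketch, not the authors' actual formulation.

```python
import numpy as np

def cognition_differences(cognitions, threshold=0.5):
    """Pairwise L2 distances between agents' cognition vectors.

    cognitions: (n_agents, dim) array of per-agent latent cognitions.
    Returns the distance matrix and a mask marking agent pairs whose
    difference exceeds the threshold (i.e. pairs worth communicating).
    """
    diffs = np.linalg.norm(
        cognitions[:, None, :] - cognitions[None, :, :], axis=-1
    )
    comm_mask = diffs > threshold
    np.fill_diagonal(comm_mask, False)  # an agent never messages itself
    return diffs, comm_mask

def consistent_representation(cognitions, comm_mask):
    """Average each agent's cognition with those it communicates with,
    pulling the group toward a shared, consistent view."""
    reps = []
    for i in range(len(cognitions)):
        group = np.vstack([cognitions[i : i + 1], cognitions[comm_mask[i]]])
        reps.append(group.mean(axis=0))
    return np.stack(reps)
```

In the paper this aggregation is learned (a cognition encoding network) rather than a fixed mean; the sketch only shows the data flow from per-agent cognitions through difference-gated communication to a shared representation.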
Acknowledgements
This work was supported by the National Key Research and Development Program of China (2018YFB1600600), the National Natural Science Foundation of China (Grants 61976034, 61572104, and U1808206), and the Dalian Science and Technology Innovation Fund (2019J12GX035).
Cite this article
Ge, H., Ge, Z., Sun, L. et al. Enhancing cooperation by cognition differences and consistent representation in multi-agent reinforcement learning. Appl Intell 52, 9701–9716 (2022). https://doi.org/10.1007/s10489-021-02873-7