
GAMA: Graph Attention Multi-agent reinforcement learning algorithm for cooperation


Abstract

Multi-agent reinforcement learning (MARL) is an important route to multi-agent cooperation, but challenges such as limited scalability and environmental uncertainty still restrict its application. In this paper, we address these problems with a graph network and an attention mechanism, extending an existing algorithm into a new algorithm called GAMA. Through the graph network, environment information is shared among agents, while the attention mechanism filters out unimportant information, improving communication efficiency. As a result, GAMA achieves the highest mean episode rewards compared to the baselines, along with excellent scalability. We chose the graph network because understanding the relationships among agents is key to solving multi-agent problems, and graph networks are well suited to expressing relational inductive biases. In our experiments, integrating the attention mechanism allowed agents to infer their relationships and focus on the most influential environmental factors.
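To make the communication mechanism concrete, here is a minimal sketch of a single-head graph-attention layer over a fully connected agent graph, written in PyTorch. It illustrates the general technique the abstract describes (scaled dot-product attention weighting the messages agents exchange), not the authors' actual implementation; the class and parameter names (GraphAttentionComm, obs_dim, hidden_dim) are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GraphAttentionComm(nn.Module):
    """Hypothetical single-head graph-attention layer over n agents.

    A sketch of the idea in the abstract: each agent encodes its
    observation, then attends over all agents' encodings so that
    low-weight (unimportant) messages are effectively filtered out.
    """

    def __init__(self, obs_dim: int, hidden_dim: int):
        super().__init__()
        self.encode = nn.Linear(obs_dim, hidden_dim)        # per-agent encoder
        self.query = nn.Linear(hidden_dim, hidden_dim, bias=False)
        self.key = nn.Linear(hidden_dim, hidden_dim, bias=False)
        self.value = nn.Linear(hidden_dim, hidden_dim, bias=False)
        self.scale = hidden_dim ** -0.5                     # dot-product scaling

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        # obs: (n_agents, obs_dim) -> aggregated messages: (n_agents, hidden_dim)
        h = F.relu(self.encode(obs))
        q, k, v = self.query(h), self.key(h), self.value(h)
        scores = (q @ k.t()) * self.scale                   # (n_agents, n_agents)
        weights = F.softmax(scores, dim=-1)                 # each row sums to 1
        return weights @ v                                  # attention-weighted mix

# Usage: 5 agents, each with a 16-dimensional observation.
comm = GraphAttentionComm(obs_dim=16, hidden_dim=32)
messages = comm(torch.randn(5, 16))
print(messages.shape)  # torch.Size([5, 32])
```

Because each row of the attention matrix sums to one, an agent that assigns near-zero weight to a neighbour effectively ignores that neighbour's message; this is the filtering effect credited with improving communication efficiency.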



Author information


Correspondence to Yadong Liu.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Chen, H., Liu, Y., Zhou, Z. et al. GAMA: Graph Attention Multi-agent reinforcement learning algorithm for cooperation. Appl Intell 50, 4195–4205 (2020). https://doi.org/10.1007/s10489-020-01755-8


