DOI: 10.1145/3205326.3205346

Learning to Communicate via Supervised Attentional Message Processing

Published: 21 May 2018

ABSTRACT

Many tasks in AI require the collaboration of multiple agents. Generally, these agents cooperate with each other through message-passing communication. However, agents may be overwhelmed by the large number of received messages and have difficulty extracting useful information from them. To this end, we use an attention-based message processing (AMP) method to model agents' interactions by considering the relevance of each received message. To improve the efficiency of learning correct interactions, a supervised variant, SAMP, is then proposed to directly optimize the attentional weights in AMP with a target auxiliary interaction matrix from the environment. The empirical results demonstrate that our proposal outperforms other competing multi-agent methods in the "predator-prey-toxin" domain, and prove the superiority of SAMP in correctly guiding the optimization of attentional weights in AMP.
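To make the two ideas in the abstract concrete, the sketch below shows (a) attention-weighted aggregation of received messages and (b) a supervised auxiliary loss that pushes the attention weights toward a target interaction matrix supplied by the environment. This is a minimal illustration assuming a dot-product attention formulation and a cross-entropy auxiliary loss; the module and function names, layer sizes, and the exact form of the loss are our assumptions, not the paper's published architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class AttentionalMessageProcessing(nn.Module):
    """Sketch of attention over received messages (AMP-style).

    Each agent scores every received message against its own hidden state
    and aggregates the messages with the resulting attention weights.
    """

    def __init__(self, hidden_dim: int, msg_dim: int):
        super().__init__()
        self.query = nn.Linear(hidden_dim, msg_dim)  # project own state to a query
        self.key = nn.Linear(msg_dim, msg_dim)       # project messages to keys

    def forward(self, own_state, messages):
        # own_state: (batch, hidden_dim); messages: (batch, n_agents, msg_dim)
        q = self.query(own_state).unsqueeze(1)               # (batch, 1, msg_dim)
        k = self.key(messages)                               # (batch, n_agents, msg_dim)
        scores = (q * k).sum(-1) / k.size(-1) ** 0.5         # relevance of each message
        attn = F.softmax(scores, dim=-1)                     # attention weights over senders
        aggregated = (attn.unsqueeze(-1) * messages).sum(1)  # weighted message summary
        return aggregated, attn


def samp_attention_loss(attn, target_interaction):
    """Supervised auxiliary loss (SAMP-style): align the attention weights
    with a target interaction matrix from the environment, e.g. which
    agents are actually relevant to the receiver. Cross-entropy against the
    row-normalized target is one natural choice."""
    target = target_interaction / target_interaction.sum(-1, keepdim=True).clamp(min=1e-8)
    return -(target * torch.log(attn + 1e-8)).sum(-1).mean()
```

In a training loop, the auxiliary loss would typically be added to the usual reinforcement-learning objective with a weighting coefficient, so that the attention weights are shaped by the target interaction matrix while the policy is learned end to end.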


Published in

CASA 2018: Proceedings of the 31st International Conference on Computer Animation and Social Agents
May 2018
101 pages
ISBN: 9781450363761
DOI: 10.1145/3205326

          Copyright © 2018 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Publication History

          • Published: 21 May 2018


          Qualifiers

          • research-article
          • Research
          • Refereed limited

          Acceptance Rates

CASA 2018 Paper Acceptance Rate: 18 of 110 submissions, 16%. Overall Acceptance Rate: 18 of 110 submissions, 16%.
