ABSTRACT
To cope with large scale, agents are usually organized in a network such that each agent interacts only with its immediate neighbors. Reinforcement learning techniques are commonly used to optimize agents' local policies in such networks because they require little domain knowledge and can be fully distributed. However, all previous work assumed the underlying network was fixed throughout the learning process. This assumption matters because the underlying network defines each agent's learning context: in particular, the action set and the state space of each agent are defined in terms of the agent's neighbors. If agents dynamically change the underlying network structure during learning (a process also called self-organization), then a mechanism is needed for transferring what agents have already learned in the old network structure to their new learning context in the new structure.
In this work we develop a novel self-organization mechanism that not only allows agents to reorganize the underlying network during the learning process, but also uses information from learning to guide the self-organization. To our knowledge, our work is the first to study this interaction between learning and self-organization. Our mechanism uses heuristics to transfer learned knowledge across the different steps of self-organization. We also present a more restricted version of the mechanism that is computationally less expensive yet still achieves good performance. We use a simplified version of the distributed task allocation domain as our case study. Experimental results verify the stability of our approach and show a monotonic improvement in the performance of the learning process due to self-organization.
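To make the transfer problem concrete, the following is a minimal sketch (not the paper's actual mechanism) of an agent whose actions are "forward a task to neighbor". When self-organization replaces a neighbor, the learned values over the old action set must be mapped into the new one; the heuristic shown here, seeding the new neighbor's value with the mean over the remaining neighbors, is an illustrative assumption, as are all class and method names.

```python
# Illustrative sketch of knowledge transfer under self-organization.
# A stateless agent learns one Q-value per neighbor (its action set).
# Replacing a neighbor changes the action set, so the learned values
# must be carried over; the mean-seeding heuristic is an assumption
# made for illustration, not the mechanism from the paper.

class SelfOrganizingAgent:
    def __init__(self, neighbors, alpha=0.1):
        self.alpha = alpha                      # learning rate
        self.q = {n: 0.0 for n in neighbors}    # one Q-value per neighbor action

    def update(self, neighbor, reward):
        """One-step Q-learning update for a stateless agent."""
        self.q[neighbor] += self.alpha * (reward - self.q[neighbor])

    def replace_neighbor(self, old, new):
        """Self-organization step: drop `old`, add `new`, and transfer
        learned knowledge by seeding the new action's Q-value with the
        mean of the surviving neighbors' values."""
        del self.q[old]
        seed = sum(self.q.values()) / len(self.q) if self.q else 0.0
        self.q[new] = seed

    def best_neighbor(self):
        """Learned policy: greedily pick the highest-valued neighbor."""
        return max(self.q, key=self.q.get)
```

Without the seeding step, a newly added neighbor would start at zero and be ignored by the greedy policy until exploration happens to try it; the heuristic instead gives it a plausible prior so learning in the new network structure starts from the old structure's knowledge rather than from scratch.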
Index Terms
- Multiagent reinforcement learning and self-organization in a network of agents