ABSTRACT
This work proposes a method for inferring the internal mechanisms of individual agents from observed collective behaviours using multi-agent reinforcement learning (MARL). Because the emergence of group behaviour among many agents can undergo phase transitions, and the action space is not in general smooth, natural evolution strategies were adopted to update the policy function. We tested the approach using a well-known flocking algorithm as the target model for our system to learn. The MARL model was trained on data generated by this rule-based model, and its acquired behaviour was compared with the original. In the process, we found that agents trained by MARL can self-organise flow patterns using only local information. The expressed pattern is robust to changes in the agents' initial positions, whilst being sensitive to the training conditions used.
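The abstract states that natural evolution strategies (NES) were used to update the policy function in place of gradient-based RL. As a rough illustration only (not the authors' implementation; the rank-normalisation and all hyperparameters here are assumptions), a single NES update step on a policy parameter vector might be sketched as:

```python
import numpy as np

def nes_update(theta, fitness_fn, pop_size=50, sigma=0.1, lr=0.05, rng=None):
    """One natural-evolution-strategies step on policy parameters theta.

    Perturbs theta with Gaussian noise, evaluates a scalar fitness for each
    perturbation, and moves theta along a rank-weighted combination of the
    noise directions. Hyperparameters are illustrative, not from the paper.
    """
    rng = rng if rng is not None else np.random.default_rng(0)
    noise = rng.standard_normal((pop_size, theta.size))
    rewards = np.array([fitness_fn(theta + sigma * n) for n in noise])
    # Rank-normalise rewards so the update is invariant to reward scaling,
    # which matters when collective behaviour changes abruptly (phase transitions).
    ranks = rewards.argsort().argsort()
    weights = ranks / (pop_size - 1) - 0.5
    grad_estimate = weights @ noise / (pop_size * sigma)
    return theta + lr * grad_estimate
```

In a flocking setting, `fitness_fn` would roll out the candidate policy for all agents and score how closely the resulting collective behaviour matches the target (rule-based) model; because only episode-level scores are needed, no differentiable reward is required.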
Index Terms
- Learning how to flock: deriving individual behaviour from collective behaviour with multi-agent reinforcement learning and natural evolution strategies