Abstract
In this work, we investigate the effects of centralized-learning, decentralized-execution algorithms on agent coordination in a modified version of the Level-Based Foraging environment. We show that with individual agent rewards, Level-Based Foraging behaves as a sequential social dilemma. Compared with previously reported results on Level-Based Foraging using joint rewards [13], we observe significantly faster convergence for algorithms that perform state-action value estimation: IQL, VDN, and QMIX.
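The core idea behind VDN [24], one of the value-estimation algorithms compared above, is that the joint action-value is represented as a sum of per-agent utilities, so each agent can act greedily on its own term at execution time. A minimal numerical sketch of this additive decomposition (a toy illustration, not code from the paper; the array values are arbitrary):

```python
import numpy as np

# Toy per-agent utilities: two agents, each with Q-values over 2 actions.
q1 = np.array([1.0, 3.0])   # agent 1
q2 = np.array([2.0, 0.5])   # agent 2

# VDN's additive decomposition: joint Q for every joint action (a1, a2)
# is the sum of the individual utilities, computed here via broadcasting.
joint_q = q1[:, None] + q2[None, :]

# The greedy joint action decomposes into independent per-agent argmaxes;
# this is what makes decentralized execution consistent with the
# centrally learned joint value.
best_joint = np.unravel_index(joint_q.argmax(), joint_q.shape)
assert best_joint == (q1.argmax(), q2.argmax())
```

QMIX [20] generalizes this by replacing the sum with a learned mixing network constrained to be monotonic in each agent's utility, which preserves the same argmax-decomposition property while representing a richer class of joint values.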
References
Beattie, C., Köppe, T., Duéñez-Guzmán, E.A., Leibo, J.Z.: DeepMind Lab2D. CoRR abs/2011.07027 (2020). https://arxiv.org/abs/2011.07027
Christianos, F., Schäfer, L., Albrecht, S.V.: Shared experience actor-critic for multi-agent reinforcement learning. In: Advances in Neural Information Processing Systems (NeurIPS) (2020)
Hughes, E., et al.: Inequity aversion resolves intertemporal social dilemmas. CoRR abs/1803.08884 (2018). http://arxiv.org/abs/1803.08884
Kollock, P.: Social dilemmas: the anatomy of cooperation. Ann. Rev. Sociol. 24(1), 183–214 (1998)
Köster, R., et al.: Model-free conventions in multi-agent reinforcement learning with heterogeneous preferences. arXiv preprint arXiv:2010.09054 (2020)
Leibo, J.Z., et al.: Scalable evaluation of multi-agent reinforcement learning with melting pot. In: International Conference on Machine Learning, pp. 6187–6199. PMLR (2021)
Leibo, J.Z., Zambaldi, V.F., Lanctot, M., Marecki, J., Graepel, T.: Multi-agent reinforcement learning in sequential social dilemmas. CoRR abs/1702.03037 (2017). http://arxiv.org/abs/1702.03037
Lowe, R., Wu, Y.I., Tamar, A., Harb, J., Pieter Abbeel, O., Mordatch, I.: Multi-agent actor-critic for mixed cooperative-competitive environments. In: Advances in Neural Information Processing Systems 30 (2017)
Mnih, V., et al.: Asynchronous methods for deep reinforcement learning. In: International Conference on Machine Learning, pp. 1928–1937. PMLR (2016)
Mnih, V., et al.: Human-level control through deep reinforcement learning. Nature 518(7540), 529–533 (2015)
Oliehoek, F.A., Amato, C.: The Decentralized POMDP Framework, pp. 11–32. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-28929-8_2
OpenAI: OpenAI five (2018). https://blog.openai.com/openai-five/
Papoudakis, G., Christianos, F., Schäfer, L., Albrecht, S.V.: Benchmarking multi-agent deep reinforcement learning algorithms in cooperative tasks. CoRR abs/2006.07869 (2020). https://arxiv.org/abs/2006.07869
Rashid, T., Samvelyan, M., de Witt, C.S., Farquhar, G., Foerster, J.N., Whiteson, S.: QMIX: monotonic value function factorisation for deep multi-agent reinforcement learning. CoRR abs/1803.11485 (2018). http://arxiv.org/abs/1803.11485
Samvelyan, M., et al.: The StarCraft Multi-Agent Challenge. CoRR abs/1902.04043 (2019)
Schulman, J., Wolski, F., Dhariwal, P., Radford, A., Klimov, O.: Proximal policy optimization algorithms. CoRR abs/1707.06347 (2017). http://arxiv.org/abs/1707.06347
Son, K., Kim, D., Kang, W.J., Hostallero, D., Yi, Y.: QTRAN: learning to factorize with transformation for cooperative multi-agent reinforcement learning. CoRR abs/1905.05408 (2019). http://arxiv.org/abs/1905.05408
Sunehag, P., et al.: Value-decomposition networks for cooperative multi-agent learning. arXiv preprint arXiv:1706.05296 (2017)
Vinyals, O., et al.: Grandmaster level in StarCraft II using multi-agent reinforcement learning. Nature 575(7782), 350–354 (2019). https://doi.org/10.1038/s41586-019-1724-z
Yu, C., Velu, A., Vinitsky, E., Wang, Y., Bayen, A.M., Wu, Y.: The surprising effectiveness of MAPPO in cooperative, multi-agent games. CoRR abs/2103.01955 (2021). https://arxiv.org/abs/2103.01955
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Atrazhev, P., Musilek, P. (2022). Investigating Effects of Centralized Learning Decentralized Execution on Team Coordination in the Level Based Foraging Environment as a Sequential Social Dilemma. In: Dignum, F., Mathieu, P., Corchado, J.M., De La Prieta, F. (eds) Advances in Practical Applications of Agents, Multi-Agent Systems, and Complex Systems Simulation. The PAAMS Collection. PAAMS 2022. Lecture Notes in Computer Science(), vol 13616. Springer, Cham. https://doi.org/10.1007/978-3-031-18192-4_2
Print ISBN: 978-3-031-18191-7
Online ISBN: 978-3-031-18192-4