Abstract
Multi-agent reinforcement learning (MARL) has very high sample complexity, which leads to slow learning. In repeated social dilemma games such as the Public Goods Game (PGG) and the Fruit Gathering Game (FGG), MARL sustains cooperation poorly because of the non-stationarity of the agents and the environment and the large sample complexity. Motivated by the fact that humans learn not only through their own actions (organic learning) but also by following the actions of other humans (social learning), who themselves continuously learn about the environment, we address this challenge by augmenting RL-based models with a notion of collaboration among agents. In particular, we propose Collaborative-Reinforcement-Learning (CRL), in which agents collaborate by observing and following other agents’ actions and decisions. The CRL model significantly influences the speed of individual learning, which in turn affects collective behavior compared with RL-only models, and thereby effectively explains the sustainability of cooperation in repeated PGG settings. We also extend the CRL model to PGGs played over different generations, where agents die and new agents are born following a birth-death process. Finally, extending the proposed CRL model, we propose the Collaborative Deep RL Network (CDQN) for a team-based game (FGG); the experimental results confirm that agents following CDQN learn faster and collect more fruits.
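To make the organic versus social learning distinction concrete, the following is a minimal sketch of a repeated PGG in which each agent either copies a randomly observed peer's previous action or falls back on its own epsilon-greedy value estimates. It is not the authors' CRL algorithm, whose details are not given in the abstract. The payoff rule is the standard linear public goods payoff; the function names, the binary contribute/defect action set, and the parameters `follow_prob`, `epsilon`, `lr`, `endowment`, and `multiplier` are all illustrative assumptions.

```python
import random


def pgg_payoffs(contributions, endowment=1.0, multiplier=1.6):
    """Standard linear public goods payoffs for one round.

    Each agent keeps what it did not contribute and receives an equal
    share of the multiplied common pool. Parameter values are assumptions.
    """
    n = len(contributions)
    share = multiplier * sum(contributions) / n
    return [endowment - c + share for c in contributions]


def choose_action(q_values, peer_action, follow_prob, epsilon, rng):
    """Return 0 (defect) or 1 (contribute).

    Hypothetical rule: with probability follow_prob copy the observed
    peer (social learning); otherwise act epsilon-greedily on the agent's
    own value estimates (organic learning).
    """
    if peer_action is not None and rng.random() < follow_prob:
        return peer_action
    if rng.random() < epsilon:
        return rng.randrange(2)
    return max((0, 1), key=lambda a: q_values[a])


def simulate(n_agents=4, rounds=200, follow_prob=0.3,
             epsilon=0.1, lr=0.1, seed=0):
    """Run a repeated PGG and return the per-round contribution rate."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_agents)]  # one value per action
    last = [None] * n_agents                   # previous-round actions
    history = []
    for _ in range(rounds):
        actions = []
        for i in range(n_agents):
            # Observe one other agent chosen uniformly at random.
            peer = rng.choice([j for j in range(n_agents) if j != i])
            actions.append(
                choose_action(q[i], last[peer], follow_prob, epsilon, rng)
            )
        rewards = pgg_payoffs([float(a) for a in actions])
        for i, (a, r) in enumerate(zip(actions, rewards)):
            # Stateless running-average value update for the chosen action.
            q[i][a] += lr * (r - q[i][a])
        last = actions
        history.append(sum(actions) / n_agents)
    return history
```

Calling `simulate(follow_prob=0.0)` gives a purely organic baseline to compare against runs with `follow_prob > 0`; the sketch makes no claim about which setting sustains more cooperation.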
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Chaudhuri, R., Mukherjee, K., Narayanam, R., Vallam, R.D. (2021). Collaborative Reinforcement Learning Framework to Model Evolution of Cooperation in Sequential Social Dilemmas. In: Karlapalem, K., et al. Advances in Knowledge Discovery and Data Mining. PAKDD 2021. Lecture Notes in Computer Science, vol 12712. Springer, Cham. https://doi.org/10.1007/978-3-030-75762-5_2
DOI: https://doi.org/10.1007/978-3-030-75762-5_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-75761-8
Online ISBN: 978-3-030-75762-5
eBook Packages: Computer Science (R0)