Collaborative Reinforcement Learning Framework to Model Evolution of Cooperation in Sequential Social Dilemmas

Chaudhuri, Ritwik; Mukherjee, Kushal; Narayanam, Ramasuri; Vallam, Rohith D.

doi:10.1007/978-3-030-75762-5_2

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12712)

Included in the following conference series:

Pacific-Asia Conference on Knowledge Discovery and Data Mining

Abstract

Multi-agent reinforcement learning (MARL) has very high sample complexity, which leads to slow learning. For repeated social dilemma games, e.g. the Public Goods Game (PGG) and the Fruit Gathering Game (FGG), MARL exhibits low sustainability of cooperation due to the non-stationarity of the agents and the environment, and due to the large sample complexity. Motivated by the fact that humans learn not only through their own actions (organic learning) but also by following the actions of other humans (social learning), who also continuously learn about the environment, we address this challenge by augmenting RL-based models with a notion of collaboration among agents. In particular, we propose Collaborative Reinforcement Learning (CRL), where agents collaborate by observing and following other agents' actions/decisions. The CRL model significantly influences the speed of individual learning, which affects the collective behavior as compared to RL-only models, and thereby effectively explains the sustainability of cooperation in repeated PGG settings. We also extend the CRL model to PGGs played over different generations, where agents die and new agents are born following a birth-death process. Further extending the proposed CRL model, we propose a Collaborative Deep RL Network (CDQN) for a team-based game (FGG); the experimental results confirm that agents following CDQN learn faster and collect more fruits.
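The abstract does not spell out the CRL update rule, so the following is only a minimal illustrative sketch of the general idea: a repeated Public Goods Game in which each agent either acts on its own learned values (organic learning) or, with some probability, copies the last action of a randomly observed peer (social learning). The linear PGG payoff is the standard textbook form. Everything else is an assumption made for illustration and is not taken from the paper: the names `pgg_payoffs`, `SocialQAgent`, and `simulate`, the parameters `multiplier`, `follow_prob`, `alpha`, and `epsilon`, and the copy-a-peer rule itself.

```python
import random

def pgg_payoffs(contributions, endowment=1.0, multiplier=1.6):
    """One round of a linear Public Goods Game: the pot is multiplied and split equally."""
    n = len(contributions)
    share = multiplier * sum(contributions) / n
    return [endowment - c + share for c in contributions]

class SocialQAgent:
    """Stateless Q-learner that sometimes copies a peer (hypothetical rule, not the paper's)."""

    def __init__(self, alpha=0.1, epsilon=0.1, follow_prob=0.3, rng=None):
        self.q = [0.0, 0.0]  # action 0 = defect, action 1 = contribute
        self.alpha = alpha
        self.epsilon = epsilon
        self.follow_prob = follow_prob
        self.rng = rng if rng is not None else random.Random()

    def act(self, peer_action=None):
        # Social learning (assumed): follow the observed peer with probability follow_prob.
        if peer_action is not None and self.rng.random() < self.follow_prob:
            return peer_action
        # Organic learning: epsilon-greedy over the agent's own value estimates.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(2)
        return 0 if self.q[0] >= self.q[1] else 1

    def update(self, action, reward):
        # Incremental average of the reward received for the action actually played.
        self.q[action] += self.alpha * (reward - self.q[action])

def simulate(n_agents=4, rounds=50, seed=0, **agent_kwargs):
    """Return the fraction of contributors in each round."""
    rng = random.Random(seed)
    agents = [SocialQAgent(rng=rng, **agent_kwargs) for _ in range(n_agents)]
    last = [None] * n_agents  # previous-round actions, visible to peers
    rates = []
    for _ in range(rounds):
        actions = []
        for i, agent in enumerate(agents):
            peer = rng.choice([j for j in range(n_agents) if j != i])
            actions.append(agent.act(last[peer]))
        for agent, action, reward in zip(agents, actions, pgg_payoffs(actions)):
            agent.update(action, reward)
        last = actions
        rates.append(sum(actions) / n_agents)
    return rates
```

Calling `simulate(follow_prob=0.0)` gives an RL-only baseline under the same assumptions, which can be compared against runs with a positive `follow_prob`; no particular outcome is implied by this sketch.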

Author information

Authors and Affiliations

IBM Research, Bangalore, India
Ritwik Chaudhuri, Ramasuri Narayanam & Rohith D. Vallam
IBM Research, New Delhi, India
Kushal Mukherjee

Corresponding author

Correspondence to Ritwik Chaudhuri.

Editor information

Editors and Affiliations

IIIT Hyderabad, Hyderabad, India
Kamal Karlapalem
Chinese University of Hong Kong, Shatin, Hong Kong
Hong Cheng
Virginia Tech, Arlington, VA, USA
Naren Ramakrishnan
Jawaharlal Nehru University, New Delhi, India
R. K. Agrawal
IIIT Hyderabad, Hyderabad, India
P. Krishna Reddy
University of Minnesota, Minneapolis, MN, USA
Jaideep Srivastava
IIIT Delhi, New Delhi, India
Tanmoy Chakraborty

About this paper

Cite this paper

Chaudhuri, R., Mukherjee, K., Narayanam, R., Vallam, R.D. (2021). Collaborative Reinforcement Learning Framework to Model Evolution of Cooperation in Sequential Social Dilemmas. In: Karlapalem, K., et al. Advances in Knowledge Discovery and Data Mining. PAKDD 2021. Lecture Notes in Computer Science, vol 12712. Springer, Cham. https://doi.org/10.1007/978-3-030-75762-5_2

DOI: https://doi.org/10.1007/978-3-030-75762-5_2
Published: 09 May 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-75761-8
Online ISBN: 978-3-030-75762-5
eBook Packages: Computer Science, Computer Science (R0)
