Abstract
Social dilemmas have been studied extensively, and many models and mechanisms have been proposed to promote cooperation. In this work, we propose the Flexi Partner Selection (FPS) mechanism, a three-stage social dilemma game that promotes cooperative behaviour among agents trained to maximize a purely selfish objective function. Compared with previous work, our setting is more general and flexible because the number of players in each game is not fixed: agents can vote out players based on their past behaviour, or stay out of the game if playing would make them worse off. Moreover, we consider social dilemmas with both linear and non-linear payoffs. Using reinforcement learning (RL), self-interested agents learn to punish defectors by consistently excluding them and to cooperate with others in a number of different settings.
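To make the mechanism concrete, one round of such a game can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the abstract does not spell out the three stages, so the ordering (opt in, vote, play), the majority-vote exclusion rule, the zero payoff for non-players, the threshold form of the non-linear payoff, and all names and parameter values (`play_round`, `linear_payoffs`, `threshold_payoffs`, `r`, `cost`, `benefit`, `threshold`) are assumptions made for illustration.

```python
from typing import Callable, Dict, Set


def linear_payoffs(actions: Dict[int, bool], r: float = 1.6,
                   cost: float = 1.0) -> Dict[int, float]:
    """Linear public goods payoff (assumed form): contributions are pooled,
    multiplied by r, and shared equally among the players in the game."""
    n = len(actions)
    if n == 0:
        return {}
    share = r * cost * sum(actions.values()) / n
    return {i: share - (cost if coop else 0.0) for i, coop in actions.items()}


def threshold_payoffs(actions: Dict[int, bool], threshold: int = 2,
                      benefit: float = 2.0, cost: float = 1.0) -> Dict[int, float]:
    """Non-linear payoff (assumed form): every player receives the benefit
    only if at least `threshold` players cooperate."""
    reached = sum(actions.values()) >= threshold
    return {i: (benefit if reached else 0.0) - (cost if coop else 0.0)
            for i, coop in actions.items()}


def play_round(opt_in: Dict[int, bool],
               votes: Dict[int, Set[int]],
               actions: Dict[int, bool],
               payoff_fn: Callable[[Dict[int, bool]], Dict[int, float]] = linear_payoffs
               ) -> Dict[int, float]:
    """One round with a variable number of players (hypothetical rules)."""
    # Stage 1 (assumed): each agent decides whether to join the game at all.
    participants = {i for i, joins in opt_in.items() if joins}

    # Stage 2 (assumed): a participant is excluded when a strict majority
    # of the other participants votes against it.
    excluded = set()
    for target in participants:
        voters = participants - {target}
        against = sum(1 for v in voters if target in votes.get(v, set()))
        if voters and against > len(voters) / 2:
            excluded.add(target)

    # Stage 3: the remaining players play the social dilemma.
    players = participants - excluded
    payoffs = payoff_fn({i: actions[i] for i in players})

    # Agents that stayed out or were voted out receive 0 (assumption).
    return {i: payoffs.get(i, 0.0) for i in opt_in}


# Usage: agent 3 defects; the other three agents vote to exclude it.
everyone_in = {i: True for i in range(4)}
actions = {0: True, 1: True, 2: True, 3: False}
with_exclusion = play_round(everyone_in, {0: {3}, 1: {3}, 2: {3}}, actions)
without_exclusion = play_round(everyone_in, {}, actions)
```

Under these assumed parameters, the defector earns more than the cooperators when nobody votes, and earns nothing once it is excluded, while the cooperators' payoff rises. This is the incentive that self-interested learners could exploit to punish defection.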
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Gu, T., An, B. (2023). A Flexi Partner Selection Model for the Emergence of Cooperation in N-person Social Dilemmas. In: Yokoo, M., Qiao, H., Vorobeychik, Y., Hao, J. (eds) Distributed Artificial Intelligence. DAI 2022. Lecture Notes in Computer Science, vol 13824. Springer, Cham. https://doi.org/10.1007/978-3-031-25549-6_2
DOI: https://doi.org/10.1007/978-3-031-25549-6_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-25548-9
Online ISBN: 978-3-031-25549-6
eBook Packages: Computer Science (R0)