DOI: 10.1145/1160633.1160766
Article

Learning against multiple opponents

Published: 08 May 2006

Abstract

We address the problem of learning in repeated n-player (as opposed to 2-player) general-sum games, paying particular attention to the rarely addressed situation in which there is a mixture of agents of different types. We propose new criteria requiring that the agents employing a particular learning algorithm work together to achieve a joint best response against a target class of opponents, while guaranteeing that each achieves at least its individual security-level payoff against any possible set of opponents. We then provide algorithms that provably meet these criteria for two target classes: stationary strategies and adaptive strategies with bounded memory. We also demonstrate that the algorithm for stationary strategies outperforms existing algorithms in tests spanning a wide variety of repeated games with more than two players.
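The security-level payoff referred to in the abstract is the classical maximin value: the highest payoff an agent can guarantee itself regardless of what the other players do. As a minimal illustration only (not the paper's algorithm, and restricted to pure strategies; the paper's guarantee concerns mixed strategies, which would require solving a linear program), the pure-strategy security level can be computed as:

```python
def pure_security_level(payoffs):
    """payoffs[i][j]: the agent's payoff for its action i against the
    opponents' joint action j. Returns the best worst-case payoff over
    the agent's pure actions (pure-strategy maximin)."""
    return max(min(row) for row in payoffs)


# Example: a matching-pennies-like game from one player's perspective.
# Every pure action can be exploited, so the pure security level is -1
# (the mixed-strategy security level, by contrast, would be 0).
matching_pennies = [[1, -1],
                    [-1, 1]]
print(pure_security_level(matching_pennies))  # -1
```

This is only a sketch of the concept; the game matrix and function name are illustrative, not taken from the paper.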


Published In

AAMAS '06: Proceedings of the fifth international joint conference on Autonomous agents and multiagent systems
May 2006
1631 pages
ISBN: 1595933034
DOI: 10.1145/1160633
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery, New York, NY, United States



Author Tags

  1. game theory
  2. learning
  3. multi-agent systems

Qualifiers

  • Article


Acceptance Rates

Overall Acceptance Rate 1,155 of 5,036 submissions, 23%


Cited By

  • (2012) Coordinating many agents in stochastic games. The 2012 International Joint Conference on Neural Networks (IJCNN), pp. 1-8. DOI: 10.1109/IJCNN.2012.6252457
  • (2011) Intertemporal Discount Factors as a Measure of Trustworthiness in Electronic Commerce. IEEE Transactions on Knowledge and Data Engineering, 23(5), pp. 699-712. DOI: 10.1109/TKDE.2010.141
  • (2009) Bayesian Learning for Cooperation in Multi-Agent Systems. Computational Intelligence, pp. 321-360. DOI: 10.1007/978-3-642-01799-5_10
  • (2008) Online multiagent learning against memory bounded adversaries. Proceedings of the 2008 European Conference on Machine Learning and Knowledge Discovery in Databases, Part I, pp. 211-226. DOI: 10.5555/3120828.3120864
  • (2008) Opportunities for multiagent systems and multiagent reinforcement learning in traffic control. Autonomous Agents and Multi-Agent Systems, 18(3), pp. 342-375. DOI: 10.1007/s10458-008-9062-9
  • (2008) Online Multiagent Learning against Memory Bounded Adversaries. Machine Learning and Knowledge Discovery in Databases, pp. 211-226. DOI: 10.1007/978-3-540-87479-9_32
  • (2007) Multi-agent learning. Proceedings of the 13th Portuguese Conference on Progress in Artificial Intelligence, pp. 568-579. DOI: 10.5555/1782254.1782311
  • (2007) Adaptation in games with many co-evolving agents. Proceedings of the 13th Portuguese Conference on Progress in Artificial Intelligence, pp. 195-206. DOI: 10.5555/1782254.1782273
  • (2007) If multi-agent learning is the answer, what is the question? Artificial Intelligence, 171(7), pp. 365-377. DOI: 10.1016/j.artint.2006.02.006
  • (2007) Multi-agent Learning: How to Interact to Improve Collective Results. Progress in Artificial Intelligence, pp. 568-579. DOI: 10.1007/978-3-540-77002-2_48
