DOI: 10.1145/1160633.1160766
Article

Learning against multiple opponents

Published: 08 May 2006

Abstract

We address the problem of learning in repeated n-player (as opposed to 2-player) general-sum games, paying particular attention to the rarely addressed situation in which there is a mixture of agents of different types. We propose new criteria requiring that the agents employing a particular learning algorithm work together to achieve a joint best response against a target class of opponents, while guaranteeing that each achieves at least its individual security-level payoff against any possible set of opponents. We then provide algorithms that provably meet these criteria for two target classes: stationary strategies and adaptive strategies with bounded memory. We also demonstrate that the algorithm for stationary strategies outperforms existing algorithms in tests spanning a wide variety of repeated games with more than two players.
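The security-level payoff referred to in the abstract is the classical maximin value: the highest payoff an agent can guarantee itself regardless of what the other players do. As a minimal illustration only (not the paper's algorithm, and restricted to pure strategies; the paper's guarantee concerns mixed strategies, which would require solving a linear program), the pure-strategy security level can be computed as:

```python
def pure_security_level(payoffs):
    """payoffs[i][j]: the agent's payoff for its action i against the
    opponents' joint action j. Returns the best worst-case payoff over
    the agent's pure actions (pure-strategy maximin)."""
    return max(min(row) for row in payoffs)


# Example: a matching-pennies-like game from one player's perspective.
# Every pure action can be exploited, so the pure security level is -1
# (the mixed-strategy security level, by contrast, would be 0).
matching_pennies = [[1, -1],
                    [-1, 1]]
print(pure_security_level(matching_pennies))  # -1
```

This is only a sketch of the concept; the game matrix and function name are illustrative, not taken from the paper.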


Published In

AAMAS '06: Proceedings of the fifth international joint conference on Autonomous agents and multiagent systems
May 2006
1631 pages
ISBN: 1595933034
DOI: 10.1145/1160633
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery, New York, NY, United States



Author Tags

  1. game theory
  2. learning
  3. multi-agent systems

Qualifiers

  • Article


Acceptance Rates

Overall Acceptance Rate 1,155 of 5,036 submissions, 23%


Cited By

  • (2012) Coordinating many agents in stochastic games. The 2012 International Joint Conference on Neural Networks (IJCNN), pp. 1-8. DOI: 10.1109/IJCNN.2012.6252457
  • (2011) Intertemporal Discount Factors as a Measure of Trustworthiness in Electronic Commerce. IEEE Transactions on Knowledge and Data Engineering, 23(5), pp. 699-712. DOI: 10.1109/TKDE.2010.141
  • (2009) Bayesian Learning for Cooperation in Multi-Agent Systems. Computational Intelligence, pp. 321-360. DOI: 10.1007/978-3-642-01799-5_10
  • (2008) Online multiagent learning against memory bounded adversaries. Proceedings of the 2008 European Conference on Machine Learning and Knowledge Discovery in Databases, Part I, pp. 211-226. DOI: 10.5555/3120828.3120864
  • (2008) Opportunities for multiagent systems and multiagent reinforcement learning in traffic control. Autonomous Agents and Multi-Agent Systems, 18(3), pp. 342-375. DOI: 10.1007/s10458-008-9062-9
  • (2008) Online Multiagent Learning against Memory Bounded Adversaries. Machine Learning and Knowledge Discovery in Databases, pp. 211-226. DOI: 10.1007/978-3-540-87479-9_32
  • (2007) Multi-agent learning. Proceedings of the 13th Portuguese Conference on Progress in Artificial Intelligence, pp. 568-579. DOI: 10.5555/1782254.1782311
  • (2007) Adaptation in games with many co-evolving agents. Proceedings of the 13th Portuguese Conference on Progress in Artificial Intelligence, pp. 195-206. DOI: 10.5555/1782254.1782273
  • (2007) If multi-agent learning is the answer, what is the question? Artificial Intelligence, 171(7), pp. 365-377. DOI: 10.1016/j.artint.2006.02.006
  • (2007) Multi-agent Learning: How to Interact to Improve Collective Results. Progress in Artificial Intelligence, pp. 568-579. DOI: 10.1007/978-3-540-77002-2_48
