
A unifying learning framework for building artificial game-playing agents

Annals of Mathematics and Artificial Intelligence

Abstract

This paper investigates learning-based agents capable of mimicking human behavior in game playing, a central task in computational economics. Although computational economists have developed various game-playing agents, well-established machine learning methods such as graphical models had not previously been applied to this task. Leveraging probabilistic graphical models, this paper presents a novel sequential Bayesian network (SBN) framework for building artificial game-playing agents. We show that many existing approaches, including reinforcement learning, fictitious play, and many of their variants, admit a unified Bayesian explanation within the proposed SBN framework. Moreover, SBN can handle various important settings of game playing, allowing for a broad scope of use in economics. SBN not only provides a unifying and satisfying framework that explains existing learning approaches in virtual economies, but also enables the development of new algorithms that are stronger or operate under fewer restrictions. From the generic SBN model we derive a new algorithm, Hidden Markovian Play (HMP), which handles an important but difficult setting in which a player cannot observe the opponent's strategy and payoff. HMP leverages Markovian learning to infer this unobservable information, leading to higher-quality agents. Experiments on data from real-world field experiments in economics show that HMP outperforms baseline algorithms for building artificial agents.
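The core idea behind HMP, as the abstract describes it, is to treat the opponent's unobservable strategy as a hidden state of a Markov process and infer it from observed actions. The following is a minimal illustrative sketch of that style of Bayesian filtering; the transition matrix, emission model, and two-mode opponent are invented for illustration and are not the paper's actual HMP specification.

```python
import numpy as np

def filter_belief(belief, transition, emission_probs):
    """One Bayesian filtering step: predict the hidden state with the
    Markov transition, then condition on the observed action's likelihood."""
    predicted = transition.T @ belief       # prior over the next hidden state
    posterior = predicted * emission_probs  # weight by observation likelihood
    return posterior / posterior.sum()      # renormalize to a distribution

# Hypothetical opponent with two hidden strategy "modes"; each mode
# emits one of two actions with different probabilities.
transition = np.array([[0.9, 0.1],
                       [0.2, 0.8]])   # P(next mode | current mode)
emission = np.array([[0.8, 0.2],     # mode 0: mostly plays action 0
                     [0.3, 0.7]])    # mode 1: mostly plays action 1

belief = np.array([0.5, 0.5])        # uninformative initial belief
for observed_action in [0, 0, 1, 1, 1]:
    belief = filter_belief(belief, transition, emission[:, observed_action])

print(belief)  # belief shifts toward the mode that explains recent actions
```

After observing a run of action 1, the belief concentrates on mode 1; an agent could then best-respond against the inferred strategy rather than against raw action frequencies, which is the advantage the abstract claims over approaches that require observing the opponent's strategy directly.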



Author information

Corresponding author

Correspondence to Wenlin Chen.

About this article

Cite this article

Chen, W., Chen, Y. & Levine, D.K. A unifying learning framework for building artificial game-playing agents. Ann Math Artif Intell 73, 335–358 (2015). https://doi.org/10.1007/s10472-015-9450-1
