Opponent Modelling by Sequence Prediction and Lookahead in Two-Player Games

  • Conference paper
Artificial Intelligence and Soft Computing (ICAISC 2013)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7895)

Abstract

Learning a strategy that maximises total reward in a multi-agent system is a hard problem when the reward depends on other agents’ strategies. Many previous approaches consider opponents that are reactive and memoryless. In this paper, we use sequence prediction algorithms to perform opponent modelling in two-player games, so that opponents with memory can be modelled. We argue that competing with opponents with memory requires lookahead. We combine these algorithms with reinforcement learning and lookahead action selection, allowing them to find strategies that maximise total reward up to a limited depth. Experiments confirm that lookahead is required, and show that these algorithms successfully model and exploit opponent strategies with different memory lengths. The proposed approach outperforms popular and state-of-the-art reinforcement learning algorithms in terms of learning speed and final performance.
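The idea in the abstract can be illustrated with a minimal sketch (not the paper's implementation): a simple frequency-table predictor stands in for the paper's sequence prediction algorithms, and exhaustive depth-limited search over action plans stands in for its lookahead action selection. The game, the memory-1 opponent, and all names below are illustrative assumptions.

```python
import itertools

ACTS = (0, 1, 2)  # 0 = rock, 1 = paper, 2 = scissors

def payoff(a, b):
    """+1 if action a beats b, -1 if it loses, 0 on a tie."""
    return 0 if a == b else (1 if (a - b) % 3 == 1 else -1)

class FreqPredictor:
    """Toy sequence predictor: per-context counts of the opponent's moves."""
    def __init__(self):
        self.counts = {}

    def update(self, ctx, opp_move):
        self.counts.setdefault(ctx, [0, 0, 0])[opp_move] += 1

    def predict(self, ctx):
        c = self.counts.get(ctx)
        return max(ACTS, key=lambda m: c[m]) if c else 0

def lookahead(pred, ctx, depth):
    """Score every depth-step plan against the opponent model;
    return the first action of the plan with the highest predicted reward."""
    best_total, best_first = float("-inf"), 0
    for plan in itertools.product(ACTS, repeat=depth):
        total, c = 0, ctx
        for a in plan:
            total += payoff(a, pred.predict(c))
            c = (a,)  # against a memory-1 opponent, the next context
                      # is the move we just played
        if total > best_total:
            best_total, best_first = total, plan[0]
    return best_first

# Opponent with memory length 1: plays the move that beats our previous move.
pred, our_prev, score = FreqPredictor(), 0, 0
for _ in range(200):
    opp = (our_prev + 1) % 3
    a = lookahead(pred, (our_prev,), depth=2)
    score += payoff(a, opp)
    pred.update((our_prev,), opp)
    our_prev = a
```

After a few exploratory rounds the predictor captures the opponent's context-conditional response, and the lookahead player wins nearly every subsequent round. A purely reactive best-response learner with no opponent model would not reliably exploit such a memory-based strategy, which is the gap the paper addresses.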




Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Mealing, R., Shapiro, J.L. (2013). Opponent Modelling by Sequence Prediction and Lookahead in Two-Player Games. In: Rutkowski, L., Korytkowski, M., Scherer, R., Tadeusiewicz, R., Zadeh, L.A., Zurada, J.M. (eds) Artificial Intelligence and Soft Computing. ICAISC 2013. Lecture Notes in Computer Science, vol 7895. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-38610-7_36

  • DOI: https://doi.org/10.1007/978-3-642-38610-7_36

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-38609-1

  • Online ISBN: 978-3-642-38610-7

  • eBook Packages: Computer Science (R0)
