Abstract
Risk is a complex strategy game that may be easier for humans to understand than chess but is harder for computers to play well. The main reasons are the stochastic nature of battles and the several decisions that must be coordinated within each turn. Our goal is to create an artificial intelligence able to play the game without human knowledge, using the Expert Iteration framework [1]. We use graph neural networks [13, 15, 22, 30] to learn the policies for the different decisions and the value estimation. Experiments on a synthetic board show that with this framework the model rapidly learns a good country-drafting policy, while the main game phases remain a challenge.
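The training scheme named in the abstract follows the Expert Iteration pattern: a tree-search "expert" improves on the current learned "apprentice" policy, and the apprentice is then trained to imitate the expert's choices. The following is a minimal, illustrative sketch of that loop; the stub search, the toy tabular apprentice, and the toy action set are assumptions standing in for the paper's MCTS expert and graph-neural-network apprentice, not the authors' implementation.

```python
# Toy action set standing in for Risk's per-turn decisions (assumption).
ACTIONS = ["draft", "attack", "fortify"]

def expert_search(state, apprentice):
    """Stand-in for MCTS: sharpen the apprentice's priors on this state."""
    priors = apprentice.policy(state)
    best = max(priors, key=priors.get)
    # The expert concentrates probability mass on its preferred action.
    return {a: (0.8 if a == best else 0.1) for a in ACTIONS}

class Apprentice:
    """Toy tabular model standing in for the paper's graph neural network."""

    def __init__(self):
        self.table = {}  # state -> per-action pseudo-counts

    def policy(self, state):
        counts = self.table.get(state, {a: 1.0 for a in ACTIONS})
        total = sum(counts.values())
        return {a: c / total for a, c in counts.items()}

    def train(self, examples):
        # Move the apprentice's policy toward each expert target.
        for state, target in examples:
            counts = self.table.setdefault(state, {a: 1.0 for a in ACTIONS})
            for a, p in target.items():
                counts[a] += p

def expert_iteration(n_iters=3, games_per_iter=5):
    """One ExIt run: alternate expert self-play and apprentice training."""
    apprentice = Apprentice()
    for _ in range(n_iters):
        examples = []
        for g in range(games_per_iter):
            state = f"position-{g % 2}"  # toy states in place of real boards
            target = expert_search(state, apprentice)
            examples.append((state, target))
        apprentice.train(examples)
    return apprentice
```

In the full method the expert would also produce game-outcome targets for a value head; this sketch keeps only the policy-imitation half of the loop to show the alternation itself.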
References
Anthony, T., Tian, Z., Barber, D.: Thinking fast and slow with deep learning and tree search. arXiv preprint arXiv:1705.08439 (2017)
Anthony, T.W.: Expert iteration. Ph.D. thesis, UCL (University College London) (2021)
Auer, P., Cesa-Bianchi, N., Fischer, P.: Finite-time analysis of the multiarmed bandit problem. Mach. Learn. 47(2), 235–256 (2002)
Blomqvist, E.: Playing the game of risk with an AlphaZero agent (2020)
Browne, C.B., et al.: A survey of Monte Carlo tree search methods. IEEE Trans. Comput. Intell. AI Games 4(1), 1–43 (2012)
Carr, J.: Using graph convolutional networks and TD(\(\lambda\)) to play the game of risk. arXiv preprint arXiv:2009.06355 (2020)
Cazenave, T.: Residual networks for computer go. IEEE Trans. Games 10(1), 107–110 (2018)
Cazenave, T., et al.: Polygames: improved zero learning. ICGA J. 42(4), 244–256 (2020)
Coulom, R.: Efficient selectivity and backup operators in Monte-Carlo tree search. In: van den Herik, H.J., Ciancarini, P., Donkers, H.H.L.M.J. (eds.) CG 2006. LNCS, vol. 4630, pp. 72–83. Springer, Heidelberg (2007). https://doi.org/10.1007/978-3-540-75538-8_7
Gibson, R., Desai, N., Zhao, R.: An automated technique for drafting territories in the board game risk. In: Proceedings of the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment, vol. 5 (2010)
Johansson, S.J., Olsson, F.: Using multi-agent system technologies in risk bots. In: Proceedings of the Second Artificial Intelligence and Interactive Digital Entertainment Conference (AIIDE), Marina del Rey (2006)
Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
Kipf, T.N., Welling, M.: Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907 (2016)
Kocsis, L., Szepesvári, C.: Bandit based Monte-Carlo planning. In: Fürnkranz, J., Scheffer, T., Spiliopoulou, M. (eds.) ECML 2006. LNCS (LNAI), vol. 4212, pp. 282–293. Springer, Heidelberg (2006). https://doi.org/10.1007/11871842_29
Li, G., Muller, M., Thabet, A., Ghanem, B.: DeepGCNs: can GCNs go as deep as CNNs? In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 9267–9276 (2019)
Li, G., Xiong, C., Thabet, A., Ghanem, B.: DeeperGCN: all you need to train deeper GCNs. arXiv preprint arXiv:2006.07739 (2020)
Lütolf, M.: A learning AI for the game Risk using the TD(\(\lambda\)) algorithm. BS thesis, University of Basel (2013)
Melkó, E., Nagy, B.: Optimal strategy in games with chance nodes. Acta Cybernet. 18(2), 171–192 (2007)
Nijssen, J., Winands, M.H.: Search policies in multi-player games. J. Int. Comput. Games Assoc. 36(1), 3–21 (2013)
Olsson, F.: A multi-agent system for playing the board game risk (2005)
Rosin, C.D.: Multi-armed bandits with episode context. Ann. Math. Artif. Intell. 61(3), 203–230 (2011)
Scarselli, F., Gori, M., Tsoi, A.C., Hagenbuchner, M., Monfardini, G.: The graph neural network model. IEEE Trans. Neural Netw. 20(1), 61–80 (2008)
Silver, D., et al.: Mastering the game of go with deep neural networks and tree search. Nature 529(7587), 484–489 (2016)
Silver, D., et al.: A general reinforcement learning algorithm that masters chess, shogi, and go through self-play. Science 362(6419), 1140–1144 (2018)
Silver, D., et al.: Mastering the game of go without human knowledge. Nature 550(7676), 354–359 (2017)
Soemers, D.J., Piette, E., Stephenson, M., Browne, C.: Manipulating the distributions of experience used for self-play learning in expert iteration. In: 2020 IEEE Conference on Games (CoG), pp. 245–252. IEEE (2020)
Sturtevant, N.: A comparison of algorithms for multi-player games. In: Schaeffer, J., Müller, M., Björnsson, Y. (eds.) CG 2002. LNCS, vol. 2883, pp. 108–122. Springer, Heidelberg (2003). https://doi.org/10.1007/978-3-540-40031-8_8
Wolf, M.: An intelligent artificial player for the game of risk. Unpublished doctoral dissertation, TU Darmstadt, Knowledge Engineering Group, Darmstadt, Germany (2005). http://www.ke.tu-darmstadt.de/bibtex/topics/single/32/type
Wu, D.J.: Accelerating self-play learning in go. arXiv preprint arXiv:1902.10565 (2019)
Wu, Z., Pan, S., Chen, F., Long, G., Zhang, C., Yu, P.S.: A comprehensive survey on graph neural networks. IEEE Trans. Neural Netw. Learn. Syst. 32(1), 4–24 (2020)
Zhang, S., Tong, H., Xu, J., Maciejewski, R.: Graph convolutional networks: a comprehensive review. Comput. Soc. Netw. 6(1), 1–23 (2019). https://doi.org/10.1186/s40649-019-0069-y
Acknowledgment
This work was supported in part by the French government under management of Agence Nationale de la Recherche as part of the “Investissements d’avenir” program, reference ANR-19-P3IA-0001 (PRAIRIE 3IA Institute).
Copyright information
© 2022 Springer Nature Switzerland AG
About this paper
Cite this paper
Heredia, L.G., Cazenave, T. (2022). Expert Iteration for Risk. In: Browne, C., Kishimoto, A., Schaeffer, J. (eds) Advances in Computer Games. ACG 2021. Lecture Notes in Computer Science, vol 13262. Springer, Cham. https://doi.org/10.1007/978-3-031-11488-5_3
DOI: https://doi.org/10.1007/978-3-031-11488-5_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-11487-8
Online ISBN: 978-3-031-11488-5
eBook Packages: Computer Science, Computer Science (R0)