ABSTRACT
Monte Carlo Tree Search (MCTS) is a decision-making technique that has received considerable interest in the past decade due to its success in a number of domains. In this paper, we explore its application to the "Diplomacy" multi-agent strategic board game, by putting forward and evaluating eight (8) variants of MCTS Diplomacy agents. At the core of our MCTS agents lies the well-known Upper Confidence Bounds for Trees (UCT) bandit method, which attempts to strike a balance between exploration and exploitation during the creation of the search tree. Moreover, we devised a heuristic weighting system for prioritizing the actions at the tree nodes, and used it to effectively incorporate high-quality domain knowledge into some of our agents. We provide a thorough experimental evaluation of our approach, in which we systematically compare the performance of our agents against each other and against other opponents, including the state-of-the-art Diplomacy agent, D-Brane. Our results verify that several of our agents are highly competitive in this domain, exhibiting performance that is comparable to, and in some instances superior to, that of D-Brane. Interestingly, the MCTS approach consistently outperforms all others in tournaments in which one MCTS agent faces one D-Brane agent and several other opponents.
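The abstract's two central ingredients — UCT selection balancing exploration and exploitation, and a heuristic weighting system that biases action choice with domain knowledge — can be illustrated with a minimal sketch. Note this is an assumed, generic formulation (the `Node` class, the `weight` attribute, and the decaying progressive-bias term are illustrative, not the paper's actual implementation):

```python
import math

class Node:
    """A search-tree node; `weight` stands in for a heuristic
    domain-knowledge prior on this node's action."""
    def __init__(self, action=None, weight=1.0):
        self.action = action
        self.weight = weight        # heuristic prior (assumed form)
        self.visits = 0
        self.total_reward = 0.0
        self.children = []

def uct_select(parent, c=math.sqrt(2)):
    """Pick the child maximizing the UCT value plus a decaying
    heuristic-bias term (a progressive-bias-style sketch)."""
    def uct_value(child):
        if child.visits == 0:
            return float("inf")     # always try unvisited actions first
        exploit = child.total_reward / child.visits
        explore = c * math.sqrt(math.log(parent.visits) / child.visits)
        bias = child.weight / (1 + child.visits)  # heuristic fades out
        return exploit + explore + bias
    return max(parent.children, key=uct_value)
```

Under this formulation, the heuristic weight dominates early (guiding the search toward promising Diplomacy orders) and is gradually overridden by simulation statistics as visit counts grow.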
Monte Carlo Tree Search for the Game of Diplomacy