DOI: 10.1145/3489525.3511685

The Cost of Reinforcement Learning for Game Engines: The AZ-Hive Case-study

Published: 09 April 2022

Abstract

Although utilising computers to play board games has been a topic of research for many decades, the recent rapid developments in reinforcement learning, such as AlphaZero and its variants, have brought unprecedented progress in games such as chess and Go. However, the efficiency of this process remains unknown. In this work, we analyse the cost and efficiency of the AlphaZero approach when building a new game engine. To this end, we present our experience building AZ-Hive, an AlphaZero-based playing engine for the game of Hive. Using only the rules of the game and a quality-of-play assessment, AZ-Hive learns to play the game from scratch. Getting AZ-Hive up and running requires encoding the game in AlphaZero, i.e., capturing the board, the game state, the rules, and the assessment of play quality. Different encodings lead to significantly different AZ-Hive engines, with very different performance results. We therefore propose a design space for configuring AZ-Hive, and demonstrate the costs and benefits of different configurations in this space. We find that different configurations yield more or less competitive playing engines, but training and evaluating each such engine is prohibitively expensive, and no systematic, efficient exploration or pruning of the space is possible. As a result, an exhaustive exploration can easily take tens of training-years.
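To make concrete what "encoding the game in AlphaZero" involves, the sketch below shows the kind of game adapter that AlphaZero-style frameworks (for instance, the open-source alpha-zero-general project) expect a new game to implement. This is a minimal illustration under assumptions of our own, not the authors' implementation: the class name HiveGame, the board-tensor shape, and the flat action indexing are all hypothetical, and each such choice is one axis of the kind of configuration space the paper explores.

    import numpy as np

    class HiveGame:
        """Hypothetical AlphaZero adapter for Hive (illustrative only).

        Every constant below is a design decision: how to clamp Hive's
        unbounded hexagonal board into a fixed tensor, how to index moves,
        and how to score terminal positions. Per the paper, different
        choices produce measurably different engines.
        """

        BOARD_SIZE = 14    # assumed: clamp the hex grid to 14x14 axial coordinates
        PIECE_PLANES = 22  # assumed: one plane per physical piece (11 pieces x 2 colours)

        def get_init_board(self):
            # The policy/value network consumes the state as a stack of planes.
            return np.zeros((self.PIECE_PLANES, self.BOARD_SIZE, self.BOARD_SIZE),
                            dtype=np.int8)

        def get_action_size(self):
            # One index per (piece plane, destination cell): placements and
            # slides share a single flat action space.
            return self.PIECE_PLANES * self.BOARD_SIZE * self.BOARD_SIZE

        def get_valid_moves(self, board, player):
            # Rule encoding: a 0/1 mask over the action space enforcing the
            # sliding rules, the one-hive rule, queen-placement deadlines, etc.
            raise NotImplementedError

        def get_game_ended(self, board, player):
            # Quality-of-play assessment at the leaves: +1 if `player` has
            # surrounded the opposing queen, -1 if its own queen is
            # surrounded, and a small non-zero value for a draw so the
            # search does not mistake a draw for an unfinished game.
            raise NotImplementedError

Under this framing, each AZ-Hive configuration fixes one combination of such encoding choices and must then be trained and evaluated from scratch, which is why exploring the space is so expensive.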

Cited By

  • (2023) Advancements in Artificial Intelligence Circuits and Systems (AICAS). Electronics 13(1), 102. https://doi.org/10.3390/electronics13010102. Online publication date: 26 December 2023.

Published In

ICPE '22: Proceedings of the 2022 ACM/SPEC International Conference on Performance Engineering
April 2022
242 pages
ISBN: 9781450391436
DOI: 10.1145/3489525

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. AlphaZero: Hive
  2. computational cost
  3. design space exploration
  4. energy efficiency
  5. game-playing engines
  6. reinforcement learning

Qualifiers

  • Short-paper

Conference

ICPE '22

Acceptance Rates

ICPE '22 paper acceptance rate: 14 of 58 submissions (24%)
Overall acceptance rate: 252 of 851 submissions (30%)

Article Metrics

  • Downloads (last 12 months): 17
  • Downloads (last 6 weeks): 4
Reflects downloads up to 17 Feb 2025
