Deep Preference Neural Network for Move Prediction in Board Games

Runarsson, Thomas Philip

doi:10.1007/978-3-319-75931-9_3

Part of the book series: Communications in Computer and Information Science (CCIS, volume 818)

Included in the following conference series:

Workshop on Computer Games

Abstract

This paper studies the training of deep neural networks for move prediction in board games using comparison training. Specifically, the aim is to predict moves in the game Othello from championship tournament game data. A general deep preference neural network is presented, based on a twenty-year-old model by Tesauro. Over-fitting becomes an immediate concern when training deep preference neural networks, and it is shown how dropout may combat this problem to a certain extent. It is illustrated that classification test accuracy does not necessarily correspond to move accuracy, and the key difference between preference training and single-label classification is discussed. The careful use of dropout, coupled with richer game data, produces an evaluation function that is a better move predictor but does not necessarily produce a stronger game player.
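
The gap between classification accuracy and move accuracy can be made concrete with a small sketch. It assumes, as one plausible reading of comparison training, that each training example is a pair (expert move, alternative move) and that classification accuracy is counted over such pairs, while move accuracy requires the expert move to outscore every alternative in a position. The function name and the scores below are hypothetical and are not taken from the paper.

```python
import numpy as np

def pairwise_and_move_accuracy(scores_per_position, expert_index_per_position):
    """Return (pairwise accuracy, move accuracy) for scored candidate moves.

    scores_per_position: list of 1-D arrays, one evaluation score per legal move.
    expert_index_per_position: index of the move the expert actually played.
    """
    pairs_correct = 0
    pairs_total = 0
    moves_correct = 0
    for scores, expert in zip(scores_per_position, expert_index_per_position):
        scores = np.asarray(scores, dtype=float)
        others = np.delete(scores, expert)
        # Each (expert move, alternative) pair counts as one preference example.
        pairs_correct += int(np.sum(scores[expert] > others))
        pairs_total += others.size
        # The move is predicted only if the expert move beats every alternative.
        moves_correct += int(np.all(scores[expert] > others))
    return pairs_correct / pairs_total, moves_correct / len(scores_per_position)

# Hypothetical scores: the expert move (index 0) is ranked second of five
# in both positions, so most pairs are right but no move is predicted.
scores = [np.array([0.8, 0.9, 0.1, 0.2, 0.3]),
          np.array([0.5, 0.1, 0.7, 0.2, 0.3])]
expert = [0, 0]
pair_acc, move_acc = pairwise_and_move_accuracy(scores, expert)
print(pair_acc, move_acc)  # 0.75 0.0
```

In this toy case three of the four pairs in each position are ordered correctly, yet the expert move is never ranked first, so a pairwise accuracy of 75% coexists with a move accuracy of 0%.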

Notes

1. www.ffothello.org.

References

Abadi, M., Agarwal, A., Barham, P., Brevdo, E., Chen, Z., Citro, C., Corrado, G.S., Davis, A., Dean, J., Devin, M., et al.: Tensorflow: large-scale machine learning on heterogeneous distributed systems. arXiv preprint arXiv:1603.04467 (2016)
Binkley, K.J., Seehart, K., Hagiwara, M.: A study of artificial neural network architectures for Othello evaluation functions. Inf. Media Technol. 2(4), 1129–1139 (2007)
Buro, M.: Logistello: a strong learning Othello program. In: 19th Annual Conference Gesellschaft für Klassifikation eV, vol. 2 (1995)
Burrow, P.: Hybridising evolution and temporal difference learning. Ph.D. thesis, University of Essex, UK (2011)
Foullon-Perez, A., Lucas, S.M.: Orientational features with the SNT-grid. In: 2009 International Joint Conference on Neural Networks, pp. 877–881 (2009)
Fürnkranz, J., Hüllermeier, E.: Preference learning: an introduction. In: Fürnkranz, J., Hüllermeier, E. (eds.) Preference Learning, pp. 1–17. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-14125-6_1
Kingma, D., Ba, J.: Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
Lagoudakis, M., Parr, R.: Reinforcement learning as classification: leveraging modern classifiers. In: ICML, vol. 20, pp. 424–431 (2003)
Lazaric, A., Ghavamzadeh, M., Munos, R.: Analysis of a classification-based policy iteration algorithm. In: Proceedings of the 27th International Conference on Machine Learning, pp. 607–614 (2010)
Li, L., Bulitko, V., Greiner, R.: Focus of attention in reinforcement learning. J. Univ. Comput. Sci. 13(9), 1246–1269 (2007)
Rigutini, L., Papini, T., Maggini, M., Scarselli, F.: Sortnet: learning to rank by a neural preference function. IEEE Trans. Neural Netw. 22(9), 1368–1380 (2011)
Rimmel, A., Teytaud, O., Lee, C.S., Yen, S.J., Wang, M.H., Tsai, S.R.: Current frontiers in computer Go. IEEE Trans. Comput. Intell. AI Games 2(4), 229–238 (2010)
Runarsson, T.P., Lucas, S.M.: Preference learning for move prediction and evaluation function approximation in Othello. IEEE Trans. Comput. Intell. AI Games 6(3), 300–313 (2014)
Runarsson, T., Lucas, S.: Imitating play from game trajectories: temporal difference learning versus preference learning. In: IEEE Conference on Computational Intelligence and Games, pp. 79–82 (2012)
Silver, D., Huang, A., Maddison, C.J., Guez, A., Sifre, L., van den Driessche, G., Schrittwieser, J., Antonoglou, I., Panneershelvam, V., Lanctot, M., Dieleman, S., Grewe, D., Nham, J., Kalchbrenner, N., Sutskever, I., Lillicrap, T., Leach, M., Kavukcuoglu, K., Graepel, T., Hassabis, D.: Mastering the game of Go with deep neural networks and tree search. Nature 529(7587), 484–489 (2016)
Srivastava, N., Hinton, G.E., Krizhevsky, A., Sutskever, I., Salakhutdinov, R.: Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15(1), 1929–1958 (2014)
Tesauro, G.: Practical issues in temporal difference learning. Mach. Learn. 8, 257–277 (1992)
Tesauro, G.: Connectionist learning of expert preferences by comparison training. In: NIPS, vol. 1, pp. 99–106 (1988)
Tesauro, G.: Neurogammon wins computer olympiad. Neural Comput. 1(3), 321–323 (1989)

Author information

Authors and Affiliations

School of Engineering and Natural Sciences, University of Iceland, Reykjavik, Iceland
Thomas Philip Runarsson

Corresponding author

Correspondence to Thomas Philip Runarsson.

Editor information

Editors and Affiliations

Université Paris-Dauphine, Paris, France
Tristan Cazenave
Maastricht University, Maastricht, The Netherlands
Mark H.M. Winands
The University of New South Wales, Sydney, New South Wales, Australia
Abdallah Saffidine

Cite this paper

Runarsson, T.P. (2018). Deep Preference Neural Network for Move Prediction in Board Games. In: Cazenave, T., Winands, M., Saffidine, A. (eds) Computer Games. CGW 2017. Communications in Computer and Information Science, vol 818. Springer, Cham. https://doi.org/10.1007/978-3-319-75931-9_3

DOI: https://doi.org/10.1007/978-3-319-75931-9_3
Published: 15 February 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-75930-2
Online ISBN: 978-3-319-75931-9
eBook Packages: Computer Science, Computer Science (R0)
