Abstract
This paper describes first results from the application of Temporal Difference learning [1] to shogi. We report on experiments to determine whether sensible values for shogi pieces can be obtained in the same manner as for western chess pieces [2]. Learning proceeds entirely from randomised self-play, without access to any form of expert knowledge. The piece values are used in a simple search program that chooses shogi moves from a shallow lookahead, using the piece values to evaluate the leaves, with a random tie-break at the top level. Temporal difference learning is used to adjust the piece values over the course of a series of games. The method succeeds in learning values that perform well in matches against hand-crafted values.
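As an illustration of the kind of update the abstract refers to, the following is a minimal sketch of a TD(lambda) weight update in the style of Sutton [1], applied to a linear material evaluation. It is not taken from the paper: the sigmoid squashing, the feature encoding (piece-count differences per piece type), the end-of-game batch update, and the names `predict`, `td_lambda_update`, `alpha` and `lam` are all assumptions made for the example.

```python
import math


def predict(weights, features):
    """Squash the weighted material balance into a value in (0, 1).

    Assumption: a logistic function turns the raw score into a
    win-probability-like prediction.
    """
    score = sum(w * f for w, f in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-score))


def td_lambda_update(weights, positions, outcome, alpha=0.05, lam=0.95):
    """Return adjusted piece values after one game.

    positions: feature vectors for successive positions, each holding the
               piece-count difference (own minus opponent) per piece type.
    outcome:   final result from the learner's side, 1.0 = win, 0.0 = loss.
    alpha/lam: hypothetical learning rate and trace-decay settings.
    """
    # Predictions use the weights held at the start of the game; the
    # game result serves as the final "prediction".
    preds = [predict(weights, x) for x in positions] + [outcome]
    new_weights = list(weights)
    trace = [0.0] * len(weights)  # decayed sum of past gradients
    for t, x in enumerate(positions):
        p = preds[t]
        # Gradient of sigmoid(w . x) with respect to w is p * (1 - p) * x.
        trace = [lam * e + p * (1.0 - p) * f for e, f in zip(trace, x)]
        delta = preds[t + 1] - p  # temporal difference
        new_weights = [w + alpha * delta * e
                       for w, e in zip(new_weights, trace)]
    return new_weights


if __name__ == "__main__":
    # Two piece types; the learner is one piece of type 0 ahead and wins.
    print(td_lambda_update([0.0, 0.0], [[1, 0], [1, 0]], outcome=1.0))
```

In this toy run the value of piece type 0 rises after the win, while piece type 1, which never appears in the material balance, is left unchanged.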
References
Sutton, R.S.: Learning to Predict by the Methods of Temporal Differences. Machine Learning 3 (1988) 9–44
Beal, D.F. and Smith, M.C.: Learning Piece Values Using Temporal Differences. International Computer Chess Association Journal, Vol. 20, No. 3 (1997) 147–151
Levinson, R. and Snyder, R.: Adaptive Pattern Oriented Chess. Proceedings of AAAI-91, Morgan-Kaufman (1991) 601–605
Christensen, J. and Korf, R.: A Unified Theory of Heuristic Evaluation Functions and its Application to Learning. AAAI-86, Morgan-Kaufman (1986) 148–152
Baxter, J., Tridgell, A. and Weaver, L.: KnightCap: A chess program that learns by combining TD(lambda) with game-tree search. In: Machine Learning, Proceedings of the Fifteenth International Conference (ICML ’98), Madison (1998) 28–36
Fairbairn, J.: Shogi for Beginners. Ishi Press International (1989)
Leggett, T.: Shogi: Japan’s Game of Strategy. Charles E. Tuttle Company [Reprinted in 1993, first published in 1966]
Matsubara, H., Iida, H. and Grimbergen, R.: Natural Developments in Game Research: From Chess to Shogi to Go. International Computer Chess Association Journal, Vol. 19, No. 2 (1996) 103–112
Tesauro, G.: Practical Issues in Temporal Difference Learning. Machine Learning 8 (1992) 257–277
Tesauro, G.: TD-Gammon, a Self-Teaching Backgammon Program, achieves Master Level Play. Neural Computation, Vol. 6, No. 2 (1994) 215–219
Marsland, T.A.: Computer Chess and Search. In: Shapiro, S. (ed.) Encyclopaedia of Artificial Intelligence. 2nd edn. J. Wiley & Sons (1992)
Beal, D.F.: Experiments with the Null Move. In: Beal, D.F. (ed.) Advances in Computer Chess 5. Elsevier Science Publishers (1989) 65–79
Donninger, C.: Null Move and Deep Search: Selective Search Heuristics for Obtuse Chess Programs. International Computer Chess Association Journal, Vol. 16, No. 3 (1993) 137–143
Mutz, M.: Gnu Shogi v1.2p03. Available from many sources, including ftp://ftp.unipassau.de/pub/local/shogi (1994)
Yamashita, H.: YSS: About the Data Structures and the Algorithm. Published on the WWW at http://plaza15.mbn.or.jp/~yss (1997)
Copyright information
© 1999 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Beal, D.F., Smith, M.C. (1999). First Results from Using Temporal Difference Learning in Shogi. In: van den Herik, H.J., Iida, H. (eds) Computers and Games. CG 1998. Lecture Notes in Computer Science, vol 1558. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48957-6_7
DOI: https://doi.org/10.1007/3-540-48957-6_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-65766-8
Online ISBN: 978-3-540-48957-3
eBook Packages: Springer Book Archive