Chess Neighborhoods, Function Combination, and Reinforcement Learning

  • Conference paper
Computers and Games (CG 2000)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2063)

Abstract

Over the years, various research projects have attempted to develop a chess program that learns to play well given little prior knowledge beyond the rules of the game. Early on it was recognized that the key would be to adequately represent the relationships between the pieces and to evaluate the strengths or weaknesses of such relationships. Accordingly, several representations have been developed, including a graph-based model. In this paper we extend the work on graph representation to a precise type of graph that we call a piece or square neighborhood. Specifically, a chessboard is represented as 64 neighborhoods, one for each square. Each neighborhood has a center, and 16 satellites corresponding to the pieces that are immediately close on the 4 diagonals, 2 ranks, 2 files, and 8 knight moves related to the square. Games are played and training values for boards are developed using temporal difference learning, as in other reinforcement learning systems. We then use a 2-layer regression network to learn. At the lower level the values (expected probability of winning) of the neighborhoods are learned and at the top they are combined based on their product and entropy. We report on relevant experiments including a learning experience on the Internet Chess Club (ICC) from which we can estimate a rating for the new program. The level of chess play achieved in a few days of training is comparable to a few months of work on previous systems such as Morph, which is described as “one of the best from-scratch game learning systems, perhaps the best” [22].
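The neighborhood representation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that "immediately close" means the first piece met along each of the 8 sliding directions (2 along the rank, 2 along the file, 4 diagonal), and that the 8 knight satellites are the contents of the knight-move squares. The board encoding and all names (`Board`, `neighborhood`, `all_neighborhoods`) are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Square = Tuple[int, int]   # (file, rank), each in 0..7
Board = Dict[Square, str]  # occupied squares only, e.g. {(4, 0): "K"}

# Assumed satellite order: 2 rank directions, 2 file directions, 4 diagonals.
RAY_DIRS = [(1, 0), (-1, 0), (0, 1), (0, -1),
            (1, 1), (1, -1), (-1, 1), (-1, -1)]
# The 8 knight offsets from the center square.
KNIGHT_JUMPS = [(1, 2), (2, 1), (2, -1), (1, -2),
                (-1, -2), (-2, -1), (-2, 1), (-1, 2)]


def on_board(sq: Square) -> bool:
    f, r = sq
    return 0 <= f < 8 and 0 <= r < 8


def first_piece_on_ray(board: Board, center: Square,
                       step: Tuple[int, int]) -> Optional[str]:
    """Walk from center in one direction; return the first piece met, if any."""
    f, r = center[0] + step[0], center[1] + step[1]
    while on_board((f, r)):
        if (f, r) in board:
            return board[(f, r)]
        f, r = f + step[0], r + step[1]
    return None


def neighborhood(board: Board,
                 center: Square) -> Tuple[Optional[str], List[Optional[str]]]:
    """Return (center contents, 16 satellites) for one square."""
    rays = [first_piece_on_ray(board, center, d) for d in RAY_DIRS]
    # Off-board knight squares are never keys of the board, so they yield None.
    jumps = [board.get((center[0] + df, center[1] + dr))
             for df, dr in KNIGHT_JUMPS]
    return board.get(center), rays + jumps


def all_neighborhoods(board: Board):
    """One neighborhood per square, 64 in total."""
    return {(f, r): neighborhood(board, (f, r))
            for f in range(8) for r in range(8)}
```

Under these assumptions, every square yields a fixed-length record (one center plus 16 satellite slots, with `None` for an empty slot), which is the kind of fixed-size input a lower-level learner could assign a value to. How the paper actually encodes the satellites is not stated in the abstract.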

References

  1. Allen, J., Hamilton, E., and Levinson, R. New Advances in Adaptive Pattern-Oriented Chess (1997). In H.J. van den Herik and J.W.H. Uiterwijk (eds.), Advances in Computer Chess 8, pp. 312-233, Universiteit Maastricht, The Netherlands.

  2. Baxter, J., Tridgell, A., and Weaver, L. A chess program that learns by combining TD(λ) with game tree search. In Proceedings of the 15th International Conference on Machine Learning (ICML-98), pages 28–36. Madison, WI. 1998. Morgan Kaufmann.

  3. Ballard, D. H. An Introduction to Natural Computation. Cambridge: MIT Press.

  4. Beal, D. F., & Smith, M.C. (1994). Random Evaluation in Chess. ICCA Journal, Vol. 17, No. 1, pp. 3–9 (A).

  5. Beal, D. F., & Smith, M.C. Learning Piece Values Using Temporal Differences. Journal of The International Computer Chess Association, September 1997.

  6. Beal, D. F., & Smith, M.C. First results from using temporal difference learning in Shogi. In H. J. van den Herik and H. Iida, editors, Proceedings of the First International Conference on Computers and Games (CG-98), volume 1558 of Lecture Notes in Computer Science, page 114, Tsukuba, Japan, 1998. Springer-Verlag.

  7. Bishop, Christopher M. Neural Networks for Pattern Recognition, Oxford Univ. Press, 1998. ISBN 0-19-853864-2.

  8. Bradtke, S. J., and Barto, A. G. (1996). Linear least-squares algorithms for temporal difference learning. Machine Learning, 22, 33–57.

  9. Christensen, J. and Korf, R. (1986). A unified theory of heuristic evaluation functions and its applications to learning. Proceedings of AAAI-86 (pp. 148–152).

  10. Fürnkranz, J., Machine learning in computer chess: The next generation. International Computer Chess Association Journal, 19(3):147–160, September (1996).

  11. Gherrity, M. A Game-Learning Machine. Ph.D. thesis. University of California, San Diego. San Diego, CA. 1993.

  12. Helmbold, D. P., Kivinen, J., and Warmuth, M. K. (1996a), Worst-case loss bounds for sigmoided linear neurons, in “Advances in Neural Information Processing Systems 8,” MIT Press, Cambridge, MA.

  13. Herik, H.J. van den. A New Research Scope. International Computer Chess Association Journal 21(4), 1998.

  14. Kivinen, J. and Warmuth, M. K. Additive versus exponentiated gradient updates for linear prediction. Information and Computation. Vol. 2, pp. 285–318, 1998.

  15. Levinson, R. A., and Snyder, R. (1991). Adaptive pattern-oriented chess. In L. Birnbaum and G. Collins (Eds.), Proceedings of the 8th International Workshop on Machine Learning, pp. 85–89, Morgan Kaufmann.

  16. Levinson, R. A., and Snyder, R., “Distance: Towards the Unification of Chess Knowledge”, International Computer Chess Association Journal 16(3): 123–136, September 1993.

  17. Levinson, R. A., and Weber, R. J. (2000). “Pattern-level Temporal Difference Learning, Data Fusion, and Chess”. In SPIE’s 14th Annual Conference on Aerospace/Defense Sensing and Controls: Sensor Fusion: Architectures, Algorithms, and Applications IV.

  18. Littlestone, N., Long, P.M., and Warmuth, M. K. (1995), On-line learning of linear functions, Journal of Computational Complexity 5, 1–23.

  19. Pearl, J. (1984). Heuristics: Intelligent Search Strategies for Computer Problem Solving. Addison-Wesley, Reading, Massachusetts.

  20. Pellen, Luke. Neural net chess program Octavius: http://home.seol.net.au/luke/Octavius (1999).

  21. Samuel, A. (1959). Some studies in machine learning using the game of checkers. IBM J. of Research and Development, 3, 210–229.

  22. Scott, J. Machine Learning in Games: the Morph Project, Swarthmore College, Swarthmore, PA. http://forum.swarthmore.edu/~jay/learn-game/projects/morph.html.

  23. Slate, D.J., A chess program that uses its transposition table to learn from experience. International Computer Chess Association Journal 10(2):59–71, 1987.

  24. Sutton, R. S. (1988). Learning to predict by the methods of temporal differences. Machine Learning, 3, 9–44.

  25. Sutton, R. S., & Barto, A.G. (1998). Reinforcement Learning: An Introduction. Cambridge: MIT Press.

  26. Tesauro, G. Temporal Difference Learning and TD-Gammon. Communications of the ACM, Vol 38, No 3, March 1995.

  27. Tesauro, G. Practical Issues in Temporal Difference Learning. Machine Learning, 8:257–278, 1992.

  28. Thrun, S., 1995. Learning to Play the Game of Chess. In Advances in Neural Information Processing Systems (NIPS) 7, G. Tesauro, D. Touretzky, and T. Leen (eds.), MIT Press.

  29. Widrow, B., and Stearns, S. (1985), “Adaptive Signal Processing,” Prentice Hall, Englewood Cliffs, NJ.

Copyright information

© 2001 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Levinson, R., Weber, R. (2001). Chess Neighborhoods, Function Combination, and Reinforcement Learning. In: Marsland, T., Frank, I. (eds) Computers and Games. CG 2000. Lecture Notes in Computer Science, vol 2063. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45579-5_9

  • DOI: https://doi.org/10.1007/3-540-45579-5_9

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-43080-3

  • Online ISBN: 978-3-540-45579-0

  • eBook Packages: Springer Book Archive
