
LS-VisionDraughts: improving the performance of an agent for checkers by integrating computational intelligence, reinforcement learning and a powerful search method

Published in Applied Intelligence.
Abstract

This paper presents LS-VisionDraughts: an efficient unsupervised evolutionary learning system for Checkers whose contribution is to automate the selection of an appropriate representation for the board states, by means of Evolutionary Computation, while keeping a deep look-ahead (search depth) at the moment of choosing a move. The player is a Multilayer Perceptron Neural Network whose weights are updated through an evaluation function automatically adjusted by Temporal Difference methods. A Genetic Algorithm automatically chooses a concise and efficient set of functions describing various scenarios associated with Checkers, called features, to represent the board states in the input layer of the Neural Network; that is, each individual of the Genetic Algorithm is a candidate feature set associated with a distinct Multilayer Perceptron Neural Network. The output layer of the Neural Network produces a real number (prediction) indicating to what extent the input state is favorable to the agent. In LS-VisionDraughts, a particular version of the Alpha-Beta search algorithm, called fail-soft Alpha-Beta, combined with transposition tables, iterative deepening and move ordering, uses this prediction to choose the best move for the current board state. The best individual is selected through numerous tournaments among these same Neural Networks. The architecture of LS-VisionDraughts is inspired by the agent NeuroDraughts; however, the former enhances the performance of the latter by automating feature selection through Evolutionary Computation and by replacing its Minimax search algorithm with the improved search strategy summarized above. This strategy reduces the search runtime by 95% and remarkably increases the search tree depth. The results obtained from evaluative tournaments confirm the advances of LS-VisionDraughts over its opponents. It is important to point out that LS-VisionDraughts learns practically without human supervision, contrary to the current automatic world champion Chinook, which was built in a strongly supervised manner.
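
To make the learning scheme concrete, the sketch below shows a TD(λ) weight update for a small evaluation network, in the spirit of the NeuroDraughts-style training described above. It is a minimal illustration, not the paper's implementation: the network sizes, the learning rate alpha, the trace decay lambda, and the names EvalMLP and td_lambda_update are all assumptions.

```python
import numpy as np

class EvalMLP:
    """Tiny Multilayer Perceptron mapping a feature vector that describes a
    board state to a scalar prediction in (-1, 1)."""
    def __init__(self, n_features, n_hidden, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(0.0, 0.1, (n_hidden, n_features))
        self.W2 = rng.normal(0.0, 0.1, n_hidden)

    def forward(self, x):
        self.x = x
        self.h = np.tanh(self.W1 @ x)        # hidden activations
        self.y = float(np.tanh(self.W2 @ self.h))
        return self.y

    def gradients(self):
        """Gradient of the scalar output w.r.t. both weight layers,
        using the activations cached by the last forward() call."""
        dy = 1.0 - self.y ** 2                                   # tanh'
        gW2 = dy * self.h
        gW1 = np.outer(dy * self.W2 * (1.0 - self.h ** 2), self.x)
        return gW1, gW2

def td_lambda_update(net, states, outcome, alpha=0.01, lam=0.7):
    """One TD(lambda) pass over the states of a finished self-play game:
    each prediction is nudged toward the next prediction, and the last one
    toward the final outcome in {-1.0, 0.0, +1.0}."""
    e1, e2 = np.zeros_like(net.W1), np.zeros_like(net.W2)  # eligibility traces
    for t in range(len(states)):
        pred = net.forward(states[t])        # P_t under the current weights
        gW1, gW2 = net.gradients()
        e1 = lam * e1 + gW1
        e2 = lam * e2 + gW2
        # Target: next prediction, or the game outcome at the terminal state.
        target = net.forward(states[t + 1]) if t + 1 < len(states) else outcome
        delta = target - pred                # TD error
        net.W1 += alpha * delta * e1
        net.W2 += alpha * delta * e2

# Usage with stand-in feature vectors for one 30-ply game won by the learner:
rng = np.random.default_rng(1)
net = EvalMLP(n_features=15, n_hidden=20)
td_lambda_update(net, [rng.random(15) for _ in range(30)], outcome=1.0)
```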
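
The feature-selection step can likewise be sketched as a plain Genetic Algorithm over bitmask chromosomes: each bit switches one candidate feature on or off, and fitness is obtained by training the associated network and playing tournaments, which is the role tournaments play in the system described above. The pool size, the operators and the train_and_tournament callback are hypothetical placeholders, not the paper's parameters.

```python
import random

N_FEATURES = 26   # illustrative pool size; the paper's feature pool may differ

def random_chromosome():
    """A candidate feature subset encoded as a bitmask over the pool."""
    return [random.randint(0, 1) for _ in range(N_FEATURES)]

def fitness(chromosome, train_and_tournament):
    """Train an MLP that sees only the selected features, then score it by
    tournament results (the callback's shape is an assumption)."""
    selected = [i for i, bit in enumerate(chromosome) if bit]
    return train_and_tournament(selected)

def evolve(train_and_tournament, pop_size=20, generations=50, p_mut=0.05):
    pop = [random_chromosome() for _ in range(pop_size)]
    for _ in range(generations):
        # Rank by fitness; a real run would cache these expensive evaluations.
        ranked = sorted(pop, key=lambda c: fitness(c, train_and_tournament),
                        reverse=True)
        elite = ranked[: pop_size // 2]              # truncation selection
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = random.sample(elite, 2)
            cut = random.randrange(1, N_FEATURES)    # one-point crossover
            child = a[:cut] + b[cut:]
            # Bit-flip mutation with probability p_mut per gene.
            children.append([bit ^ (random.random() < p_mut) for bit in child])
        pop = elite + children
    return max(pop, key=lambda c: fitness(c, train_and_tournament))
```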
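
On the search side, the sketch below combines fail-soft Alpha-Beta (in negamax form) with a transposition table and iterative deepening. "Fail-soft" means the returned value may fall outside the [alpha, beta] window, which lets the table store tighter bounds for later probes. The game interface (key, is_terminal, evaluate, ordered_moves, apply) is assumed for illustration, not taken from the paper.

```python
from dataclasses import dataclass

EXACT, LOWER, UPPER = 0, 1, 2   # bound type stored with each table entry

@dataclass
class TTEntry:
    depth: int
    value: float
    flag: int

def fail_soft_alphabeta(state, depth, alpha, beta, game, tt):
    """Fail-soft Alpha-Beta in negamax form with transposition-table probes."""
    entry = tt.get(game.key(state))
    if entry and entry.depth >= depth:
        if entry.flag == EXACT:
            return entry.value
        if entry.flag == LOWER and entry.value >= beta:
            return entry.value
        if entry.flag == UPPER and entry.value <= alpha:
            return entry.value
    if depth == 0 or game.is_terminal(state):
        return game.evaluate(state)          # the MLP prediction at the leaf
    best, a = float('-inf'), alpha
    for move in game.ordered_moves(state):   # good ordering prunes far more
        child = game.apply(state, move)
        best = max(best, -fail_soft_alphabeta(child, depth - 1, -beta, -a,
                                              game, tt))
        a = max(a, best)
        if a >= beta:
            break                            # beta cutoff
    flag = EXACT if alpha < best < beta else (LOWER if best >= beta else UPPER)
    tt[game.key(state)] = TTEntry(depth, best, flag)
    return best

def iterative_deepening(state, max_depth, game):
    """Search depths 1..max_depth, reusing the table so the bounds found by
    shallow iterations speed up (and help order) the deeper ones."""
    tt, value = {}, 0.0
    for d in range(1, max_depth + 1):
        value = fail_soft_alphabeta(state, d, float('-inf'), float('inf'),
                                    game, tt)
    return value
```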


References

  1. Abdoos M, Mozayani N, Bazzan ALC (2014) Hierarchical control of traffic signals using q-learning with tile coding. Appl Intell 40(2):201–213

  2. Al-Khateeb B, Kendall G (2012) Effect of look-ahead depth in evolutionary checkers. J Comput Sci Tech 27(5):996–1006

  3. Al-Khateeb B, Kendall G (2012) Introducing individual and social learning into evolutionary checkers. IEEE Trans Comput Intell AI Games 258–269

  4. Barcelos ARA, Julia RMS, Matias R Jr (2011) D-visiondraughts: a draughts player neural network that learns by reinforcement in a high performance environment. Eur Symp Artif Neural Netw Comput Intell Mach Learn

  5. Baxter J, Tridgell A, Weaver L (1998) Knightcap: a chess program that learns by combining TD(λ) with game-tree search. In: Proceedings 15th international conference on machine learning. Morgan Kaufmann, San Francisco, pp 29–37

  6. Breuker D, Uiterwijk J, Herik H (1994) Replacement schemes for transposition tables. Technical report. Available in: http://citeseer.ist.psu.edu/112066.html

  7. Caexeta GS, Julia RMS (2008) A draughts learning system based on neural networks and temporal differences: the impact of an efficient tree-search algorithm. In: The 19th Brazilian symposium on artificial intelligence, SBIA, LNAI series of Springer-Verlag

  8. Campos P, Langlois T (2003) Abalearn: Efficient self-play learning of the game abalone. In: INESC-ID, neural networks and signal processing group

  9. Cheheltani SH, Ebadzadeh MM (2012) Immune based fuzzy agent plays checkers game. Appl Soft Comput 12(8):2227–2236

  10. Darwen PJ (2001) Why co-evolution beats temporal difference learning at backgammon for a linear architecture, but not a non-linear architecture. In: Proceedings of the 2001 congress on evolutionary computation CEC2001. IEEE Press, pp 1003– 1010

  11. Derrac J, Garcia S, Molina D, Herrera F (2011) A practical tutorial on the use of nonparametric statistical tests as a methodology for comparing evolutionary and swarm intelligence algorithms. Swarm Evol Comput 1(1):3–18

  12. Duarte VAR, Julia RMS (2009) Mp-draughts: a multiagent reinforcement learning system based on MLP and Kohonen-SOM neural networks. In: IEEE international conference on systems, man and cybernetics, pp 2270–2275

  13. Dutta PK (1999) Strategies and games: theory and practice. MIT Press, Cambridge

  14. Eiben AE, Smith JE (2003) Introduction to evolutionary computing. Springer

  15. Epstein S (2001) Learning to play expertly: a tutorial on hoyle. In: Machines that learn to play games. Nova Science Publishers, Huntington

  16. Fierz MC (2008) Cake 1.85. Technical report. Available in: http://www.fierz.ch/checkers.htm

  17. Fierz MC (2012) Checkerboard program - version 1.72. Technical report. Available in http://www.fierz.ch/checkerboard.php

  18. Fogel DB (2002) Blondie24: playing at the edge of AI. Morgan Kaufmann, San Francisco

  19. Fogel DB, Chellapilla K (2001) Verifying anaconda’s expert rating by competing against chinook: experiments in co-evolving a neural checkers player. Neurocomputing 42(1–4):69–86

  20. Fogel DB, Hays TJ, Hahn SL, Quon J (2004) A self-learning evolutionary chess program. Proc IEEE 92(12):1947–1954

  21. Gilbert E (2000) Kingsrow. Technical report. Available in: http://edgilbert.org/Checkers/KingsRow.htm

  22. Haykin S (1998) Neural networks: a comprehensive foundation, 2nd edn. Prentice Hall

  23. Herik HJV, Uiterwijk JW, Rijswijck JV (2002) Games solved: now and in the future. Artif Intell 134(1-2):277–311

  24. Holland JH (1992) Adaptation in natural and artificial systems: an introductory analysis with applications to biology, control, and artificial intelligence, 2nd edn. MIT Press

  25. Kim KJ, Cho SB (2005) Systematically incorporating domain-specific knowledge into evolutionary speciated checkers players. IEEE Trans Evol Comput 9(6):615–627

  26. Kortenkamp D, Bonasso RP, Murphy R (1998) AI-based mobile robots: case studies of successful robot systems. MIT Press

  27. Leouski A (1995) Learning of position evaluation in the game of othello. Technical Report. Available in: http://people.ict.usc.edu/leuski/publications

  28. Levinson R, Weber R (2002) Chess neighborhoods, function combination, and reinforcement learning. In: Revised papers from the second international conference on computers and games. Springer, London

  29. Lynch M (1997) An application of temporal difference learning to draughts. Master’s Thesis, University of Limerick, Ireland

  30. Lynch M, Griffith N (1997) Neurodraughts: the role of representation, search, training regime and architecture in a td draughts player. In: 8th Ireland conference on artificial intelligence, pp 67–72

  31. Marsland TA (1986) A review of game-tree pruning. In: International computer chess association journal, pp 3–19

  32. McCarthy JL, Feigenbaum EA (1990) In memoriam - Arthur Samuel: pioneer in machine learning. AI Mag 11(3):10–11

  33. McCulloch W, Pitts W (1943) A logical calculus of the ideas immanent in nervous activity. Bull Math Biophys 5:115–133

  34. Mendonca M, Arruda LVR, Junior FN (2012) Autonomous navigation system using event driven-fuzzy cognitive maps. Appl Intell 37(2):175–188

  35. Millington I (2006) Artificial intelligence for games. Morgan Kaufmann, San Francisco

  36. Neto HC, Julia RMS (2007) Ls-draughts - a draughts learning system based on genetic algorithms, neural network and temporal differences. In: Proceedings of the IEEE congress on evolutionary computation, CEC 2007. Singapore, pp 2523– 2529

  37. Neto HC, Julia RMS, Caexeta GS (2009) LS-Draughts: using databases to treat endgame loop in a hybrid evolutionary learning system. In: Theory and novel applications of machine learning. I-Tech Education and Publishing

  38. Neumann JV, Morgenstern O (1944) Theory of games and economic behavior. Princeton University Press. Available in: http://en.wikipedia.org/wiki/Theory_of_Games_and_Economic_Behavior

  39. Plaat A (1996) Research re: search & re-search. Ph.D. Thesis, Rotterdam, The Netherlands

  40. Plaat A, Schaeffer J, Pijls W, Bruin A (1995) A new paradigm for minimax search

  41. Ribeiro CHC, Monteiro ST (2003) Navigation learning in map building for autonomous mobile robots. In: 4th National meeting of artificial intelligence (ENIA). Article title translated from original version in Portuguese

  42. Richards N, Moriarty DE, Miikkulainen R (1998) Evolving neural networks to play go. Appl Intell 8(1):85–96

  43. Russell S, Norvig P (2003) Artificial intelligence: a modern approach, 2nd edn. Prentice Hall

  44. Salcedo-Sanz S, Matias-Roman JM, Jimenez-Fernandez S, Portilla-Figueras A, Cuadra L (2013) An evolutionary-based hyper-heuristic approach for the jawbreaker puzzle. Appl Intell 1–11

  45. Samuel AL (1959) Some studies in machine learning using the game of checkers. IBM J Res Dev 3(3):210–229

  46. Samuel AL (1967) Some studies in machine learning using the game of checkers II. IBM J Res Dev 11(6):601–617

  47. Schaeffer J (2002) Applying the experience of building a high performance search engine for one domain to another

  48. Schaeffer J, Hlynka M, Jussila V (2001) Temporal difference learning applied to a high performance game-playing program. In: International joint conference on artificial intelligence, pp 529–534

  49. Schaeffer J, Lake R, Lu P, Bryant M (1996) Chinook: the world man-machine checkers champion. AI Mag 17(1):21–30

  50. Schraudolph NN, Dayan P, Sejnowski TJ (2001) Learning to evaluate go positions via temporal difference methods. In: Computational intelligence in games studies in fuzziness and soft computing, vol 62. Springer

  51. Shams R, Kaindl H, Horacek H (1991) Using aspiration windows for minimax algorithms. In: Proceedings of the 12th international joint conference on artificial intelligence. Morgan Kaufmann, pp 192–197

  52. Silver D, Sutton RS, Muller M (2012) Temporal-difference search in computer go. Mach Learn 87(2):183–219

  53. Slate DJ, Atkin LR (1977) Chess skill in man and machine. Springer

  54. Stevanovic R (2007) Quantum random bit generator service. Technical report. Available in: http://random.irb.hr

  55. Sutton RS (1988) Learning to predict by the methods of temporal differences. Mach Learn 3(1):9–44

  56. Tesauro G (1992) Practical issues in temporal difference learning. In: Advances in neural information processing systems, vol 4. Morgan Kaufmann, pp 259–266

  57. Tesauro G (1994) Td-gammon, a self-teaching backgammon program, achieves master-level play. Neural Comput 6(2):215–219

  58. Tesauro G (1995) Temporal difference learning and td-gammon. Commun ACM 38(3):58–68

  59. Thrun S (1995) Learning to play the game of chess. In: Advances in neural information processing systems 7. The MIT Press, pp 1069–1076

  60. Tucker AW (1950) Prisoner's dilemma problem. Technical report. Available in: http://www.answers.com/topic/prisoner-s-dilemma

  61. Walker MA (2000) An application of reinforcement learning to dialogue strategy in a spoken dialogue system for email. Artif Intell Res 12:387–416

  62. Burgard W, Fox D, Jans H, Matenar C, Thrun S (1999) Sonar-based mapping with mobile robots using EM. In: Proceedings 16th international conference on machine learning

  63. Wiering M (2000) Multi-agent reinforcement learning for traffic light control. In: 17th international conference on machine learning, pp 1151–1158

  64. Zobrist AL (1969) A hashing method with applications for game playing. Technical report

Author information

Corresponding author

Correspondence to Henrique Castro Neto.


Cite this article

Neto, H.C., Julia, R.M.S., Caexeta, G.S. et al. LS-VisionDraughts: improving the performance of an agent for checkers by integrating computational intelligence, reinforcement learning and a powerful search method. Appl Intell 41, 525–550 (2014). https://doi.org/10.1007/s10489-014-0536-y
