
LS-VisionDraughts: improving the performance of an agent for checkers by integrating computational intelligence, reinforcement learning and a powerful search method

Published in Applied Intelligence.
Abstract

This paper presents LS-VisionDraughts: an efficient unsupervised evolutionary learning system for Checkers whose contribution is to automate the selection of an appropriate representation for the board states, by means of Evolutionary Computation, while keeping a deep look-ahead (search depth) at the moment of choosing a move. The player is a Multilayer Perceptron Neural Network whose weights are updated through an evaluation function automatically adjusted by Temporal Difference methods. A Genetic Algorithm automatically chooses a concise and efficient set of functions describing various scenarios associated with Checkers, called features, to represent the board states in the input layer of the Neural Network; that is, each individual of the Genetic Algorithm is a candidate feature set associated with a distinct Multilayer Perceptron Neural Network. The output layer of the Neural Network produces a real number (prediction) indicating to what extent the input state is favorable to the agent. In LS-VisionDraughts, a particular version of the Alpha-Beta search algorithm, called fail-soft Alpha-Beta, combined with transposition tables, iterative deepening and move ordering, uses this prediction to choose the best move for the current board state. The best individual is selected through numerous tournaments among these same Neural Networks. The architecture of LS-VisionDraughts is inspired by the agent NeuroDraughts; however, the former enhances the performance of the latter by automating feature selection through Evolutionary Computation and by replacing its Minimax search algorithm with the improved search strategy summarized above. This strategy reduces the search runtime by 95% and remarkably increases the search tree depth. The results obtained from evaluative tournaments confirm the advances of LS-VisionDraughts over its opponents. It is important to point out that LS-VisionDraughts learns practically without human supervision, contrary to the current automatic world champion Chinook, which was built in a strongly supervised manner.
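
To make the learning scheme concrete, the sketch below shows a TD(λ) weight update for a small evaluation network, in the spirit of the NeuroDraughts-style training described above. It is a minimal illustration, not the paper's implementation: the network sizes, the learning rate alpha, the trace decay lambda, and the names EvalMLP and td_lambda_update are all assumptions.

```python
import numpy as np

class EvalMLP:
    """Tiny Multilayer Perceptron mapping a feature vector that describes a
    board state to a scalar prediction in (-1, 1)."""
    def __init__(self, n_features, n_hidden, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(0.0, 0.1, (n_hidden, n_features))
        self.W2 = rng.normal(0.0, 0.1, n_hidden)

    def forward(self, x):
        self.x = x
        self.h = np.tanh(self.W1 @ x)        # hidden activations
        self.y = float(np.tanh(self.W2 @ self.h))
        return self.y

    def gradients(self):
        """Gradient of the scalar output w.r.t. both weight layers,
        using the activations cached by the last forward() call."""
        dy = 1.0 - self.y ** 2                                   # tanh'
        gW2 = dy * self.h
        gW1 = np.outer(dy * self.W2 * (1.0 - self.h ** 2), self.x)
        return gW1, gW2

def td_lambda_update(net, states, outcome, alpha=0.01, lam=0.7):
    """One TD(lambda) pass over the states of a finished self-play game:
    each prediction is nudged toward the next prediction, and the last one
    toward the final outcome in {-1.0, 0.0, +1.0}."""
    e1, e2 = np.zeros_like(net.W1), np.zeros_like(net.W2)  # eligibility traces
    for t in range(len(states)):
        pred = net.forward(states[t])        # P_t under the current weights
        gW1, gW2 = net.gradients()
        e1 = lam * e1 + gW1
        e2 = lam * e2 + gW2
        # Target: next prediction, or the game outcome at the terminal state.
        target = net.forward(states[t + 1]) if t + 1 < len(states) else outcome
        delta = target - pred                # TD error
        net.W1 += alpha * delta * e1
        net.W2 += alpha * delta * e2

# Usage with stand-in feature vectors for one 30-ply game won by the learner:
rng = np.random.default_rng(1)
net = EvalMLP(n_features=15, n_hidden=20)
td_lambda_update(net, [rng.random(15) for _ in range(30)], outcome=1.0)
```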
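
The feature-selection step can likewise be sketched as a plain Genetic Algorithm over bitmask chromosomes: each bit switches one candidate feature on or off, and fitness is obtained by training the associated network and playing tournaments, which is the role tournaments play in the system described above. The pool size, the operators and the train_and_tournament callback are hypothetical placeholders, not the paper's parameters.

```python
import random

N_FEATURES = 26   # illustrative pool size; the paper's feature pool may differ

def random_chromosome():
    """A candidate feature subset encoded as a bitmask over the pool."""
    return [random.randint(0, 1) for _ in range(N_FEATURES)]

def fitness(chromosome, train_and_tournament):
    """Train an MLP that sees only the selected features, then score it by
    tournament results (the callback's shape is an assumption)."""
    selected = [i for i, bit in enumerate(chromosome) if bit]
    return train_and_tournament(selected)

def evolve(train_and_tournament, pop_size=20, generations=50, p_mut=0.05):
    pop = [random_chromosome() for _ in range(pop_size)]
    for _ in range(generations):
        # Rank by fitness; a real run would cache these expensive evaluations.
        ranked = sorted(pop, key=lambda c: fitness(c, train_and_tournament),
                        reverse=True)
        elite = ranked[: pop_size // 2]              # truncation selection
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = random.sample(elite, 2)
            cut = random.randrange(1, N_FEATURES)    # one-point crossover
            child = a[:cut] + b[cut:]
            # Bit-flip mutation with probability p_mut per gene.
            children.append([bit ^ (random.random() < p_mut) for bit in child])
        pop = elite + children
    return max(pop, key=lambda c: fitness(c, train_and_tournament))
```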
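
On the search side, the sketch below combines fail-soft Alpha-Beta (in negamax form) with a transposition table and iterative deepening. "Fail-soft" means the returned value may fall outside the [alpha, beta] window, which lets the table store tighter bounds for later probes. The game interface (key, is_terminal, evaluate, ordered_moves, apply) is assumed for illustration, not taken from the paper.

```python
from dataclasses import dataclass

EXACT, LOWER, UPPER = 0, 1, 2   # bound type stored with each table entry

@dataclass
class TTEntry:
    depth: int
    value: float
    flag: int

def fail_soft_alphabeta(state, depth, alpha, beta, game, tt):
    """Fail-soft Alpha-Beta in negamax form with transposition-table probes."""
    entry = tt.get(game.key(state))
    if entry and entry.depth >= depth:
        if entry.flag == EXACT:
            return entry.value
        if entry.flag == LOWER and entry.value >= beta:
            return entry.value
        if entry.flag == UPPER and entry.value <= alpha:
            return entry.value
    if depth == 0 or game.is_terminal(state):
        return game.evaluate(state)          # the MLP prediction at the leaf
    best, a = float('-inf'), alpha
    for move in game.ordered_moves(state):   # good ordering prunes far more
        child = game.apply(state, move)
        best = max(best, -fail_soft_alphabeta(child, depth - 1, -beta, -a,
                                              game, tt))
        a = max(a, best)
        if a >= beta:
            break                            # beta cutoff
    flag = EXACT if alpha < best < beta else (LOWER if best >= beta else UPPER)
    tt[game.key(state)] = TTEntry(depth, best, flag)
    return best

def iterative_deepening(state, max_depth, game):
    """Search depths 1..max_depth, reusing the table so the bounds found by
    shallow iterations speed up (and help order) the deeper ones."""
    tt, value = {}, 0.0
    for d in range(1, max_depth + 1):
        value = fail_soft_alphabeta(state, d, float('-inf'), float('inf'),
                                    game, tt)
    return value
```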


References

  1. Abdoos M, Mozayani N, Bazzan ALC (2014) Hierarchical control of traffic signals using q-learning with tile coding. Appl Intell 40(2):201–213

  2. Al-Khateeb B, Kendall G (2012) Effect of look-ahead depth in evolutionary checkers. J Comput Sci Tech 27(5):996–1006

  3. Al-Khateeb B, Kendall G (2012) Introducing individual and social learning into evolutionary checkers. IEEE Trans Comput Intell AI Games 258–269

  4. Barcelos ARA, Julia RMS, Matias R Jr (2011) D-visiondraughts: a draughts player neural network that learns by reinforcement in a high performance environment. Eur Symp Artif Neural Netw Comput Intell Mach Learn

  5. Baxter J, Tridgell A, Weaver L (1998) Knightcap: a chess program that learns by combining TD(λ) with game-tree search. In: Proceedings 15th international conference on machine learning. Morgan Kaufmann, San Francisco, pp 29–37

  6. Breuker D, Uiterwijk J, Herik H (1994) Replacement schemes for transposition tables. Technical report. Available in: http://citeseer.ist.psu.edu/112066.html

  7. Caexeta GS, Julia RMS (2008) A draughts learning system based on neural networks and temporal differences: the impact of an efficient tree-search algorithm. In: The 19th Brazilian symposium on artificial intelligence, SBIA, LNAI series of Springer-Verlag

  8. Campos P, Langlois T (2003) Abalearn: Efficient self-play learning of the game abalone. In: INESC-ID, neural networks and signal processing group

  9. Cheheltani SH, Ebadzadeh MM (2012) Immune based fuzzy agent plays checkers game. Appl Soft Comput 12(8):2227–2236

  10. Darwen PJ (2001) Why co-evolution beats temporal difference learning at backgammon for a linear architecture, but not a non-linear architecture. In: Proceedings of the 2001 congress on evolutionary computation CEC2001. IEEE Press, pp 1003– 1010

  11. Derrac J, Garcia S, Molina D, Herrera F (2011) A practical tutorial on the use of nonparametric statistical tests as a methodology for comparing evolutionary and swarm intelligence algorithms. Swarm Evol Comput 1(1):3–18

  12. Duarte VAR, Julia RMS (2009) Mp-draughts: a multiagent reinforcement learning system based on MLP and Kohonen-SOM neural networks. In: IEEE international conference on systems, man and cybernetics, pp 2270–2275

  13. Dutta PK (1999) Strategies and games: theory and practice. MIT Press, Cambridge

  14. Eiben AE, Smith JE (2003) Introduction to evolutionary computing. Springer

  15. Epstein S (2001) Learning to play expertly: a tutorial on hoyle. In: Machines that learn to play games. Nova Science Publishers, Huntington

  16. Fierz MC (2008) Cake 1.85. Technical report. Available in: http://www.fierz.ch/checkers.htm

  17. Fierz MC (2012) Checkerboard program - version 1.72. Technical report. Available in http://www.fierz.ch/checkerboard.php

  18. Fogel DB (2002) Blondie24: playing at the edge of AI. Morgan Kaufmann, San Francisco

  19. Fogel DB, Chellapilla K (2001) Verifying anaconda’s expert rating by competing against chinook: experiments in co-evolving a neural checkers player. Neurocomputing 42(1–4):69–86

  20. Fogel DB, Hays TJ, Hahn SL, Quon J (2004) A self-learning evolutionary chess program. Proc IEEE 92(12):1947–1954

  21. Gilbert E (2000) Kingsrow. Technical report. Available in: http://edgilbert.org/Checkers/KingsRow.htm

  22. Haykin S (1998) Neural networks: a comprehensive foundation, 2nd edn. Prentice Hall

  23. Herik HJV, Uiterwijk JW, Rijswijck JV (2002) Games solved: now and in the future. Artif Intell 134(1-2):277–311

  24. Holland JH (1992) Adaptation in natural and artificial systems: an introductory analysis with applications to biology, control, and artificial intelligence, 2nd edn. MIT Press

  25. Kim KJ, Cho SB (2005) Systematically incorporating domain-specific knowledge into evolutionary speciated checkers players. IEEE Trans Evol Comput 9(6):615–627

  26. Kortenkamp D, Bonasso RP, Murphy R (1998) AI-based mobile robots: case studies of successful robot systems. MIT Press

  27. Leouski A (1995) Learning of position evaluation in the game of othello. Technical Report. Available in: http://people.ict.usc.edu/leuski/publications

  28. Levinson R, Weber R (2002) Chess neighborhoods, function combination, and reinforcement learning. In: Revised papers from the second international conference on computers and games. Springer, London

  29. Lynch M (1997) An application of temporal difference learning to draughts. Master’s Thesis, University of Limerick, Ireland

  30. Lynch M, Griffith N (1997) Neurodraughts: the role of representation, search, training regime and architecture in a td draughts player. In: 8th Ireland conference on artificial intelligence, pp 67–72

  31. Marsland TA (1986) A review of game-tree pruning. In: International computer chess association journal, pp 3–19

  32. McCarthy JL, Feigenbaum EA (1990) In memoriam - Arthur Samuel: pioneer in machine learning. AI Mag 11(3):10–11

  33. McCulloch W, Pitts W (1943) A logical calculus of the ideas immanent in nervous activity. Bull Math Biophys 5:115–133

  34. Mendonca M, Arruda LVR, Junior FN (2012) Autonomous navigation system using event driven-fuzzy cognitive maps. Appl Intell 37(2):175–188

  35. Millington I (2006) Artificial intelligence for games. Morgan Kaufmann, San Francisco

  36. Neto HC, Julia RMS (2007) Ls-draughts - a draughts learning system based on genetic algorithms, neural network and temporal differences. In: Proceedings of the IEEE congress on evolutionary computation, CEC 2007. Singapore, pp 2523– 2529

  37. Neto HC, Julia RMS, Caexeta GS (2009) LS-Draughts: using databases to treat endgame loop in a hybrid evolutionary learning system. In: Theory and novel applications of machine learning. I-Tech Education and Publishing

  38. Neumann JV, Morgenstern O (1944) Theory of games and economic behavior. Princeton University Press. Available in: http://en.wikipedia.org/wiki/Theory_of_Games_and_Economic_Behavior

  39. Plaat A (1996) Research re: search & re-search. Ph.D. Thesis, Rotterdam, The Netherlands

  40. Plaat A, Schaeffer J, Pijls W, Bruin A (1995) A new paradigm for minimax search

  41. Ribeiro CHC, Monteiro ST (2003) Navigation learning in map building for autonomous mobile robots. In: 4th National meeting of artificial intelligence (ENIA). Article title translated from original version in Portuguese

  42. Richards N, Moriarty DE, Miikkulainen R (1998) Evolving neural networks to play go. Appl Intell 8(1):85–96

  43. Russell S, Norvig P (2003) Artificial intelligence: a modern approach, 2nd edn. Prentice Hall

  44. Salcedo-Sanz S, Matias-Roman JM, Jimenez-Fernandez S, Portilla-Figueras A, Cuadra L (2013) An evolutionary-based hyper-heuristic approach for the jawbreaker puzzle. Appl Intell 1–11

  45. Samuel AL (1959) Some studies in machine learning using the game of checkers. IBM J Res Dev 3(3):210–229

  46. Samuel AL (1967) Some studies in machine learning using the game of checkers II. IBM J Res Dev 11(6):601–617

  47. Schaeffer J (2002) Applying the experience of building a high performance search engine for one domain to another

  48. Schaeffer J, Hlynka M, Jussila V (2001) Temporal difference learning applied to a high performance game-playing program. In: International joint conference on artificial intelligence, pp 529–534

  49. Schaeffer J, Lake R, Lu P, Bryant M (1996) Chinook: the world man-machine checkers champion. AI Mag 17(1):21–30

  50. Schraudolph NN, Dayan P, Sejnowski TJ (2001) Learning to evaluate go positions via temporal difference methods. In: Computational intelligence in games studies in fuzziness and soft computing, vol 62. Springer

  51. Shams R, Kaindl H, Horacek H (1991) Using aspiration windows for minimax algorithms. In: Proceedings of the 12th international joint conference on artificial intelligence. Morgan Kaufmann, pp 192–197

  52. Silver D, Sutton RS, Muller M (2012) Temporal-difference search in computer go. Mach Learn 87(2):183–219

  53. Slate DJ, Atkin LR (1977) Chess skill in man and machine. Springer

  54. Stevanovic R (2007) Quantum random bit generator service. Technical report. Available in: http://random.irb.hr

  55. Sutton RS (1988) Learning to predict by the methods of temporal differences. Mach Learn 3(1):9–44

  56. Tesauro G (1992) Practical issues in temporal difference learning. In: Advances in neural information processing systems, vol 4. Morgan Kaufmann, pp 259–266

  57. Tesauro G (1994) Td-gammon, a self-teaching backgammon program, achieves master-level play. Neural Comput 6(2):215–219

  58. Tesauro G (1995) Temporal difference learning and td-gammon. Commun ACM 38(3):58–68

  59. Thrun S (1995) Learning to play the game of chess. In: Advances in neural information processing systems 7. The MIT Press, pp 1069–1076

  60. Tucker AW (1950) Prisoner's dilemma problem. Technical report. Available in: http://www.answers.com/topic/prisoner-s-dilemma

  61. Walker MA (2000) An application of reinforcement learning to dialogue strategy in a spoken dialogue system for email. Artif Intell Res 12:387–416

  62. Burgard W, Fox D, Jans H, Matenar C, Thrun S (1999) Sonar-based mapping with mobile robots using EM. In: Proceedings 16th international conference on machine learning

  63. Wiering M (2000) Multi-agent reinforcement learning for traffic light control. In: 17th international conference on machine learning, pp 1151–1158

  64. Zobrist AL (1969) A hashing method with applications for game playing. Technical report

Author information

Corresponding author

Correspondence to Henrique Castro Neto.


Cite this article

Neto, H.C., Julia, R.M.S., Caexeta, G.S. et al. LS-VisionDraughts: improving the performance of an agent for checkers by integrating computational intelligence, reinforcement learning and a powerful search method. Appl Intell 41, 525–550 (2014). https://doi.org/10.1007/s10489-014-0536-y
