
A multiagent player system composed by expert agents in specific game stages operating in high performance environment


Abstract

This paper proposes a new approach to the unsupervised learning of multiagent player systems operating in a high-performance environment, in which cooperative agents are trained to become experts in specific stages of a game. The proposal is implemented in the automatic Checkers player named D-MA-Draughts, which comprises 26 agents: the first specializes in the initial and intermediate game stages, while the remaining 25 specialize in endgame stages (defined as boards containing at most 12 pieces). Each agent is a multilayer neural network trained without human supervision through temporal difference methods, and the best move is determined by the distributed search algorithm known as Young Brothers Wait Concept. Each endgame agent chooses moves for a particular profile of endgame board; these profiles are defined by a clustering process performed by a Kohonen-SOM network on a database of endgame boards retrieved from real matches. Once trained, the D-MA-Draughts agents can act in a match according to two distinct game dynamics. The D-MA-Draughts architecture extends two earlier systems: MP-Draughts, a multiagent system with a serial search algorithm, and D-VisionDraughts, a single agent with a distributed search algorithm. The gains of D-MA-Draughts are estimated through several tournaments against these predecessors. The results show that D-MA-Draughts improves upon them by significantly reducing training time and endgame loops, beating them in several tournaments.
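The abstract names two core techniques: temporal difference learning for the evaluation networks and Kohonen-SOM clustering for the endgame profiles. As background, the classical TD(λ) weight update (not necessarily the exact rule used in D-MA-Draughts) adjusts the network weights $w$ so that successive board evaluations $P_t$ become self-consistent:

$$\Delta w_t = \alpha \, (P_{t+1} - P_t) \sum_{k=1}^{t} \lambda^{t-k} \, \nabla_w P_k,$$

where $\alpha$ is the learning rate and $\lambda \in [0, 1]$ controls how much credit earlier predictions receive. The profile-definition step can be illustrated with a minimal Kohonen-SOM sketch in Python/NumPy; the grid size, board encoding, and hyperparameters below are illustrative assumptions and do not reproduce the paper's actual configuration.

```python
import numpy as np

def train_som(boards, grid_h=5, grid_w=5, epochs=50, lr0=0.5, sigma0=2.0):
    """Cluster endgame board vectors into grid_h x grid_w profiles with a
    self-organizing map (simplified sketch; all hyperparameters here are
    illustrative, not the paper's actual configuration)."""
    n_features = boards.shape[1]
    rng = np.random.default_rng(seed=0)
    weights = rng.normal(size=(grid_h, grid_w, n_features))
    ys, xs = np.mgrid[0:grid_h, 0:grid_w]  # grid coordinates of each neuron
    for epoch in range(epochs):
        lr = lr0 * np.exp(-epoch / epochs)        # decaying learning rate
        sigma = sigma0 * np.exp(-epoch / epochs)  # shrinking neighborhood
        for x in boards:
            # Best-matching unit (BMU): neuron whose weights are closest to x.
            dists = np.linalg.norm(weights - x, axis=2)
            by, bx = np.unravel_index(np.argmin(dists), dists.shape)
            # A Gaussian neighborhood centered on the BMU pulls nearby
            # neurons toward the input vector.
            g = np.exp(-((ys - by) ** 2 + (xs - bx) ** 2) / (2 * sigma ** 2))
            weights += lr * g[..., None] * (x - weights)
    return weights

def profile_of(board, weights):
    """Return the grid index of the profile (BMU) a board belongs to."""
    dists = np.linalg.norm(weights - board, axis=2)
    return np.unravel_index(np.argmin(dists), dists.shape)

# Example with placeholder data: 200 random 32-square board encodings.
boards = np.random.default_rng(1).normal(size=(200, 32))
som = train_som(boards)
print(profile_of(boards[0], som))
```

Conceptually, D-MA-Draughts trains such a map on endgame boards drawn from real matches; during play, a position with at most 12 pieces would be mapped to its best-matching profile and handed to the endgame agent trained on that cluster.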



Author information


Correspondence to Lidia Bononi Paiva Tomaz.


About this article


Cite this article

Paiva Tomaz, L.B., Silva Julia, R.M. & Duarte, V.A. A multiagent player system composed by expert agents in specific game stages operating in high performance environment. Appl Intell 48, 1–22 (2018). https://doi.org/10.1007/s10489-017-0952-x

