Abstract
This paper proposes a new approach to the unsupervised learning of multiagent player systems operating in a high performance environment, in which cooperative agents are trained to be experts in specific stages of a game. The proposal is implemented in an automatic Checkers player called D-MA-Draughts, which is composed of 26 agents. The first is specialized in the initial and intermediate game stages, whereas the remaining 25 are specialists in endgame stages (defined as boards containing at most 12 pieces). Each agent consists of a multilayer neural network trained without human supervision through temporal difference methods. The best move is determined by the distributed search algorithm known as Young Brothers Wait Concept. Each endgame agent chooses moves for a particular profile of endgame board. These profiles are defined by a clustering process performed by a Kohonen-SOM network over a database of endgame boards retrieved from real matches. Once trained, the D-MA-Draughts agents can act in a match according to two distinct game dynamics. The D-MA-Draughts architecture extends two earlier versions: MP-Draughts, a multiagent system with a serial search algorithm, and D-VisionDraughts, a single agent with a distributed search algorithm. The gains of D-MA-Draughts are estimated through several tournaments against these earlier versions. The results show that D-MA-Draughts improves upon its predecessors by significantly reducing training time and endgame loops, beating them in several tournaments.
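The endgame-specialist selection described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the board encoding, the cluster count, the learning rate, and all function names are assumptions. It shows the core Kohonen-SOM mechanics, namely finding the best matching unit (BMU) to map an endgame board to one of the 25 specialist agents, and the prototype update used during clustering.

```python
import numpy as np

rng = np.random.default_rng(0)

N_FEATURES = 32   # one entry per playable square (illustrative encoding)
N_CLUSTERS = 25   # one prototype per endgame specialist agent

# Prototype vectors of the SOM, one per cluster (here randomly initialized;
# in the paper they result from training on real endgame boards).
prototypes = rng.normal(size=(N_CLUSTERS, N_FEATURES))

def encode_board(board):
    """Encode a board (sequence of 32 square values) as a float vector."""
    return np.asarray(board, dtype=float)

def select_endgame_agent(board):
    """Return the index of the specialist agent whose SOM prototype
    (best matching unit) is closest to the encoded board."""
    x = encode_board(board)
    distances = np.linalg.norm(prototypes - x, axis=1)
    return int(np.argmin(distances))

def som_train_step(x, lr=0.1):
    """One Kohonen update: move the best matching unit toward the sample.
    (A full SOM would also update the BMU's lattice neighbors.)"""
    bmu = int(np.argmin(np.linalg.norm(prototypes - x, axis=1)))
    prototypes[bmu] += lr * (x - prototypes[bmu])
    return bmu
```

Once the prototypes are trained, `select_endgame_agent` acts as a dispatcher: any board with at most 12 pieces is routed to the specialist whose cluster profile it matches.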
References
(2012) The message passing interface (MPI) standard. http://www.mcs.anl.gov/research/projects/mpi/
Barcelos ARA, Julia RMS, Matias R Jr (2011) D-VisionDraughts: a draughts player neural network that learns by reinforcement in a high performance environment. In: European symposium on artificial neural networks, computational intelligence and machine learning
Baxter J, Tridgell A, Weaver L (1998) KnightCap: a chess program that learns by combining TD(λ) with game-tree search. In: Proceedings of the 15th international conference on machine learning. Morgan Kaufmann, San Francisco, CA, pp 29–37
Brockington MG, Schaeffer J (2000) APHID: asynchronous parallel game-tree search. J Parallel Distrib Comput 60(2):247–273
Caexeta GS (2008) Visiondraughts - um sistema de aprendizagem de jogos de damas baseado em redes neurais, diferenças temporais, algoritmos eficientes de busca em árvores e informações perfeitas contidas em bases de dados. Master’s thesis, Federal University of Uberlandia, Uberlandia, Brazil
Caexeta GS, Julia RMS (2008) A draughts learning system based on neural networks and temporal differences: the impact of an efficient tree-search algorithm. In: The 19th Brazilian symposium on artificial intelligence, SBIA. LNAI series, Springer-Verlag
Campos P, Langlois T (2003) Abalearn: efficient self-play learning of the game Abalone. Technical report, INESC-ID, Neural Networks and Signal Processing Group
Cao Y, Yu W, Ren W, Chen G (2013) An overview of recent progress in the study of distributed multi-agent coordination. IEEE Trans Ind Inf 9(1):427–438
Chellapilla K, Fogel DB (2000) Anaconda defeats Hoyle 6-0: a case study competing an evolved checkers program against commercially available software. In: Proceedings of the 2000 congress on evolutionary computation (CEC00), pp 857–863
Chellapilla K, Fogel DB (2001) Evolving an expert checkers playing program without using human expertise. IEEE Trans Evol Comput 5(4):422–428
Duarte VAR, Julia RMS (2012) MP-Draughts: ordering the search tree and refining the game board representation to improve a multi-agent system for draughts. In: IEEE international conference on tools with artificial intelligence (ICTAI)
Duarte VAR, Julia RMS, Barcelos ARA, Otsuka AB (2009) MP-Draughts: a multiagent reinforcement learning system based on MLP and Kohonen-SOM neural networks. In: IEEE international conference on systems, man, and cybernetics
Epstein SL (2001) Learning to play expertly: a tutorial on Hoyle. pp 153–178
Fierz MC (2016) Cake information. http://www.fierz.ch/cake.php (accessed 22 Nov 2016)
Fogel DB, Chellapilla K (2001) Verifying anaconda’s expert rating by competing against chinook: experiments in co-evolving a neural checkers player. Neurocomputing 42(1–4):69–86
Golpayegani F, Dusparic I, Taylor A, Clarke S (2016) Multi-agent collaboration for conflict management in residential demand response. Comput Commun 96:63–72
Grossberg S (1976) Adaptive pattern classification and universal recoding: I. Parallel development and coding of neural feature detectors. Biol Cybern 23(3):121–134
Hadzibeganovic T, Xia C (2016) Cooperation and strategy coexistence in a tag-based multi-agent system with contingent mobility. Knowl-Based Syst 112(Complete):1–13. doi:10.1016/j.knosys.2016.08.024
Haykin S (1998) Neural networks: a comprehensive foundation, 2nd edn. Prentice Hall
van den Herik HJ, Uiterwijk JWHM, van Rijswijck J (2002) Games solved: now and in the future. Artif Intell 134(1-2):277–311
Kohonen T (1990) The self-organizing map. Proc IEEE 78:1464–1480
Kohonen T (2001) Self-organizing maps. Springer
Leouski A (1995) Learning of position evaluation in the game of othello. Tech. rep., available in: http://people.ict.usc.edu/leuski/publications
Levinson R, Weber R (2002) Chess neighborhoods, function combination, and reinforcement learning. In: Revised papers from the second international conference on computers and games. Springer, London, UK
Liu H, Zhang P, Hu B, Moore P (2015) A novel approach to task assignment in a cooperative multi-agent design system. Appl Intell 43(1):162–175
Lu CPP (1993) Parallel search of narrow game trees. Master’s thesis, University of Alberta
Lynch M (1997) Neurodraughts: an application of temporal difference learning to draughts. Master’s thesis, University of Limerick, Ireland
Lynch M, Griffith N (1997) NeuroDraughts: the role of representation, search, training regime and architecture in a TD draughts player. In: Eighth Ireland conference on artificial intelligence, pp 67–72
Manohararajah V (2001) Parallel alpha-beta search on shared memory multiprocessors. Master’s thesis, University of Toronto
McCulloch W, Pitts W (1943) A logical calculus of the ideas immanent in nervous activity. Bull Math Biophys 5:115–133
Millington I (2006) Artificial intelligence for games. Morgan Kaufmann Publishers Inc, San Francisco, CA, USA
Moulin B, Chaib-Draa B (1996) An overview of distributed artificial intelligence. In: Foundations of distributed artificial intelligence. Wiley
Neto HC (2007) Ls-draughts - um sistema de aprendizagem de jogos de damas baseado em algoritmos genéticos, redes neurais e diferenças temporais. Master’s thesis, Federal University of Uberlandia, Uberlandia, Brazil
Neto HC, Julia RMS (2007) LS-Draughts - a draughts learning system based on genetic algorithms, neural network and temporal differences. In: Proceedings of the IEEE congress on evolutionary computation, CEC 2007, Singapore, pp 2523–2529
Neto HC, Julia RMS, Caexeta GS, Barcelos ARA (2014) Ls-visiondraughts: improving the performance of an agent for checkers by integrating computational intelligence, reinforcement learning and a powerful search method. Appl Intell 41(2):525–550. doi:10.1007/s10489-014-0536-y
Neumann JV, Morgenstern O (1944) Theory of games and economic behavior. Princeton University Press
Feldmann R, Mysliwietz P, Vornberger O, Monien B (1990) Distributed game tree search. In: Kumar V, Kanal LN, Gopalakrishnan PS (eds) Parallel algorithms for machine intelligence and vision, Springer, pp 66–101
Rosaci D (2007) Cilios: connectionist inductive learning and inter-ontology similarities for recommending information agents. Inf Syst 32(6):793–825
Rosaci D, Sarné GML (2011) Eva: an evolutionary approach to mutual monitoring of learning information agents. Appl Artif Intell 25(5):341–361
Rosaci D, Sarné GML (2013) Cloning mechanisms to improve agent performances. J Netw Comput Appl 36(1):402–408
Russell S, Norvig P (2003) Artificial intelligence: a modern approach, 2nd edn. Prentice Hall
Samuel AL (1959) Some studies in machine learning using the game of checkers. IBM J Res Dev 3(3):210–229
Samuel AL (1967) Some studies in machine learning using the game of checkers ii. IBM J Res Dev 11 (6):601–617
Schaeffer J (1992) Man versus machine: the Silicon Graphics world checkers championship
Schaeffer J, Culberson J, Treloar N, Knight B, Lu P, Szafron D (1992) A world championship caliber checkers program. Artif Intell 53(2-3):273–289
Schaeffer J, Lake R, Lu P, Bryant M (1996) Chinook: the world man-machine checkers champion. AI Mag 17(1):21–30
Schaeffer J, Hlynka M, Jussila V (2001) Temporal difference learning applied to a high performance game-playing program. In: International joint conference on artificial intelligence, pp 529–534
Schaeffer J, Burch N, Bjornsson Y, Kishimoto A, Muller M, Lake R, Lu P, Sutphen S (2007) Checkers is solved. Science 317(5844):1518–1522
Schraudolph NN, Dayan P, Sejnowski TJ (2001) Learning to evaluate Go positions via temporal difference methods. In: Computational intelligence in games, studies in fuzziness and soft computing, vol 62. Springer
Sutton RS (1988) Learning to predict by the methods of temporal differences. Mach Learn 3(1):9–44
Tesauro G (1994) TD-Gammon, a self-teaching backgammon program, achieves master-level play. Neural Comput 6(2):215–219
Thrun S (1995) Learning to play the game of chess. In: Advances in neural information processing systems, vol 7. The MIT Press, pp 1069–1076
Tomaz LBP, Julia RMS, Barcelos ARA (2013) Improving the accomplishment of a neural network based agent for draughts that operates in a distributed learning environment. In: IEEE 14th international conference on information reuse and integration
Winands MHM (2004) Informed search in complex games. PhD thesis, Maastricht University
Wooldridge M (2009) An introduction to multiagent systems, 2nd edn. Wiley, New York, NY, USA
Wooldridge M (2001) Introduction to multiagent systems. Wiley, New York, NY, USA
Zobrist AL (1969) A hashing method with applications for game playing. Tech rep, University of Wisconsin, Madison
Cite this article
Paiva Tomaz, L.B., Silva Julia, R.M. & Duarte, V.A. A multiagent player system composed by expert agents in specific game stages operating in high performance environment. Appl Intell 48, 1–22 (2018). https://doi.org/10.1007/s10489-017-0952-x