Abstract
Introducing social parameters and dynamics is an effective and promising way to simulate real-life conditions in multi-agent virtual environments populated by intelligent agents. Because social parameters reshape the overall performance of the synthetic agents, the methods used to assess the agents’ evolution need to be reconsidered. Several studies of such environments rate agents with metrics (or measures, or simple grading schemes) designed for humans, such as Elo and Glicko. In this paper, we present a large-scale evaluation of existing rating methods and propose a new rating approach, the Relative Skill-Level Estimator (RSLE), which can serve as a basis for developing rating systems for multi-agent systems. The comparative study shows that the widely used methods rate synthetic agents inconsistently in this setting and demonstrates the efficiency of RSLE.







Notes
Here, we do not consider the L1 distance but we, instead, simply subtract the rankings.
The application along with usage instructions, an implementation of the algorithms as well as the experimental dataset can be accessed online https://sites.google.com/site/kiourtchairi/projects/reskill.
References
Al-Khateeb B, Kendall G (2011) Introducing a Round Robin tournament into evolutionary individual and social learning checkers. In: Developments in E-systems engineering, Dubai, United Arab Emirates, pp 294–299. https://doi.org/10.1109/DeSE.2011.10
Attle S, Baker B (2007) Cooperative learning in a competitive environment: classroom applications. Int J Teach Learn High Educ 19(1):77–83
Basili V, Caldiera G, Rombach HD (1994) Goal question metric approach. In: Encyclopedia of software engineering, pp 528–532
Busoniu L, Babuska R, De Schutter B (2008) A comprehensive survey of multiagent reinforcement learning. IEEE Trans Syst Man Cybern Part C Appl Rev 38(2):156–172. https://doi.org/10.1109/TSMCC.2007.913919
Caballero A, Botia J, Gomez-Skarmeta A (2011) Using cognitive agents in social simulations. Eng Appl Artif Intell 24(7):1098–1109. https://doi.org/10.1016/j.engappai.2011.06.006
Calvo R (1998) Valencia Spain: the cradle of European chess. In: Presentation to the CCI, Vienna, Austria
Di Bitonto P, Laterza M, Rosell T, Rossano V (2010) An evaluation method for multi-agent systems. In: Agent and multi-agent systems: technologies and applications, pp 32–41. https://doi.org/10.1007/978-3-642-13480-7_5
Edelkamp S, Kissmann P (2008) Symbolic classification of general two-player games. In: Technical report. Technische Universität, Dortmund
Elo AE (1978) The rating of chess players, past and present. Arco Publishing, New York
Ferber J (1999) Multi-agent systems: an introduction to distributed artificial intelligence. Addison-Wesley, Boston
Ferber J, Gutknecht O, Michel F (2004) From agents to organizations: an organizational view of multi-agent systems. In: Agent-oriented software engineering IV, LNCS 2935. Springer, Berlin, pp 214–230. https://doi.org/10.1007/978-3-540-24620-6_15
Francisca T, Castro AM, Ana PR, Eugnio O (2013) Multi-agent learning in both cooperative and competitive environments. In: 16th Portuguese conference on artificial intelligence. Azores, Portugal, pp 400–411
Friedrich H, Rogalla O, Dillmann R (1998) Integrating skills into multi-agent systems. J Intell Manuf 9(2):119–127. https://doi.org/10.1023/A:1008811827890
Gilbert N, Troitzsch KG (2005) Simulation for the social scientist, 2nd edn. Open University Press, New York
Glickman ME (1995) A comprehensive guide to chess ratings. Am Chess J 3:59–102
Glickman ME (1999) Parameter estimation in large dynamic paired comparison experiments. Appl Stat 48:377–394
Glickman ME, Jones AC (1999) Rating the chess rating system. Chance 12(2):21–28
Harkness K (1967) Official chess handbook. McKay, Philadelphia
Herbrich R, Minka T, Graepel T (2007) TrueSkill(TM): a Bayesian skill rating system. Adv Neural Inf Process Syst 19:569–576
Hoen JP, Tuyls K, Panait L, Luke S, La Poutré JA (2005) An overview of cooperative and competitive multiagent learning. In: Proceedings of the first international conference on learning and adaption in multi-agent systems, Utrecht, The Netherlands, pp 1–46. https://doi.org/10.1007/11691839_1
Kalles D (2007) Measuring expert impact on learning how to play a board game. In: IFIP conference on artificial intelligence applications and innovations, Athens, Greece
Kalles D, Kanellopoulos P (2001) On verifying game design and playing strategies using reinforcement learning. In: Proceedings of ACM symposium on applied computing, special track on artificial intelligence and computation logic. ACM, Las Vegas, pp 6–11
Kapetanakis S, Kudenko D (2002) Reinforcement learning of coordination in cooperative multi-agent system. In: Proceeding of eighteenth national conference on artificial intelligence. Edmonton, Alberta, pp 326–331
Kendall M (1938) A new measure of rank correlation. Biometrika 30(1/2):81–93
Kiourt C, Kalles D (2012) Social reinforcement learning in game playing. In: IEEE international conference on tools with artificial intelligence, Athens, Greece, pp 322–326. https://doi.org/10.1109/ICTAI.2012.51
Kiourt C, Kalles D (2016) Synthetic learning agents in game-playing social environments. Adapt Behav 24(6):411–427. https://doi.org/10.1177/1059712316679239
Kiourt C, Pavlidis G, Kalles D (2016a) Human rating methods on multi-agent systems. In: Multi-agent systems and agreement technologies: 13th European conference, EUMAS 2015, and third international conference, AT 2015, Athens, Greece. Springer, pp 129–136. https://doi.org/10.1007/978-3-319-33509-4_11
Kiourt C, Pavlidis G, Kalles D (2016b) ReSkill: relative skill-level calculation system. In: 9th hellenic conference on artificial intelligence (SETN2016), Thessaloniki, Greece, p 39. https://doi.org/10.1145/2903220.2903224
Kiourt C, Kalles D, Kanellopoulos P (2017) How game complexity affects the playing behavior of synthetic agents. In: 15th European conference on multi-agent systems (EUMAS2017), France
Kotb Y, Beauchemin S, Barron J (2012) Workflow nets for multiagent cooperation. IEEE Trans Autom Sci Eng 9(1):198–203. https://doi.org/10.1109/TASE.2011.2163510
Langville AN, Meyer CD (2012) Who’s #1?: The science of rating and ranking. Princeton University Press, Princeton
Liang, Shi W (2005) Performance evaluation of rating aggregation algorithms in reputation systems. In: International conference on collaborative computing: networking, applications and worksharing, p 10. https://doi.org/10.1109/COLCOM.2005.1651235
Logan Y, Kagan T (2013) Elo ratings for structural credit assignment in multiagent systems. In: Twenty-seventh AAAI conference on artificial intelligence (late-breaking developments)
March JG (1991) Exploration and exploitation in organizational learning. Organ Sci 2(1):71–87
Marivate VN, Marwala T (2008) Social learning methods in board game agents. In: IEEE symposium computational intelligence and games, Perth, Australia, pp 323–328. https://doi.org/10.1109/CIG.2008.5035657
Nikolenko S, Sirotkin A (2011) A new Bayesian rating system for team competitions. In: Proceedings of the 28th international conference on machine learning, Bellevue, Washington, pp 601–608
Panait L, Luke S (2005) Cooperative multi-agent learning: the state of the art. Auton Agent Multi-Agent Syst 11(3):387–434. https://doi.org/10.1007/s10458-005-2631-2
Pieter JH, Tuyls K, Panait L, Luke S (2006) An overview of cooperative and competitive multiagent learning. In: Learning and adaption in multi-agent systems. Lecture notes in computer science, vol 3898, pp 1–46. https://doi.org/10.1007/11691839_1
Ruohomaa S, Kutvonen L, Koutrouli E (2007) Reputation management survey. In: ARES’07: the second international conference on availability, reliability and security, pp 103–111. https://doi.org/10.1109/ARES.2007.123
Russell S, Norvig P (2010) Artificial intelligence: a modern approach. Prentice-Hall Inc, Upper Saddle River
Shoham Y, Leyton-Brown K (2009) Multiagent systems: algorithmic, game-theoretic and logical foundations. Cambridge University Press, New York
Spearman C (1904) The proof and measurement of association between two things. Am J Psychol 15(1):72–101
Sutton R (1988) Learning to predict by the methods of temporal differences. Mach Learn 3(1):9–44. https://doi.org/10.1023/A:1022633531479
Sutton R, Barto A (1998) Reinforcement learning—an introduction. MIT Press, Cambridge
Tampuu A, Matiisen T, Kodelja D, Kuzovkin I, Korjus K, Aru J, Vicente R (2017) Multiagent cooperation and competition with deep reinforcement learning. PLoS ONE 12(4):e0172395. https://doi.org/10.1371/journal.pone.0172395
Tesauro G (1992) Practical issues in temporal difference learning. Mach Learn 8:257–277. https://doi.org/10.1007/BF00992697
Tesauro G (1995) Temporal difference learning and TD-gammon. Commun ACM 38(3):56–68. https://doi.org/10.1145/203330.203343
Tesauro G (2002) Programming backgammon using self-teaching neural nets. Artif Intell 134:181–199. https://doi.org/10.1016/S0004-3702(01)00110-2
Tromp J (2008) Solving connect-4 on medium board sizes. ICGA J 31(1):110–112
Weng RC, Lin C (2011) A Bayesian approximation method for online ranking. J Mach Learn Res 12:267–300
Wilensky U (1999) NetLogo itself. In: Center for connected learning and computer-based modeling. Northwestern University, Evanston
Cite this article
Kiourt, C., Kalles, D. & Pavlidis, G. Rating the skill of synthetic agents in competitive multi-agent environments. Knowl Inf Syst 58, 35–58 (2019). https://doi.org/10.1007/s10115-018-1234-6
DOI: https://doi.org/10.1007/s10115-018-1234-6