Multi-Criteria Comparison of Coevolution and Temporal Difference Learning on Othello

Jaśkowski, Wojciech; Szubert, Marcin; Liskowski, Paweł

doi:10.1007/978-3-662-45523-4_25

Conference paper
First Online: 29 November 2014

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8602)

Abstract

We compare Temporal Difference Learning (TDL) with Coevolutionary Learning (CEL) on Othello. In addition to three popular single-criterion performance measures, (i) generalization performance, or expected utility, (ii) the average result against a hand-crafted heuristic, and (iii) the result of a head-to-head match, we compare the algorithms using performance profiles. This multi-criteria performance measure characterizes a player’s performance against opponents of various strengths. The multi-criteria analysis reveals that although the generalization performance of players produced by the two algorithms is similar, TDL is much better at playing against strong opponents, while CEL copes better with weak ones. We also find that TDL produces less diverse strategies than CEL. Our results confirm the usefulness of performance profiles as a tool for comparing learning algorithms for games.
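As an illustration of the two kinds of measure contrasted in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes that each opponent already carries a strength estimate in [0, 1] and that a game result can be expressed as a score in [0, 1]. The names `performance_profile`, `expected_utility`, `score_against` and `rated_opponents` are hypothetical, as is the equal-width binning.

```python
from statistics import mean
from typing import Callable, List, Optional, Sequence, Tuple, TypeVar

Opponent = TypeVar("Opponent")


def expected_utility(
    score_against: Callable[[Opponent], float],
    rated_opponents: Sequence[Tuple[float, Opponent]],
) -> float:
    """Single-criterion measure: mean score over all sampled opponents."""
    return mean([score_against(opponent) for _, opponent in rated_opponents])


def performance_profile(
    score_against: Callable[[Opponent], float],
    rated_opponents: Sequence[Tuple[float, Opponent]],
    n_bins: int = 10,
) -> List[Optional[float]]:
    """Multi-criteria measure: mean score per opponent-strength bin.

    rated_opponents holds (strength, opponent) pairs with strength in [0, 1]
    (an assumption of this sketch). Returns one mean score per bin, or None
    for a bin that received no opponents.
    """
    bins: List[List[float]] = [[] for _ in range(n_bins)]
    for strength, opponent in rated_opponents:
        # Clamp so that strength == 1.0 falls into the last bin.
        index = min(int(strength * n_bins), n_bins - 1)
        bins[index].append(score_against(opponent))
    return [mean(scores) if scores else None for scores in bins]


if __name__ == "__main__":
    # Toy stand-in for game play: the opponent object is just its strength,
    # and the player scores worse against stronger opponents.
    toy_opponents = [(s / 20, s / 20) for s in range(20)]
    toy_score = lambda strength: 1.0 - strength
    print(expected_utility(toy_score, toy_opponents))
    print(performance_profile(toy_score, toy_opponents, n_bins=5))
```

Two players with the same expected utility can have very different profiles, which is the kind of difference the abstract reports between TDL and CEL players.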

Author information

Authors and Affiliations

Institute of Computing Science, Poznan University of Technology, Poznań, Poland
Wojciech Jaśkowski, Marcin Szubert & Paweł Liskowski

Corresponding author

Correspondence to Wojciech Jaśkowski.

Editor information

Editors and Affiliations

Depto. Estadística e Investigación, Universidad Politécnica de Valencia, Valencia, Spain
Anna I. Esparcia-Alcázar
University of Granada, Granada, Spain
Antonio M. Mora

About this paper

Cite this paper

Jaśkowski, W., Szubert, M., Liskowski, P. (2014). Multi-Criteria Comparison of Coevolution and Temporal Difference Learning on Othello. In: Esparcia-Alcázar, A., Mora, A. (eds) Applications of Evolutionary Computation. EvoApplications 2014. Lecture Notes in Computer Science, vol 8602. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-45523-4_25

DOI: https://doi.org/10.1007/978-3-662-45523-4_25
Published: 29 November 2014
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-45522-7
Online ISBN: 978-3-662-45523-4
eBook Packages: Computer Science, Computer Science (R0)
