ABSTRACT
We propose Coevolutionary Gradient Search, a blueprint for a family of iterative learning algorithms that combine elements of local search and population-based search. We apply the approach to learning Othello strategies represented as n-tuple networks, using different search operators and modes of learning. We focus on the interplay between continuous, directed, gradient-based search in the space of weights and fitness-driven, combinatorial, coevolutionary search in the space of entire n-tuple networks. In an extensive experiment, we assess both the objective and the relative performance of the algorithms and conclude that hybridizing the search techniques improves convergence. The best algorithms not only learn faster than the constituent methods used alone, but also produce top-ranked strategies in the online Othello League.
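As a rough illustration of the two search modes the abstract contrasts, the Python sketch below represents an n-tuple network as a set of lookup tables and alternates a directed update in weight space with fitness-driven selection over whole networks. It is not the authors' implementation. The cell encoding, the plain delta-rule update (standing in for the gradient-based learner), the truncation selection, the Gaussian mutation, and every name (`NTupleNetwork`, `hybrid_generation`, `local_search`, `fitness`) are assumptions made for illustration.

```python
import random

# Hypothetical cell encoding: each board cell is one base-3 digit.
EMPTY, BLACK, WHITE = 0, 1, 2


class NTupleNetwork:
    """A set of n-tuples, each owning a lookup table of 3**n weights."""

    def __init__(self, tuples):
        self.tuples = [tuple(t) for t in tuples]
        self.weights = [[0.0] * (3 ** len(t)) for t in self.tuples]

    def _index(self, board, positions):
        # Read the cells covered by one tuple as a base-3 number.
        idx = 0
        for pos in positions:
            idx = idx * 3 + board[pos]
        return idx

    def evaluate(self, board):
        # Network output: sum of the one active weight per tuple.
        return sum(w[self._index(board, t)]
                   for t, w in zip(self.tuples, self.weights))

    def gradient_step(self, board, target, lr=0.1):
        # Delta rule: the output is linear in the active weights,
        # so each active weight moves by lr * error.
        error = target - self.evaluate(board)
        for t, w in zip(self.tuples, self.weights):
            w[self._index(board, t)] += lr * error
        return error

    def mutated(self, rng, sigma=0.1):
        # Copy with Gaussian noise added to every weight.
        child = NTupleNetwork(self.tuples)
        child.weights = [[x + rng.gauss(0.0, sigma) for x in w]
                         for w in self.weights]
        return child


def hybrid_generation(population, local_search, fitness, rng):
    """One generation: local search on weights, then selection."""
    # 1) Directed search in the space of weights.
    for net in population:
        local_search(net)
    # 2) Fitness-driven search in the space of whole networks.
    #    fitness(net, population) may score a network relative to its peers.
    ranked = sorted(population, key=lambda n: fitness(n, population),
                    reverse=True)
    survivors = ranked[:max(1, len(ranked) // 2)]
    n_children = len(population) - len(survivors)
    children = [survivors[i % len(survivors)].mutated(rng)
                for i in range(n_children)]
    return survivors + children
```

In a coevolutionary setting, `fitness` would typically be computed from games played against other members of the population rather than from a fixed target, and `local_search` would run a learning episode; both are left as caller-supplied functions here because the abstract does not specify them.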
Index Terms
- Learning n-tuple networks for othello by coevolutionary gradient search