ABSTRACT
Among the many interaction schemes used in coevolutionary settings for interactive domains, the round-robin tournament provides the most precise evaluation of candidate solutions, at the expense of computational effort. To improve coevolutionary learning speed, we propose an interaction scheme that computes only a fraction of the interaction outcomes between pairs of coevolving individuals. The missing outcomes in the interaction matrix are predicted using matrix factorization. The algorithm adaptively decides how much of the interaction matrix to compute based on learning speed statistics. We evaluate our method in the context of the coevolutionary covariance matrix adaptation strategy (CoCMAES) for the problem of learning position evaluation in the game of Othello. We show that our adaptive interaction scheme matches the state-of-the-art results obtained by the standard round-robin CoCMAES while considerably improving the learning speed.
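To make the idea of predicting missing interaction outcomes concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not say which factorization variant is used, so the sketch assumes non-negative matrix factorization with multiplicative updates restricted to the observed entries. The function name `factorize_observed`, the rank, the iteration count, and the toy data are all hypothetical.

```python
import numpy as np

def factorize_observed(X, mask, rank=2, iters=200, seed=0, eps=1e-9):
    """Fit W @ H to X using only the entries where mask is True.

    Assumption: masked NMF with multiplicative updates; the abstract
    only says "matrix factorization".
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    # Strictly positive start so the multiplicative updates can move.
    W = rng.random((n, rank)) + 0.1
    H = rng.random((rank, m)) + 0.1
    M = mask.astype(float)
    MX = np.where(mask, X, 0.0)  # unobserved outcomes contribute nothing
    for _ in range(iters):
        W *= (MX @ H.T) / ((M * (W @ H)) @ H.T + eps)
        H *= (W.T @ MX) / (W.T @ (M * (W @ H)) + eps)
    return W @ H  # predicted outcome for every pair of individuals

# Toy interaction matrix (hypothetical data): 12 x 12, exactly rank 2.
rng = np.random.default_rng(1)
true = rng.random((12, 2)) @ rng.random((2, 12))
mask = rng.random(true.shape) < 0.5      # roughly half of the games are played
pred = factorize_observed(true, mask)
# Keep the computed outcomes, fill only the missing ones with predictions.
completed = np.where(mask, true, pred)
```

In this sketch the fraction of computed interactions is fixed at about one half. The adaptive choice of that fraction from learning speed statistics, as described in the abstract, is not modelled here.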
Index Terms
- Accelerating coevolution with adaptive matrix factorization