Abstract
Learning complex game functions remains a difficult task. We apply temporal difference learning (TDL), a well-known variant of reinforcement learning, in combination with n-tuple networks to the game Connect-4. Our agent is trained solely by self-play. It is able, for the first time, to consistently beat the optimal-playing Minimax agent (in game situations where a win is possible). The n-tuple network induces a powerful feature space: it is not necessary to hand-design specific features, because the agent learns to select the right ones. We believe that the n-tuple network is an important ingredient in the overall success, and we identify several aspects that are relevant for achieving high-quality results. The architecture is sufficiently general to be applied to similar reinforcement learning tasks as well.
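To make the combination of n-tuple networks and temporal difference learning more concrete, the following is a minimal sketch and not the authors' implementation. It assumes three states per cell (empty, first player, second player), randomly sampled tuples, a tanh output, and a plain TD(0)-style update toward a given target. The class `NTupleNetwork`, the helper `random_tuples`, and all parameter values are hypothetical names and choices made for illustration only.

```python
import math
import random

ROWS, COLS = 6, 7          # standard Connect-4 board, stored as a flat list
STATES_PER_CELL = 3        # assumption: 0 = empty, 1 = first player, 2 = second player


class NTupleNetwork:
    """Sum of lookup-table weights, one table per n-tuple, squashed by tanh."""

    def __init__(self, tuples):
        self.tuples = tuples
        # One weight table per tuple with 3**n entries, initialised to zero.
        self.luts = [[0.0] * (STATES_PER_CELL ** len(t)) for t in tuples]

    def _index(self, board, cells):
        # Read the tuple's cells as the digits of a base-3 number.
        idx = 0
        for c in cells:
            idx = idx * STATES_PER_CELL + board[c]
        return idx

    def value(self, board):
        total = sum(lut[self._index(board, t)]
                    for lut, t in zip(self.luts, self.tuples))
        return math.tanh(total)

    def td_update(self, board, target, alpha=0.01):
        """Move V(board) toward `target` (a final reward or V of the next state)."""
        v = self.value(board)
        step = alpha * (target - v) * (1.0 - v * v)   # (1 - v^2) is the tanh derivative
        for lut, t in zip(self.luts, self.tuples):
            lut[self._index(board, t)] += step        # only the active weights change
        return target - v                             # TD error before the update


def random_tuples(count, length, rng):
    """Hypothetical helper: sample `count` tuples of `length` distinct cells."""
    return [tuple(rng.sample(range(ROWS * COLS), length)) for _ in range(count)]


rng = random.Random(0)
net = NTupleNetwork(random_tuples(count=10, length=4, rng=rng))
board = [0] * (ROWS * COLS)
board[3] = 1   # a single piece of the first player
```

In a self-play loop, `td_update` would be called after each move with the value of the successor position (or the final reward) as `target`. Each board position activates exactly one weight per tuple, so the learned weights act as the features that the agent selects among without manual design.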
© 2012 Springer-Verlag Berlin Heidelberg
Cite this paper
Thill, M., Koch, P., Konen, W. (2012). Reinforcement Learning with N-tuples on the Game Connect-4. In: Coello, C.A.C., Cutello, V., Deb, K., Forrest, S., Nicosia, G., Pavone, M. (eds) Parallel Problem Solving from Nature - PPSN XII. PPSN 2012. Lecture Notes in Computer Science, vol 7491. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32937-1_19
DOI: https://doi.org/10.1007/978-3-642-32937-1_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-32936-4
Online ISBN: 978-3-642-32937-1
eBook Packages: Computer Science (R0)