Abstract
Hierarchical learning automata have been shown to be an excellent tool for solving multi-stage games. However, most updating schemes used by hierarchical automata expect the multi-stage game to reach an absorbing state, at which point the automata are updated in a Monte Carlo way. As a result, the approach is infeasible for large multi-stage games and for problems with an infinite horizon, and convergence is slow. In this paper we propose an algorithm in which the rewards do not have to travel all the way up to the top of the hierarchy and in which there is no need for explicit end-stages.
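The contrast drawn in the abstract between a Monte Carlo update and a bootstrapped update can be illustrated with a minimal sketch. It is not the algorithm proposed in the paper. It assumes a linear reward-inaction update, a hypothetical two-level hierarchy, a running-average value estimate per automaton, and a toy reward table with entries in [0, 1]; all names are illustrative.

```python
import random


class LearningAutomaton:
    """One automaton with a linear reward-inaction update (an assumption)."""

    def __init__(self, n_actions, alpha=0.05):
        self.p = [1.0 / n_actions] * n_actions  # action probabilities
        self.alpha = alpha                      # learning rate
        self.value = 0.0                        # running estimate of the reward seen here

    def choose(self, rng):
        # Sample an action index according to the current probabilities.
        u, acc = rng.random(), 0.0
        for i, p_i in enumerate(self.p):
            acc += p_i
            if u < acc:
                return i
        return len(self.p) - 1

    def update(self, action, target):
        # Shift probability mass toward `action` in proportion to target in [0, 1].
        for i in range(len(self.p)):
            if i == action:
                self.p[i] += self.alpha * target * (1.0 - self.p[i])
            else:
                self.p[i] -= self.alpha * target * self.p[i]
        # Running average of the targets, used as the bootstrap estimate.
        self.value += self.alpha * (target - self.value)


def monte_carlo_step(root, children, rewards, rng):
    """Play to the leaf, then update both levels with the final reward."""
    a = root.choose(rng)
    b = children[a].choose(rng)
    final_reward = rewards[a][b]  # known only once the end stage is reached
    children[a].update(b, final_reward)
    root.update(a, final_reward)


def bootstrap_step(root, children, rewards, rng):
    """Update the root from the child's estimate, without waiting for the end."""
    a = root.choose(rng)
    root.update(a, children[a].value)  # bootstrapped target, no end stage needed
    b = children[a].choose(rng)
    children[a].update(b, rewards[a][b])


if __name__ == "__main__":
    rng = random.Random(0)
    root = LearningAutomaton(2)
    children = [LearningAutomaton(2), LearningAutomaton(2)]
    rewards = [[0.2, 0.9], [0.1, 0.3]]  # hypothetical leaf rewards
    for _ in range(5000):
        bootstrap_step(root, children, rewards, rng)
    print(root.p, [c.p for c in children])
```

In the Monte Carlo variant the root learns only after the leaf reward is known, whereas in the bootstrapped variant it learns from the lower automaton's current estimate, so the reward never has to travel back up the hierarchy.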
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Peeters, M., Verbeeck, K., Nowé, A. (2008). Solving Multi-stage Games with Hierarchical Learning Automata That Bootstrap. In: Tuyls, K., Nowe, A., Guessoum, Z., Kudenko, D. (eds) Adaptive Agents and Multi-Agent Systems III. Adaptation and Multi-Agent Learning. AAMAS ALAMAS ALAMAS 2005 2007 2006. Lecture Notes in Computer Science, vol 4865. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-77949-0_13
DOI: https://doi.org/10.1007/978-3-540-77949-0_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-77947-6
Online ISBN: 978-3-540-77949-0
eBook Packages: Computer Science, Computer Science (R0)