Abstract
A key problem in multi-agent reinforcement learning is dealing with the large state spaces typical of realistic distributed agent systems. As the state space grows, agent policies become increasingly complex and learning slows down. One way for an agent to continue learning in such large-scale systems is to learn a policy that generalizes over states, rather than trying to map each individual state to an action.
In this paper we present a multi-agent learning approach capable of aggregating states, using simple reinforcement learners called learning automata (LA). Independent learning automata have already been shown to perform well in multi-agent environments. Previously we proposed LA-based multi-agent algorithms capable of finding a Nash equilibrium between agent policies. In these algorithms, however, one LA per agent is associated with each system state, so the approach is limited to discrete state spaces. Furthermore, as the number of states increases, the number of automata grows with it and the learning speed of the system drops. To deal with this problem, we propose to use Generalized Learning Automata (GLA), which can identify regions of the state space that share the same optimal action, thereby aggregating states. We analyze the behaviour of GLA in a multi-agent setting and demonstrate results on a set of sample problems.
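To make the idea concrete, the abstract's description of a GLA can be sketched in a few lines. The example below is a minimal illustration under our own assumptions, not the paper's algorithm: a hypothetical two-action automaton on a one-dimensional state space, where the probability of each action is a sigmoid of a learned parameter vector applied to a state feature vector, and that vector is adjusted with a REINFORCE-style update driven by a binary reward. Because a single parameter vector covers the whole state space, all states on the same side of the learned decision boundary are effectively aggregated.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_gla(steps=10000, lam=0.1):
    """Train a toy two-action GLA on a 1-D state space.

    States s are drawn from [-1, 1]; action 1 is optimal for s > 0,
    action 0 otherwise (a hypothetical environment for illustration).
    The automaton keeps a parameter vector u over the feature vector
    x = (s, 1) and takes action 1 with probability sigmoid(u . x).
    """
    u = np.zeros(2)
    for _ in range(steps):
        s = rng.uniform(-1.0, 1.0)
        x = np.array([s, 1.0])
        p1 = sigmoid(u @ x)                      # P(action = 1 | s)
        a = 1 if rng.random() < p1 else 0        # sample an action
        beta = 1.0 if a == (1 if s > 0 else 0) else 0.0  # binary reward
        # REINFORCE-style update: u += lam * beta * d/du ln g(x, a, u),
        # where g is the action probability under the current parameters.
        grad = (1.0 - p1) * x if a == 1 else -p1 * x
        u += lam * beta * grad
    return u

u = train_gla()
policy = lambda s: sigmoid(u @ np.array([s, 1.0]))
```

After training, `policy(s)` is close to 1 for states in the region where action 1 pays off and close to 0 elsewhere, so one automaton covers a continuum of states that a one-LA-per-state scheme would have to learn separately.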
© 2008 Springer-Verlag Berlin Heidelberg
Cite this paper
De Hauwere, Y.-M., Vrancx, P., Nowé, A. (2008). Using Generalized Learning Automata for State Space Aggregation in MAS. In: Lovrek, I., Howlett, R.J., Jain, L.C. (eds.) Knowledge-Based Intelligent Information and Engineering Systems. KES 2008. Lecture Notes in Computer Science, vol. 5177. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-85563-7_28
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-85562-0
Online ISBN: 978-3-540-85563-7