Using Generalized Learning Automata for State Space Aggregation in MAS

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5177)

Abstract

A key problem in multi-agent reinforcement learning is dealing with the large state spaces typically associated with realistic distributed agent systems. As the state space grows, agent policies become increasingly complex and learning slows. One way for an agent to continue learning in these large-scale systems is to learn a policy that generalizes over states, rather than trying to map each individual state to an action.

In this paper we present a multi-agent learning approach capable of aggregating states, using simple reinforcement learners called learning automata (LA). Independent learning automata have already been shown to perform well in multi-agent environments. Previously we proposed LA-based multi-agent algorithms capable of finding a Nash equilibrium between agent policies. In these algorithms, however, each agent associates one LA with every system state, so the approach is limited to discrete state spaces. Furthermore, as the number of states increases, the number of automata also increases and the learning speed of the system slows down. To deal with this problem, we propose to use generalized learning automata (GLA), which can identify regions of the state space that share the same optimal action, and as such aggregate states. We analyze the behaviour of GLA in a multi-agent setting and demonstrate results on a set of sample problems.
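To make the idea of state aggregation via a GLA concrete, the sketch below is a minimal single-agent illustration, not the authors' algorithm: a GLA maps a state feature vector to action probabilities through a parameterized function (here a linear softmax) and adjusts its weights with a REINFORCE-style update. The toy task, the feature encoding, and the learning rate are all invented for illustration; the point is that one linear decision boundary aggregates a whole region of states that share an optimal action.

```python
import math
import random


class GLA:
    """Illustrative generalized learning automaton: a linear softmax policy
    over state features, updated with a REINFORCE-style rule. This is a
    sketch of the general GLA idea, not the paper's exact algorithm."""

    def __init__(self, n_features, n_actions, lr=0.05):
        self.w = [[0.0] * n_features for _ in range(n_actions)]
        self.lr = lr

    def probs(self, x):
        # Boltzmann (softmax) distribution over linear scores of the features
        scores = [sum(wi * xi for wi, xi in zip(row, x)) for row in self.w]
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        return [e / z for e in exps]

    def act(self, x):
        # Sample an action from the current action probabilities
        r, acc = random.random(), 0.0
        for a, pa in enumerate(self.probs(x)):
            acc += pa
            if r <= acc:
                return a
        return len(self.w) - 1

    def update(self, x, action, reward):
        # REINFORCE-style step: gradient of log softmax is
        # (1{a == action} - p_a) * x, scaled by the received reward
        p = self.probs(x)
        for a in range(len(self.w)):
            g = (1.0 if a == action else 0.0) - p[a]
            for i in range(len(x)):
                self.w[a][i] += self.lr * reward * g * x[i]


random.seed(0)
gla = GLA(n_features=2, n_actions=2)

# Toy continuous-state task (invented for this sketch): states s in [0, 1],
# action 0 is optimal for s < 0.5 and action 1 for s >= 0.5. The feature
# vector [s, 1] lets a single linear boundary aggregate each region.
for _ in range(20000):
    s = random.random()
    x = [s, 1.0]
    a = gla.act(x)
    reward = 1.0 if (a == 0) == (s < 0.5) else 0.0
    gla.update(x, a, reward)

print(round(gla.probs([0.1, 1.0])[0], 2))  # probability of action 0, left region
print(round(gla.probs([0.9, 1.0])[1], 2))  # probability of action 1, right region
```

After training, the automaton assigns high probability to the region-appropriate action on both sides of the boundary, even though it never enumerated individual states; this is the aggregation effect the paper exploits, with one GLA per agent instead of one LA per agent per state.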





Editor information

Ignac Lovrek, Robert J. Howlett, Lakhmi C. Jain


Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

De Hauwere, Y.-M., Vrancx, P., Nowé, A. (2008). Using Generalized Learning Automata for State Space Aggregation in MAS. In: Lovrek, I., Howlett, R.J., Jain, L.C. (eds) Knowledge-Based Intelligent Information and Engineering Systems. KES 2008. Lecture Notes in Computer Science (LNAI), vol 5177. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-85563-7_28

  • DOI: https://doi.org/10.1007/978-3-540-85563-7_28

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-85562-0

  • Online ISBN: 978-3-540-85563-7

  • eBook Packages: Computer Science, Computer Science (R0)
