Networks of Learning Automata and Limiting Games

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4865)

Abstract

Learning Automata (LA) were recently shown to be valuable tools for designing Multi-Agent Reinforcement Learning algorithms. One of the principal contributions of LA theory is that a set of decentralized, independent learning automata is able to control a finite Markov Chain with unknown transition probabilities and rewards. This result was recently extended to Markov Games and analyzed with the use of limiting games. In this paper we continue this analysis, but we assume that our agents are fully ignorant of the other agents in the environment, i.e. they can only observe themselves: they do not know how many other agents are present in the environment, which actions these other agents took, which rewards they received for them, or which location they occupy in the state space. We prove that in Markov Games where agents have this limited type of observability, a network of independent LA is still able to converge to an equilibrium point of the underlying limiting game, provided that a common ergodicity assumption holds and that the agents do not interfere with each other's transition probabilities.
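To make the idea of independent automata that observe only themselves more concrete, the following is a minimal sketch and not the algorithm analyzed in the paper. It assumes the standard linear reward-inaction update from the learning automata literature, a single-state two-player game instead of a Markov Game, and a shared binary reward. The names `lri_update` and `play`, the payoff matrix, and the learning rate are all hypothetical choices made for illustration. Each automaton updates its action probabilities from its own action and its own reward only.

```python
import random


def lri_update(probs, action, reward, rate):
    """Linear reward-inaction step (assumed update rule, reward in [0, 1]).

    Probability mass moves toward the chosen action in proportion to the
    reward; a reward of 0 leaves the distribution unchanged.
    """
    return [
        p + rate * reward * ((1.0 if i == action else 0.0) - p)
        for i, p in enumerate(probs)
    ]


def play(payoff, steps=20000, rate=0.01, seed=0):
    """Two independent automata repeatedly play a 2x2 common-reward game.

    payoff[a1][a2] is the (hypothetical) probability that both agents
    receive reward 1. Neither agent sees the other's action or probabilities.
    """
    rng = random.Random(seed)
    p1 = [0.5, 0.5]
    p2 = [0.5, 0.5]
    for _ in range(steps):
        a1 = rng.choices(range(2), weights=p1)[0]
        a2 = rng.choices(range(2), weights=p2)[0]
        reward = 1.0 if rng.random() < payoff[a1][a2] else 0.0
        # Each agent updates from its own action and the reward it received.
        p1 = lri_update(p1, a1, reward, rate)
        p2 = lri_update(p2, a2, reward, rate)
    return p1, p2


if __name__ == "__main__":
    # Illustrative payoff matrix with two pure equilibria on the diagonal.
    p1, p2 = play([[0.9, 0.1], [0.1, 0.6]])
    print("agent 1 action probabilities:", p1)
    print("agent 2 action probabilities:", p2)
```

With a small learning rate, runs of this sketch typically end with both agents placing nearly all probability on a matching pair of actions. Which pair is reached depends on the random seed and the payoff values assumed here.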

Editor information

Karl Tuyls, Ann Nowe, Zahia Guessoum, Daniel Kudenko

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Vrancx, P., Verbeeck, K., Nowé, A. (2008). Networks of Learning Automata and Limiting Games. In: Tuyls, K., Nowe, A., Guessoum, Z., Kudenko, D. (eds) Adaptive Agents and Multi-Agent Systems III. Adaptation and Multi-Agent Learning. AAMAS ALAMAS ALAMAS 2005 2007 2006. Lecture Notes in Computer Science, vol 4865. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-77949-0_16

  • DOI: https://doi.org/10.1007/978-3-540-77949-0_16

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-77947-6

  • Online ISBN: 978-3-540-77949-0

  • eBook Packages: Computer Science, Computer Science (R0)
