
Optimistic-Pessimistic Q-Learning Algorithm for Multi-Agent Systems

  • Conference paper
Multiagent System Technologies (MATES 2008)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5244)


Abstract

We propose OP-Q, a reinforcement learning algorithm for multi-agent systems based on Hurwicz's optimistic-pessimistic criterion, which allows prior knowledge about the degree of the environment's friendliness to be embedded. A proof of its convergence to a stationary policy is given. Thorough testing of the developed algorithm against well-known reinforcement learning algorithms has shown that OP-Q performs on the level of its opponents.
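The core idea the abstract describes can be sketched as a Q-learning update whose next-state value blends the best-case and worst-case outcomes over opponent actions via Hurwicz's criterion. This is a minimal illustrative sketch, not the paper's exact algorithm: the function and parameter names (`op_q_update`, `lam`) are invented here, and a two-player matrix-game setting is assumed.

```python
import numpy as np

def op_q_update(Q, s, a, o, r, s_next, lam=0.5, alpha=0.1, gamma=0.9):
    """One OP-Q-style update (illustrative sketch).

    Q[s, a, o]: value of own action a against opponent action o in state s.
    lam: Hurwicz optimism coefficient in [0, 1]; lam=1 assumes a fully
    friendly environment, lam=0 a fully adversarial one.
    """
    # Hurwicz value of each own action in the next state: a convex
    # combination of the optimistic and pessimistic outcomes over the
    # opponent's actions.
    best = Q[s_next].max(axis=1)   # best case: friendly opponent
    worst = Q[s_next].min(axis=1)  # worst case: adversarial opponent
    v_next = np.max(lam * best + (1.0 - lam) * worst)
    # Standard Q-learning step toward the bootstrapped target.
    Q[s, a, o] += alpha * (r + gamma * v_next - Q[s, a, o])
    return Q
```

Setting `lam = 0` recovers a minimax-style (pessimistic) value as in Littman's minimax-Q, while `lam = 1` gives a fully optimistic value; intermediate values encode partial prior knowledge about how cooperative the other agents are.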





Editor information

Ralph Bergmann, Gabriela Lindemann, Stefan Kirn, Michal Pěchouček


Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Akchurina, N. (2008). Optimistic-Pessimistic Q-Learning Algorithm for Multi-Agent Systems. In: Bergmann, R., Lindemann, G., Kirn, S., Pěchouček, M. (eds) Multiagent System Technologies. MATES 2008. Lecture Notes in Computer Science, vol 5244. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-87805-6_3


  • DOI: https://doi.org/10.1007/978-3-540-87805-6_3

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-87804-9

  • Online ISBN: 978-3-540-87805-6

  • eBook Packages: Computer Science, Computer Science (R0)
