Coordinating Multiple Agents via Reinforcement Learning

Autonomous Agents and Multi-Agent Systems

Abstract

In this paper, we attempt to use reinforcement learning techniques to solve agent coordination problems in task-oriented environments. The Fuzzy Subjective Task Structure (FSTS) model is presented to model general agent coordination. We show that an agent coordination problem modeled in FSTS is a Decision-Theoretic Planning (DTP) problem, to which reinforcement learning can be applied. Two learning algorithms, "coarse-grained" and "fine-grained", are proposed to address agents' coordination behavior at two different levels. The "coarse-grained" algorithm operates at one level and tackles hard system constraints, while the "fine-grained" algorithm operates at another level and handles soft constraints. We argue that it is important to explicitly model and explore coordination-specific information (particularly system constraints), which underpins the two algorithms and contributes to their effectiveness. The algorithms are formally proved to converge and experimentally shown to be effective.
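As a rough illustration of how reinforcement learning can be applied to a constrained task-ordering problem, the sketch below runs plain tabular Q-learning on a hypothetical three-task example. It is not the paper's "coarse-grained" or "fine-grained" algorithm, and it does not implement FSTS. The task set, the precedence rule, the penalty value, and the choice to treat hard constraints as a filter on feasible actions and soft constraints as a reward penalty are all assumptions made for this example.

```python
import random

# Hypothetical toy problem (not taken from the paper): three tasks must all be done.
TASKS = (0, 1, 2)
# Assumed hard constraint: task 2 may only run after task 0 has finished.
HARD_PRECEDES = {2: {0}}
# Assumed soft constraint: running task 2 before task 1 is allowed but penalised.
SOFT_PENALTY = 0.5


def feasible_actions(state):
    """Return unfinished tasks whose hard prerequisites are already completed."""
    return [t for t in TASKS
            if t not in state and HARD_PRECEDES.get(t, set()) <= state]


def step(state, action):
    """Deterministic transition: mark the task done and compute the reward."""
    reward = 0.0
    if action == 2 and 1 not in state:
        reward -= SOFT_PENALTY          # soft constraint violated
    nxt = state | {action}
    done = len(nxt) == len(TASKS)
    if done:
        reward += 1.0                   # all tasks completed
    return nxt, reward, done


def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration over feasible actions."""
    rng = random.Random(seed)
    q = {}
    for _ in range(episodes):
        state, done = frozenset(), False
        while not done:
            actions = feasible_actions(state)
            if rng.random() < epsilon:
                action = rng.choice(actions)
            else:
                action = max(actions, key=lambda a: q.get((state, a), 0.0))
            nxt, reward, done = step(state, action)
            best_next = 0.0 if done else max(
                q.get((nxt, a), 0.0) for a in feasible_actions(nxt))
            old = q.get((state, action), 0.0)
            q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
            state = nxt
    return q


def greedy_order(q):
    """Follow the greedy policy from the empty state and return the task order."""
    state, order = frozenset(), []
    while feasible_actions(state):
        action = max(feasible_actions(state), key=lambda a: q.get((state, a), 0.0))
        order.append(action)
        state = state | {action}
    return order


if __name__ == "__main__":
    print(greedy_order(q_learning()))
```

In this sketch the hard constraint never has to be learned, because infeasible actions are simply absent from the Q-table, while the soft constraint is learned through the reward signal. Whether the paper divides the work between its two algorithms in this way cannot be determined from the abstract alone.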

Author information

Corresponding author

Correspondence to Gang Chen.

About this article

Cite this article

Chen, G., Yang, Z., He, H. et al. Coordinating Multiple Agents via Reinforcement Learning. Auton Agent Multi-Agent Syst 10, 273–328 (2005). https://doi.org/10.1007/s10458-004-4344-3

  • DOI: https://doi.org/10.1007/s10458-004-4344-3