Abstract
In this paper, we attempt to use reinforcement learning techniques to solve agent coordination problems in task-oriented environments. The Fuzzy Subjective Task Structure (FSTS) model is presented to model general agent coordination. We show that an agent coordination problem modeled in FSTS is a Decision-Theoretic Planning (DTP) problem, to which reinforcement learning can be applied. Two learning algorithms, “coarse-grained” and “fine-grained”, are proposed to address agents' coordination behavior at two different levels. The “coarse-grained” algorithm operates at one level and tackles hard system constraints, while the “fine-grained” algorithm operates at another level and handles soft constraints. We argue that it is important to explicitly model and explore coordination-specific information (particularly system constraints), which underpins the two algorithms and contributes to their effectiveness. The algorithms are formally proved to converge and experimentally shown to be effective.
Cite this article
Chen, G., Yang, Z., He, H. et al. Coordinating Multiple Agents via Reinforcement Learning. Auton Agent Multi-Agent Syst 10, 273–328 (2005). https://doi.org/10.1007/s10458-004-4344-3
DOI: https://doi.org/10.1007/s10458-004-4344-3