Rationality of Reward Sharing in Multi-agent Reinforcement Learning

Miyazaki, Kazuteru; Kobayashi, Shigenobu

doi:10.1007/3-540-46693-2_9


Part of the book series: Lecture Notes in Computer Science (LNAI, volume 1733)

Included in the following conference series:

Pacific Rim International Workshop on Multi-Agents

Abstract

In multi-agent reinforcement learning systems, it is important to share a reward among all agents. We focus on the Rationality Theorem of Profit Sharing [5] and analyze how to share a reward among all profit sharing agents. When an agent receives a direct reward R (R > 0), an indirect reward µR (µ ≥ 0) is given to the other agents. We derive the following necessary and sufficient condition for preserving rationality:

$$ \mu < \frac{M - 1}{M^{W} \left( 1 - \left( \frac{1}{M} \right)^{W_0} \right) (n - 1) L}, $$

where M and L are the maximum numbers of all conflicting rules and of conflicting rational rules, respectively, for the same sensory input; W and W₀ are the maximum episode lengths of a direct-reward agent and an indirect-reward agent, respectively; and n is the number of agents. The theorem is derived by avoiding the least desirable situation, in which the expected reward per action is zero. Using this theorem, we can therefore benefit from several efficient aspects of reward sharing. Numerical examples confirm the effectiveness of the theorem.
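As an illustration only, the right-hand side of the inequality can be evaluated for concrete parameter values. The sketch below is not taken from the paper: the function names `mu_upper_bound` and `preserves_rationality` are hypothetical, and the input checks (M ≥ 2, n ≥ 2, and all other parameters at least 1) are assumptions added so that the denominator is never zero.

```python
from fractions import Fraction


def mu_upper_bound(M: int, L: int, W: int, W0: int, n: int) -> Fraction:
    """Evaluate (M - 1) / (M^W * (1 - (1/M)^W0) * (n - 1) * L) exactly.

    M, L : maximum numbers of all conflicting rules / conflicting rational rules
    W, W0: maximum episode lengths of a direct / indirect-reward agent
    n    : number of agents
    """
    # Assumed validity ranges (not stated in the abstract); they keep
    # the denominator strictly positive.
    if M < 2 or L < 1 or W < 1 or W0 < 1 or n < 2:
        raise ValueError("expected M >= 2, n >= 2, and L, W, W0 >= 1")
    denominator = (M ** W) * (1 - Fraction(1, M) ** W0) * (n - 1) * L
    return Fraction(M - 1) / denominator


def preserves_rationality(mu, M: int, L: int, W: int, W0: int, n: int) -> bool:
    """Check the stated condition 0 <= mu < bound for a candidate mu."""
    return 0 <= mu < mu_upper_bound(M, L, W, W0, n)


if __name__ == "__main__":
    # Hypothetical parameter values chosen only for illustration.
    bound = mu_upper_bound(M=2, L=1, W=2, W0=2, n=2)
    print(bound)  # 1/3
    print(preserves_rationality(Fraction(1, 4), M=2, L=1, W=2, W0=2, n=2))  # True
```

With M = 2, L = 1, W = 2, W₀ = 2 and n = 2, the denominator is 4 · (1 − 1/4) · 1 · 1 = 3, so the bound is 1/3. Because of the factor M^W, the bound shrinks quickly as the episode length of the direct-reward agent grows. Exact rational arithmetic is used here so that the strict inequality is not blurred by floating-point rounding.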

References

1. Arai, S., Miyazaki, K., and Kobayashi, S.: Generating Cooperative Behavior by Multi-Agent Reinforcement Learning, Proc. of the 6th European Workshop on Learning Robots, pp. 143–157 (1997).
2. Arai, S., Miyazaki, K., and Kobayashi, S.: Cranes Control Using Multi-agent Reinforcement Learning, International Conference on Intelligent Autonomous System 5, pp. 335–342 (1998).
3. Grefenstette, J. J.: Credit Assignment in Rule Discovery Systems Based on Genetic Algorithms, Machine Learning Vol. 3, pp. 225–245 (1988).
4. Holland, J. H.: Escaping Brittleness: The Possibilities of General-Purpose Learning Algorithms Applied to Parallel Rule-Based Systems, in R. S. Michalski et al. (eds.), Machine Learning: An Artificial Intelligence Approach, Vol. 2, pp. 593–623. Morgan Kaufmann (1986).
5. Miyazaki, K., Yamamura, M., and Kobayashi, S.: On the Rationality of Profit Sharing in Reinforcement Learning, Proc. of the 3rd International Conference on Fuzzy Logic, Neural Nets and Soft Computing, Iizuka, Japan, pp. 285–288 (1994).
6. Miyazaki, K., and Kobayashi, S.: Learning Deterministic Policies in Partially Observable Markov Decision Processes, International Conference on Intelligent Autonomous System 5, pp. 250–257 (1998).
7. Ono, N., Ikeda, O., and Rahmani, A. T.: Synthesis of Herding and Specialized Behavior by Modular Q-learning Animats, Proc. of the ALIFE V Poster Presentations, pp. 26–30 (1996).
8. Sen, S. and Sekaran, M.: Multiagent Coordination with Learning Classifier Systems, in Weiss, G. and Sen, S. (eds.), Adaption and Learning in Multi-agent Systems, Berlin, Heidelberg. Springer Verlag, pp. 218–233 (1995).
9. Tan, M.: Multi-Agent Reinforcement Learning: Independent vs. Cooperative Agents, Proc. of the 10th International Conference on Machine Learning, pp. 330–337 (1993).
10. Watkins, C. J. H., and Dayan, P.: Technical note: Q-learning, Machine Learning Vol. 8, pp. 55–68 (1992).
11. Weiss, G.: Learning to Coordinate Actions in Multi-Agent Systems, Proc. of the 13th International Joint Conference on Artificial Intelligence, pp. 311–316 (1993).
12. Whitehead, S. D. and Ballard, D. H.: Active Perception and Reinforcement Learning, Proc. of the 7th International Conference on Machine Learning, pp. 162–169 (1990).

Author information

Authors and Affiliations

Department of Computational Intelligence and Systems Science Interdisciplinary Graduate School of Science and Engineering, Tokyo Institute of Technology, 4259, Nagatsuta, Midori-ku, Yokohama, 226-8502, JAPAN
Kazuteru Miyazaki & Shigenobu Kobayashi

Editor information

Editors and Affiliations

Electrotechnical Laboratories, Umezono 1-1-4, Tsukuba, Ibaraki, 305-0045, Japan
Hideyuki Nakashima
Deakin University Geelong, School of Computing and Mathematics, Victoria 3217, Australia
Chengqi Zhang

About this paper

Cite this paper

Miyazaki, K., Kobayashi, S. (1999). Rationality of Reward Sharing in Multi-agent Reinforcement Learning. In: Nakashima, H., Zhang, C. (eds) Approaches to Intelligent Agents. PRIMA 1999. Lecture Notes in Computer Science, vol 1733. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-46693-2_9

DOI: https://doi.org/10.1007/3-540-46693-2_9
Published: 04 June 2002
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-66823-7
Online ISBN: 978-3-540-46693-2
eBook Packages: Springer Book Archive
