Achieving Multiagent Coordination Through CALA-rFMQ Learning in Continuous Action Space

Liu, Wanshu; Zhang, Chengwei; Yang, Tianpei; Hao, Jianye; Li, Xiaohong; Bao, Zhijie

doi:10.1007/978-3-319-97310-4_15

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11013)

Included in the following conference series:

Pacific Rim International Conference on Artificial Intelligence

Abstract

In cooperative multiagent systems, an agent often needs to coordinate with other agents to optimize both individual and system-level payoffs. Many multiagent learning approaches have been proposed to address coordination problems in cooperative environments with discrete actions. Coordination becomes more challenging in continuous action spaces, where learners can suffer from slow convergence and from convergence to suboptimal policies. In this paper, we propose a novel algorithm called CALA-rFMQ (Continuous Action Learning Automata with recursive Frequency Maximum Q-Value) that ensures robust and efficient coordination among multiple agents in continuous action spaces. Experimental results show that CALA-rFMQ facilitates efficient coordination and outperforms previous work.

Acknowledgments

This work is supported by the National Natural Science Foundation of China under Grant No. 61702362 and by the Special Program of Artificial Intelligence of the Tianjin Municipal Science and Technology Commission (No. 17ZXRGGX00150).

Author information

Authors and Affiliations

School of Computer Software, Tianjin University, Tianjin, China
Wanshu Liu, Tianpei Yang & Jianye Hao
School of Computer Science and Technology, Tianjin University, Tianjin, China
Chengwei Zhang & Xiaohong Li
School of Textiles, Tianjin Polytechnic University, Tianjin, China
Zhijie Bao

Corresponding author

Correspondence to Jianye Hao.

Editor information

Editors and Affiliations

Southeast University, Nanjing, China
Xin Geng
University of Tasmania, Hobart, Tasmania, Australia
Byeong-Ho Kang

Cite this paper

Liu, W., Zhang, C., Yang, T., Hao, J., Li, X., Bao, Z. (2018). Achieving Multiagent Coordination Through CALA-rFMQ Learning in Continuous Action Space. In: Geng, X., Kang, BH. (eds) PRICAI 2018: Trends in Artificial Intelligence. PRICAI 2018. Lecture Notes in Computer Science, vol 11013. Springer, Cham. https://doi.org/10.1007/978-3-319-97310-4_15

DOI: https://doi.org/10.1007/978-3-319-97310-4_15
Published: 27 July 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-97309-8
Online ISBN: 978-3-319-97310-4
eBook Packages: Computer Science, Computer Science (R0)
