Abstract
The reinforcement learning (RL) process of each agent can benefit substantially when multiple geographically distributed agents perform their separate RL tasks cooperatively. This differs from multi-agent reinforcement learning (MARL), where multiple agents share a common environment and must learn to cooperate or compete with each other: here each agent has its own environment and communicates with the others only to share knowledge, with no cooperative or competitive behaviour as a learning outcome. This scenario is common in real life and its concept can be used in many applications, yet it is neither well understood nor well formulated. As a first effort, we propose the group-agent system for RL as a formulation of this scenario and as a third type of RL system alongside single-agent and multi-agent systems. We then propose a distributed RL framework called DDAL (Decentralised Distributed Asynchronous Learning), designed for group-agent reinforcement learning (GARL). Our experiments show that DDAL achieves desirable performance with very stable training and good scalability.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Wu, K., Zeng, XJ. (2023). Group-Agent Reinforcement Learning. In: Iliadis, L., Papaleonidas, A., Angelov, P., Jayne, C. (eds) Artificial Neural Networks and Machine Learning – ICANN 2023. ICANN 2023. Lecture Notes in Computer Science, vol 14259. Springer, Cham. https://doi.org/10.1007/978-3-031-44223-0_4
DOI: https://doi.org/10.1007/978-3-031-44223-0_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-44222-3
Online ISBN: 978-3-031-44223-0
eBook Packages: Computer Science (R0)