Abstract
Going beyond multi-task or multi-agent learning in isolation, we develop a multi-agent reinforcement learning algorithm that handles multi-task environments. Our proposed algorithm, Multi-Task Multi-Agent Deep Deterministic Policy Gradient (MTMA-DDPG; code available at https://gitlab.com/awadailab/mtmaddpg), extends its single-task counterpart by running multiple tasks on distributed nodes and communicating parameters across the nodes via pre-determined coefficients. Parameter sharing is modulated through temporal decay of the communication coefficients. Training across nodes is parallelized without any centralized controller for the different tasks, which opens the way to flexible leveraging of parallel processing to improve multi-agent (MA) learning.
Empirically, we design different MA particle environments in which the tasks are either similar or heterogeneous. We study the performance of MTMA-DDPG in terms of reward, convergence, variance, and communication overhead. We demonstrate the improvement of our algorithm over its single-task counterpart, as well as the importance of a versatile technique that takes advantage of parallel computing resources.
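The abstract does not state the exact sharing rule, so the following is only a minimal sketch of one way parameter communication with temporally decaying coefficients could look. The function names, the exponential decay schedule, and the convex-combination update are all assumptions made for illustration and are not taken from the MTMA-DDPG code. Each node keeps its own parameter vector and, at every communication step, blends in its peers' parameters with a weight that shrinks over time.

```python
from typing import List

Vector = List[float]


def decayed_coefficient(base: float, decay_rate: float, step: int) -> float:
    """Hypothetical schedule: exponential decay of a communication coefficient."""
    return base * (decay_rate ** step)


def mix_parameters(
    params: List[Vector],
    base_coeffs: List[List[float]],
    decay_rate: float,
    step: int,
) -> List[Vector]:
    """Blend each node's parameters with its peers' parameters.

    Assumptions (not from the paper): base_coeffs[i][j] is the pre-determined
    weight node i gives to node j, the off-diagonal weights in each row sum
    to at most 1, and the remaining weight stays on the node's own parameters.
    There is no central controller: every row is computed independently.
    """
    n = len(params)
    mixed: List[Vector] = []
    for i in range(n):
        # Decayed weights for peers; a node's own entry is handled separately.
        weights = [
            decayed_coefficient(base_coeffs[i][j], decay_rate, step) if j != i else 0.0
            for j in range(n)
        ]
        self_weight = 1.0 - sum(weights)
        new_vec = [
            self_weight * params[i][k]
            + sum(weights[j] * params[j][k] for j in range(n))
            for k in range(len(params[i]))
        ]
        mixed.append(new_vec)
    return mixed
```

Under these assumptions, two nodes with parameters `[0.0, 0.0]` and `[1.0, 1.0]`, base coefficients of 0.5, and a decay rate of 0.5 average fully at step 0 and move only a quarter of the way toward each other at step 1. As the step count grows, each node's parameters are left almost untouched, so the nodes specialize on their own tasks.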
J. El Zini and J. Hajar contributed equally.
Acknowledgment
This work was supported by the University Research Board (URB) and the Maroun Semaan Faculty of Engineering and Architecture (MSFEA) at the American University of Beirut.
Copyright information
© 2022 IFIP International Federation for Information Processing
About this paper
Cite this paper
Hamadeh, K., El Zini, J., Hajar, J., Awad, M. (2022). MTMA-DDPG: A Deep Deterministic Policy Gradient Reinforcement Learning for Multi-task Multi-agent Environments. In: Maglogiannis, I., Iliadis, L., Macintyre, J., Cortez, P. (eds) Artificial Intelligence Applications and Innovations. AIAI 2022. IFIP Advances in Information and Communication Technology, vol 646. Springer, Cham. https://doi.org/10.1007/978-3-031-08333-4_22
DOI: https://doi.org/10.1007/978-3-031-08333-4_22
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-08332-7
Online ISBN: 978-3-031-08333-4
eBook Packages: Computer Science, Computer Science (R0)