MTMA-DDPG: A Deep Deterministic Policy Gradient Reinforcement Learning for Multi-task Multi-agent Environments

  • Conference paper
Artificial Intelligence Applications and Innovations (AIAI 2022)

Part of the book series: IFIP Advances in Information and Communication Technology (IFIPAICT, volume 646)

Abstract

Going beyond multi-task or multi-agent learning alone, we develop in this work a multi-agent reinforcement learning algorithm for multi-task environments. Our proposed algorithm, Multi-Task Multi-Agent Deep Deterministic Policy Gradient (MTMA-DDPG; code available at https://gitlab.com/awadailab/mtmaddpg), extends its single-task counterpart by running multiple tasks on distributed nodes and communicating parameters across the nodes via predetermined coefficients. Parameter sharing is modulated through temporal decay of the communication coefficients. Training across nodes is parallelized without any centralized controller for the different tasks, which allows parallel computing resources to be leveraged flexibly to improve multi-agent (MA) learning.
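The parameter-sharing idea described above can be illustrated with a minimal sketch. The abstract does not specify the decay schedule or the mixing rule, so the exponential decay, the pull-toward-neighbor update, and the names `decayed_coefficient` and `share_parameters` are all assumptions made for illustration. They are not taken from the authors' implementation.

```python
import math


def decayed_coefficient(c0, decay_rate, step):
    """Communication coefficient at a given step.

    ASSUMPTION: exponential decay; the abstract only says the
    coefficients decay over time, not how.
    """
    return c0 * math.exp(-decay_rate * step)


def share_parameters(params, coeffs, step, decay_rate=0.01):
    """One hypothetical round of parameter sharing between task nodes.

    params: list of flat parameter vectors (lists of floats), one per node.
    coeffs: coeffs[i][j] is the predetermined initial weight that node i
            gives to node j's parameters (diagonal entries are ignored).
    Each node only reads its peers' parameters, so no central controller
    is needed; the nodes are simulated in one process here.
    """
    n = len(params)
    mixed_all = []
    for i in range(n):
        mixed = list(params[i])
        for j in range(n):
            if j == i:
                continue
            c = decayed_coefficient(coeffs[i][j], decay_rate, step)
            # ASSUMPTION: move node i's parameters toward node j's by c.
            for k in range(len(mixed)):
                mixed[k] += c * (params[j][k] - params[i][k])
        mixed_all.append(mixed)
    return mixed_all


# Two nodes with different parameters and symmetric initial coefficients.
params = [[0.0, 0.0], [1.0, 1.0]]
coeffs = [[0.0, 0.5], [0.5, 0.0]]

early = share_parameters(params, coeffs, step=0)
late = share_parameters(params, coeffs, step=10_000)
```

Under these assumptions, sharing is strongest early in training, when nodes can benefit from each other's progress, and fades as the coefficients decay, so that each node ends up specializing on its own task.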

Empirically, we design different MA particle environments, where tasks are similar or heterogeneous. We study the performance of MTMA-DDPG in terms of reward, convergence, variance, and communication overhead. We demonstrate the improvement of our algorithm over its single-task counterpart, as well as the importance of a versatile technique to take advantage of parallel computing resources.

J. El Zini and J. Hajar contributed equally.

Acknowledgment

This work was supported by the University Research Board (URB) and the Maroun Semaan Faculty of Engineering and Architecture (MSFEA) at the American University of Beirut.

Corresponding author

Correspondence to Mariette Awad.

Copyright information

© 2022 IFIP International Federation for Information Processing

About this paper

Cite this paper

Hamadeh, K., El Zini, J., Hajar, J., Awad, M. (2022). MTMA-DDPG: A Deep Deterministic Policy Gradient Reinforcement Learning for Multi-task Multi-agent Environments. In: Maglogiannis, I., Iliadis, L., Macintyre, J., Cortez, P. (eds) Artificial Intelligence Applications and Innovations. AIAI 2022. IFIP Advances in Information and Communication Technology, vol 646. Springer, Cham. https://doi.org/10.1007/978-3-031-08333-4_22

  • DOI: https://doi.org/10.1007/978-3-031-08333-4_22

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-08332-7

  • Online ISBN: 978-3-031-08333-4

  • eBook Packages: Computer Science, Computer Science (R0)
