MAR2MIX: A Novel Model for Dynamic Problem in Multi-agent Reinforcement Learning

  • Conference paper

Neural Information Processing (ICONIP 2022)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1791)

Abstract

As a challenging problem in the Multi-Agent Reinforcement Learning (MARL) community, cooperative tasks have received extensive attention in recent years. Most current MARL algorithms adopt the centralized-training, distributed-execution approach, which cannot effectively handle the relationship between local and global information during training. Moreover, many algorithms focus on collaborative tasks with a fixed number of agents and do not consider how existing agents should cooperate when new agents enter the environment. To address these problems, we propose a Multi-agent Recurrent Residual Mix model (MAR2MIX). First, we use dynamic masking techniques to ensure that different multi-agent algorithms can operate in dynamic environments. Second, through the recurrent residual mixing network, we efficiently extract features in the dynamic environment and achieve task collaboration while ensuring effective information transfer between global and local agents. We evaluate MAR2MIX in both non-dynamic and dynamic environments. The results show that our model learns faster than other benchmark models, and the trained model is more stable and generalizes better, handling the problem of agents joining in dynamic environments well.
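To make the dynamic-masking idea concrete, the sketch below shows one common way to handle a variable number of agents: reserve a fixed number of agent slots, mark inactive slots with a binary mask, and zero out their per-agent Q-values before a monotonic (QMIX-style, non-negative-weight) mixing step. This is a minimal NumPy illustration under stated assumptions; `masked_mix`, the slot layout, and the fixed weights are hypothetical and are not the paper's actual implementation.

```python
import numpy as np

def masked_mix(agent_qs, active_mask, weights, bias):
    """Combine per-agent Q-values into a global Q, ignoring inactive slots.

    agent_qs:    per-slot Q-values, shape (n_slots,)
    active_mask: 1.0 for slots holding a live agent, 0.0 for empty slots
    weights:     mixing weights; abs() enforces QMIX-style monotonicity
    """
    masked = agent_qs * active_mask        # inactive agents contribute nothing
    w = np.abs(weights)                    # non-negative weights keep mixing monotonic
    return float(masked @ w + bias)

# 4 agent slots, 3 active; the last slot is reserved for a late-joining agent,
# so its (garbage) Q-value must not leak into the global estimate.
qs = np.array([1.0, 2.0, 3.0, 99.0])
mask = np.array([1.0, 1.0, 1.0, 0.0])
w = np.array([0.5, 0.5, 0.5, 0.5])
print(masked_mix(qs, mask, w, 0.0))  # 3.0
```

When a new agent joins, its slot's mask entry flips to 1.0 and its Q-value starts contributing to the global value, so the same network can be trained and executed regardless of the current team size.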

This work is supported in part by the Shanghai Key Research Lab of NSAI, China, and the Joint Lab on Networked AI Edge Computing, Fudan University-Changan.



Author information

Correspondence to Liang Song.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Fang, G., Liu, Y., Liu, J., Song, L. (2023). MAR2MIX: A Novel Model for Dynamic Problem in Multi-agent Reinforcement Learning. In: Tanveer, M., Agarwal, S., Ozawa, S., Ekbal, A., Jatowt, A. (eds) Neural Information Processing. ICONIP 2022. Communications in Computer and Information Science, vol 1791. Springer, Singapore. https://doi.org/10.1007/978-981-99-1639-9_56

  • DOI: https://doi.org/10.1007/978-981-99-1639-9_56

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-1638-2

  • Online ISBN: 978-981-99-1639-9

  • eBook Packages: Computer Science (R0)
