Abstract
Cooperative tasks are a challenging problem that has received extensive attention from the Multi-Agent Reinforcement Learning (MARL) community in recent years. Most current MARL algorithms follow the centralized training with decentralized execution paradigm, which cannot effectively handle the relationship between local and global information during training. Moreover, many algorithms focus on collaborative tasks with a fixed number of agents and do not consider how existing agents should cooperate when new agents enter the environment. To address these problems, we propose a Multi-agent Recurrent Residual Mix model (MAR2MIX). First, we use a dynamic masking technique so that different multi-agent algorithms can operate in dynamic environments. Second, a recurrent residual mixing network efficiently extracts features in the dynamic environment and achieves task collaboration while ensuring effective information transfer between the global state and local agents. We evaluate MAR2MIX in both static and dynamic environments. The results show that our model learns faster than the benchmark models, and that the trained model is more stable and generalizes better, handling agents joining in dynamic environments well.
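The paper's implementation is not reproduced on this page. As a rough illustration of the dynamic-masking idea described above, the following is a minimal sketch: per-agent utilities are zero-padded to a fixed maximum team size, and a 0/1 activity mask excludes empty slots from the value mixing. All names, shapes, and the capacity constant are assumptions for illustration, not the authors' code; the non-negative mixing weights follow the monotonicity constraint familiar from QMIX.

```python
import numpy as np

MAX_AGENTS = 8  # assumed fixed capacity; the true maximum is environment-specific


def masked_monotonic_mix(agent_qs, active_mask, weights):
    """Mix per-agent utilities into a joint value, ignoring empty agent slots.

    agent_qs    : (MAX_AGENTS,) per-agent Q-values, zero-padded.
    active_mask : (MAX_AGENTS,) 1.0 where an agent is present, 0.0 otherwise.
    weights     : (MAX_AGENTS,) mixing weights; taking |w| keeps the joint
                  value monotonic in each agent's utility, as in QMIX.
    """
    return float(np.sum(np.abs(weights) * agent_qs * active_mask))


# Two agents are active; a newly joined third agent would simply
# flip its mask entry from 0.0 to 1.0 without changing any shapes.
qs = np.array([1.0, 2.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])
mask = np.array([1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])
w = np.ones(MAX_AGENTS)
print(masked_monotonic_mix(qs, mask, w))  # -> 3.0
```

Because all tensors keep a fixed shape regardless of the current team size, the same trained network can be applied as agents join or leave, which is what lets a single model operate in the dynamic setting.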
This work is supported in part by the Shanghai Key Research Lab. of NSAI, China, and the Joint Lab. on Networked AI Edge Computing Fudan University-Changan.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Cite this paper
Fang, G., Liu, Y., Liu, J., Song, L. (2023). MAR2MIX: A Novel Model for Dynamic Problem in Multi-agent Reinforcement Learning. In: Tanveer, M., Agarwal, S., Ozawa, S., Ekbal, A., Jatowt, A. (eds) Neural Information Processing. ICONIP 2022. Communications in Computer and Information Science, vol 1791. Springer, Singapore. https://doi.org/10.1007/978-981-99-1639-9_56
Print ISBN: 978-981-99-1638-2
Online ISBN: 978-981-99-1639-9