Abstract
Multi-agent reinforcement learning algorithms require large quantities of interactions with the environment and with other agents to derive an approximately optimal policy. However, these algorithms may struggle with the complex interactive relationships between agents and tend to explore the whole observation space aimlessly, which results in high learning complexity. Motivated by the fact that interactions between agents are typically occasional and local in most real-world scenarios, in this paper we propose a general framework named Discrepancy-Driven Multi-Agent reinforcement learning (DDMA) to overcome this limitation. In this framework, we first parse the semantic components of each agent's observation and introduce a proliferative network that initializes the multi-agent policy directly from the corresponding single-agent optimal policy, bypassing the misalignment between the observation spaces of the two settings. We then model the occasional interactions among agents based on the discrepancy between these two policies, and concentrate exploration on the regions where agents interact frequently. With this direct initialization and focused multi-agent policy learning, our framework accelerates the learning process and significantly improves asymptotic performance. Experimental results on a toy example and several classic benchmarks demonstrate that DDMA achieves superior performance compared to baseline methods.
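Since the full method is not reproduced on this page, the snippet below is only a minimal, hypothetical sketch of the discrepancy signal the abstract describes: it treats the KL divergence between a pretrained single-agent policy and the current multi-agent policy at a given observation as evidence of agent interaction, and scales an exploration bonus by it. All function and variable names here are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def policy_discrepancy(single_probs: np.ndarray,
                       multi_probs: np.ndarray,
                       eps: float = 1e-8) -> float:
    """KL(pi_single || pi_multi) at one observation (illustrative).

    A large value is read as evidence that other agents change the
    best response here, i.e. that this is an interaction region.
    """
    p = np.clip(single_probs, eps, 1.0)
    q = np.clip(multi_probs, eps, 1.0)
    p, q = p / p.sum(), q / q.sum()  # renormalize after clipping
    return float(np.sum(p * np.log(p / q)))

def exploration_bonus(discrepancy: float, beta: float = 0.1) -> float:
    """Intrinsic reward scaled by the discrepancy, so that exploration
    concentrates on states where agents appear to interact."""
    return beta * discrepancy

# Toy usage: a 4-action observation where the two policies disagree.
single = np.array([0.7, 0.1, 0.1, 0.1])      # pretrained single-agent policy
multi = np.array([0.25, 0.25, 0.25, 0.25])   # current multi-agent policy
d = policy_discrepancy(single, multi)
print(f"discrepancy={d:.3f}, bonus={exploration_bonus(d):.3f}")
```

In such a sketch, the bonus would be added to the environment reward during multi-agent training, so that low-discrepancy states (where the single-agent policy already suffices) receive little exploration pressure.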
Notes
1. Please refer to https://github.com/chaobiubiu/DDMA for more details.
Acknowledgements
This work is supported by the Shenzhen Fundamental Research Program (No. 2021Szvup056), the Primary Research & Development Plan of Jiangsu Province (No. BE2021028), the National Natural Science Foundation of China (No. 62192783), and the Science and Technology Innovation 2030 New Generation Artificial Intelligence Major Project (No. 2018AAA0100905).
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Li, C., Hu, Y., Tian, P., Dong, S., Gao, Y. (2022). DDMA: Discrepancy-Driven Multi-agent Reinforcement Learning. In: Khanna, S., Cao, J., Bai, Q., Xu, G. (eds) PRICAI 2022: Trends in Artificial Intelligence. PRICAI 2022. Lecture Notes in Computer Science, vol 13631. Springer, Cham. https://doi.org/10.1007/978-3-031-20868-3_7
DOI: https://doi.org/10.1007/978-3-031-20868-3_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-20867-6
Online ISBN: 978-3-031-20868-3
eBook Packages: Computer Science, Computer Science (R0)