
DDMA: Discrepancy-Driven Multi-agent Reinforcement Learning

  • Conference paper

PRICAI 2022: Trends in Artificial Intelligence (PRICAI 2022)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13631)


Abstract

Multi-agent reinforcement learning algorithms depend on large quantities of interactions with the environment and other agents to derive an approximately optimal policy. However, these algorithms may struggle with the complex interactive relationships between agents and tend to explore the whole observation space aimlessly, which results in high learning complexity. Motivated by the observation that interactions between multiple agents in most real-world scenarios are occasional and local, in this paper we propose a general framework named Discrepancy-Driven Multi-Agent reinforcement learning (DDMA) to overcome this limitation. In this framework, we first parse the semantic components of each agent's observation and introduce a proliferative network to initialize the multi-agent policy directly with the corresponding single-agent optimal policy, which bypasses the misalignment of observation spaces across scenarios. We then model the occasional interactions among agents based on the discrepancy between these two policies, and conduct more focused exploration in areas where agents interact frequently. With direct initialization and focused multi-agent policy learning, our framework accelerates the learning process and significantly improves asymptotic performance. Experimental results on a toy example and several classic benchmarks demonstrate that our framework obtains superior performance compared to baseline methods.
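To make the core idea concrete, here is a minimal sketch of one plausible reading of the discrepancy-driven mechanism described above: the multi-agent policy is initialized from a frozen single-agent policy, and the divergence between the two policies' action distributions at a given observation serves as a measure of how strongly other agents matter there, which can then drive an exploration bonus. Everything in this sketch (the function names `policy_discrepancy` and `shaped_reward`, the KL-based discrepancy measure, and the coefficient `beta`) is an illustrative assumption, not the authors' implementation; the actual method is detailed in the paper and the repository linked in the Notes.

```python
# Hypothetical sketch of discrepancy-driven exploration (not the authors' code).
# Assumption: both policies output categorical action distributions over a
# shared action space, and the single-agent policy is kept frozen as a reference.
import numpy as np

def softmax(logits):
    z = logits - logits.max()
    e = np.exp(z)
    return e / e.sum()

def policy_discrepancy(single_logits, multi_logits, eps=1e-8):
    """KL(multi || single): how far the multi-agent policy has moved away
    from its single-agent initialization at this observation."""
    p = softmax(multi_logits)
    q = softmax(single_logits)
    return float(np.sum(p * (np.log(p + eps) - np.log(q + eps))))

def shaped_reward(env_reward, single_logits, multi_logits, beta=0.1):
    """Add an exploration bonus proportional to the policy discrepancy,
    so that states where interaction forces the policy to deviate from
    the single-agent optimum are explored more often."""
    bonus = policy_discrepancy(single_logits, multi_logits)
    return env_reward + beta * bonus

# Toy usage: a random observation where the two policies mildly disagree.
rng = np.random.default_rng(0)
single_logits = rng.normal(size=5)                             # frozen reference head
multi_logits = single_logits + rng.normal(scale=0.5, size=5)   # adapted multi-agent head
print(shaped_reward(env_reward=1.0,
                    single_logits=single_logits,
                    multi_logits=multi_logits))
```

Under this reading, states where the multi-agent policy must deviate sharply from its single-agent initialization are exactly the interaction-heavy regions, so weighting exploration by the discrepancy concentrates learning where coordination is actually needed rather than across the whole observation space.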


Notes

  1. Please refer to https://github.com/chaobiubiu/DDMA for more details.


Acknowledgements

This work is supported by the Shenzhen Fundamental Research Program (No. 2021Szvup056), the Primary Research & Development Plan of Jiangsu Province (No. BE2021028), the National Natural Science Foundation of China (No. 62192783), and the Science and Technology Innovation 2030 New Generation Artificial Intelligence Major Project (No. 2018AAA0100905).

Author information


Corresponding author

Correspondence to Yang Gao.



Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Li, C., Hu, Y., Tian, P., Dong, S., Gao, Y. (2022). DDMA: Discrepancy-Driven Multi-agent Reinforcement Learning. In: Khanna, S., Cao, J., Bai, Q., Xu, G. (eds) PRICAI 2022: Trends in Artificial Intelligence. PRICAI 2022. Lecture Notes in Computer Science, vol 13631. Springer, Cham. https://doi.org/10.1007/978-3-031-20868-3_7


  • DOI: https://doi.org/10.1007/978-3-031-20868-3_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-20867-6

  • Online ISBN: 978-3-031-20868-3

  • eBook Packages: Computer Science, Computer Science (R0)
