Transform networks for cooperative multi-agent deep reinforcement learning

Published in: Applied Intelligence

Abstract

In recent years, multi-agent deep reinforcement learning has developed rapidly, with value-based algorithms playing an important role, such as Monotonic Value Function Factorisation (QMIX) and Learning to Factorize with Transformation for Cooperative Multi-Agent Reinforcement Learning (QTRAN). Nevertheless, the performance of current value-based multi-agent algorithms in complex scenarios can still be improved. In value-based models, a mixing network typically combines the local action values of the individual agents into a joint action value; under partial observability, however, this mixing suffers from misalignment between the local values and produces unsatisfactory results. This paper proposes a multi-agent model called Transform Networks, which transforms the individual local action-value function produced by each agent network into an individual global action-value function. This avoids the misalignment caused by partial observability when the individual action values are mixed, so that the joint action value faithfully represents the cooperative behaviour of all agents. Using the StarCraft Multi-Agent Challenge (SMAC) as the experimental platform, a comparison on five different maps shows that the proposed method outperforms the current state-of-the-art baseline algorithms.
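The core idea in the abstract can be sketched in a few lines of NumPy. The sketch below contrasts a plain additive mixing of local action values (in the style of value-decomposition baselines) with transforming each agent's local values into state-conditioned global values before combining them. The affine transform with positive scales is a hypothetical stand-in for the paper's transform network, chosen only to illustrate the local-to-global mapping; the names and shapes here are assumptions, not the authors' architecture.

```python
import numpy as np

rng = np.random.default_rng(0)
n_agents, n_actions, state_dim = 3, 5, 8

# Per-agent local action values Q_i(tau_i, a), as each agent network would
# produce them from its own partial observation history.
local_q = rng.normal(size=(n_agents, n_actions))

def additive_mix(local_q, actions):
    """Additive baseline: the joint value is the sum of the chosen local values."""
    return float(local_q[np.arange(len(actions)), actions].sum())

def transform_then_mix(local_q, actions, state, W, b):
    """Hypothetical sketch of the transform idea: a state-conditioned affine map
    turns each agent's *local* action values into *global* action values before
    they are combined, so the combination is informed by the full state rather
    than distorted by partial observability. W and b are illustrative per-agent
    transform parameters conditioned on the global state."""
    scale = np.abs(W @ state)        # (n_agents,) strictly positive scales
    shift = b @ state                # (n_agents,) per-agent offsets
    global_q = scale[:, None] * local_q + shift[:, None]
    return float(global_q[np.arange(len(actions)), actions].sum())

state = rng.normal(size=state_dim)
W = rng.normal(size=(n_agents, state_dim))
b = rng.normal(size=(n_agents, state_dim))
actions = local_q.argmax(axis=1)     # greedy decentralized action selection
joint_q = transform_then_mix(local_q, actions, state, W, b)
```

Keeping the per-agent scales positive preserves each agent's greedy action under the transform, i.e. the individual-global-max consistency that value factorization methods rely on for decentralized execution.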

Figs. 1–6

References

  1. Liu T, Liu H, Li YF, Chen Z, Zhang Z, Liu S (2019) Flexible FTIR spectral imaging enhancement for industrial robot infrared vision sensing. IEEE Trans Industr Inform 16(1):544–554

  2. Liu T, Liu H, Li YF, Zhang Z, Liu S (2018) Efficient blind signal reconstruction with wavelet transforms regularization for educational robot infrared vision sensing. IEEE/ASME Trans Mechatron 24(1):384–394

  3. Liu H, Nie H, Zhang Z, Li YF (2021) Anisotropic angle distribution learning for head pose estimation and attention understanding in human-computer interaction. Neurocomputing 433:310–322

  4. Hoffmann R, Zhang C, Ling X, Zettlemoyer L, Weld DS (2011) Knowledge-based weak supervision for information extraction of overlapping relations. In: Proceedings of the 49th annual meeting of the association for computational linguistics: human language technologies, pp 541–550

  5. Tan M (1993) Multi-agent reinforcement learning: Independent vs. cooperative agents

  6. Liu H, Liu T, Zhang Z, Sangaiah AK, Yang B, Li Y (2022) ARHPE: Asymmetric relation-aware representation learning for head pose estimation in industrial human–machine interaction. IEEE Trans Ind Inf

  7. Liu H, Fang S, Zhang Z, Li D, Lin K, Wang J (2021) MFDNet: Collaborative poses perception and matrix Fisher distribution for head pose estimation. IEEE Transactions on Multimedia

  8. Liu H, Zheng C, Li D, Shen X, Lin K, Wang J, Zhang Z, Zhang Z, Xiong NN (2021) EDMF: Efficient deep matrix factorization with review feature learning for industrial recommender system. IEEE Transactions on Industrial Informatics

  9. Li Z, Liu H, Zhang Z, Liu T, Xiong NN (2021) Learning knowledge graph embedding with heterogeneous relation attention networks. IEEE Transactions on Neural Networks and Learning Systems

  10. Liu H, Zheng C, Li D, Zhang Z, Lin K, Shen X, Xiong NN, Wang J (2022) Multi-perspective social recommendation method with graph representation learning. Neurocomputing 468:469–481

  11. Gupta JK, Egorov M, Kochenderfer M (2017) Cooperative multi-agent control using deep reinforcement learning. In: International conference on autonomous agents and multiagent systems. Springer, pp 66–83

  12. Oliehoek FA, Spaan MTJ, Vlassis N (2008) Optimal and approximate Q-value functions for decentralized POMDPs. J Artif Intell Res 32:289–353

  13. Lowe R, Wu YI, Tamar A, Harb J, Abbeel P, Mordatch I (2017) Multi-agent actor-critic for mixed cooperative-competitive environments. arXiv:1706.02275

  14. Foerster J, Farquhar G, Afouras T, Nardelli N, Whiteson S (2018) Counterfactual multi-agent policy gradients. In: Proceedings of the AAAI Conference on artificial intelligence, vol 32

  15. Sunehag P, Lever G, Gruslys A, Czarnecki WM, Zambaldi V, Jaderberg M, Lanctot M, Sonnerat N, Leibo JZ, Tuyls K et al (2017) Value-decomposition networks for cooperative multi-agent learning. arXiv:1706.05296

  16. Rashid T, Samvelyan M, Schroeder C, Farquhar G, Foerster J, Whiteson S (2018) QMIX: Monotonic value function factorisation for deep multi-agent reinforcement learning. In: International conference on machine learning. PMLR, pp 4295–4304

  17. Son K, Kim D, Kang WJ, Hostallero DE, Yi Y (2019) QTRAN: Learning to factorize with transformation for cooperative multi-agent reinforcement learning. In: International conference on machine learning. PMLR, pp 5887–5896

  18. Samvelyan M, Rashid T, De Witt CS, Farquhar G, Nardelli N, Rudner TGJ, Hung CM, Torr PHS, Foerster J, Whiteson S (2019) The StarCraft multi-agent challenge. arXiv:1902.04043

  19. Zhao S, Grishman R (2005) Extracting relations with integrated information using kernel methods. In: Proceedings of the 43rd annual meeting of the association for computational linguistics (acl’05), pp 419–426

  20. Wen C, Yao X, Wang Y, Tan X (2020) SMIX(λ): Enhancing centralized value functions for cooperative multi-agent reinforcement learning. In: Proceedings of the AAAI Conference on artificial intelligence, vol 34, pp 7301–7308

  21. Mahajan A, Rashid T, Samvelyan M, Whiteson S (2019) Maven: Multi-agent variational exploration. Advances in Neural Information Processing Systems 32

  22. Wang J, Ren Z, Liu T, Yu Y, Zhang C (2020) QPLEX: Duplex dueling multi-agent Q-learning. arXiv:2008.01062

  23. Du Y, Han L, Fang M, Liu J, Dai T, Tao D (2019) LIIR: Learning individual intrinsic reward in multi-agent reinforcement learning

  24. Wang W, Yang T, Liu Y, Hao J, Hao X, Hu Y, Chen Y, Fan C, Gao Y (2019) Action semantics network: Considering the effects of actions in multiagent systems. arXiv:1907.11461

  25. Vinyals O, Ewalds T, Bartunov S, Georgiev P, Vezhnevets AS, Yeo M, Makhzani A, Küttler H, Agapiou J, Schrittwieser J et al (2017) StarCraft II: A new challenge for reinforcement learning. arXiv:1708.04782

  26. Oliehoek FA, Amato C (2015) A concise introduction to decentralized POMDPs

  27. Littman ML (1994) Markov games as a framework for multi-agent reinforcement learning. In: Machine learning proceedings 1994. Elsevier, pp 157–163

  28. Wiering MA, Van Otterlo M (eds) (2012) Reinforcement learning: State-of-the-art. Adaptation, learning, and optimization, vol 12. Springer

  29. Szepesvári C (2010) Algorithms for reinforcement learning. Synthesis lectures on artificial intelligence and machine learning. Morgan & Claypool

Acknowledgements

This work is supported by the Key laboratory fund of underwater acoustic countermeasure technology (SSDKKFJJ-2019-02-03).

Author information

Corresponding author

Correspondence to Lianke Zhou.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Wang, H., Xie, X. & Zhou, L. Transform networks for cooperative multi-agent deep reinforcement learning. Appl Intell 53, 9261–9269 (2023). https://doi.org/10.1007/s10489-022-03924-3
