Transform networks for cooperative multi-agent deep reinforcement learning

Wang, Hongbin; Xie, Xiaodong; Zhou, Lianke

doi:10.1007/s10489-022-03924-3

Transform networks for cooperative multi-agent deep reinforcement learning

Published: 08 August 2022

Volume 53, pages 9261–9269, (2023)
Cite this article

Applied Intelligence Aims and scope Submit manuscript

488 Accesses
1 Altmetric
Explore all metrics

Abstract

In recent years, the Multi-agent Deep Reinforcement Learning Algorithm has been developing rapidly, in which the value method-based algorithm plays an important role (such as Monotonic Value Function Factorisation (QMIX) and Learning to Factorize with Transformation for Cooperative Multi-Agent Reinforcement learning (QTRAN)). In spite of the fact, the performance of current value-based multi-agent algorithm under complex scene still can be further improved. In value function-based model, a mixing network is usually used to mix the local action value of each agent to get joint action value when the partial observability will cause the problem of misalignment and unsatisfying mixing results. This paper proposes a multi-agent model called Transform Networks that transform the individual local action-value function gotten by agent network to individual global action-value function, which will avoid the problem of misalignment caused by partial observability when the individual action value is mixed, and the joint action value can represent the cooperative conditions of all agents well. Using the StarCraft Multi-Agent Challenge (SMAC) as the experimental platform, the comparison of the performance of algorithms on five different maps proved that the proposed method has better effect than the current most advanced baseline algorithms.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Log in via an institution

Subscribe and save

Springer+ Basic

$34.99 /Month

Get 10 units per month
Download Article/Chapter or eBook
1 Unit = 1 Article or 1 Chapter
Cancel anytime

Buy Now

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

GHQ: grouped hybrid Q-learning for cooperative heterogeneous multi-agent reinforcement learning

Article Open access 23 April 2024

MAIT: Multi-agent Local Observation Interaction to Improve the Decision-Making Ability

An Improved Weighted QMIX Based on Weight Function with Q-Value

Discover the latest articles, news and stories from top researchers in related subjects.

Artificial Intelligence

References

Liu T, Liu H, Li YF, Chen Z, Zhang Z, Liu S (2019) Flexible ftir spectral imaging enhancement for industrial robot infrared vision sensing. IEEE Trans Industr Inform 16(1):544–554
Article Google Scholar
Liu T, Liu H, Li YF, Zhang Z, Liu S (2018) Efficient blind signal reconstruction with wavelet transforms regularization for educational robot infrared vision sensing. IEEE/ASME Trans Mechatron 24(1):384–394
Article Google Scholar
Liu H, Nie H, Zhang Z, Li YF (2021) Anisotropic angle distribution learning for head pose estimation and attention understanding in human-computer interaction. Neurocomputing 433:310–322
Article Google Scholar
Hoffmann R, Zhang C, Ling X, Zettlemoyer L, Weld DS (2011) Knowledge-based weak supervision for information extraction of overlapping relations. In: Proceedings of the 49th annual meeting of the association for computational linguistics: human language technologies, pp 541–550
Tan M (1993) Multi-agent reinforcement learning: Independent vs. cooperative agents
Liu H, Liu T, Zhang Z, Sangaiah AK, Yang B, Li Y ARHPE (2022) Asymmetric relation-aware representation learning for head pose estimation in industrial human–machine interaction. IEEE Trans Ind Inf
Liu H, Fang S, Zhang Z, Li D, Lin K, Wang J (2021) Mfdnet: Collaborative poses perception and matrix fisher distribution for head pose estimation. IEEE Transactions on Multimedia
Liu H, Zheng C, Li D, Shen X, Lin K, Wang J, Zhang Z, Zhang Z, Xiong NN (2021) Edmf: Efficient deep matrix factorization with review feature learning for industrial recommender system IEEE Transactions on Industrial Informatics
Li Z, Liu H, Zhang Z, Liu T, Xiong NN (2021) Learning knowledge graph embedding with heterogeneous relation attention networks. IEEE Transactions on Neural Networks and Learning Systems
Liu H, Zheng C, Li D, Zhang Z, Ke Lin, Shen X, Xiong NN, Wang J (2022) Multi-perspective social recommendation method with graph representation learning. Neurocomputing 468:469–481
Article Google Scholar
Gupta JK, Egorov M, Kochenderfer M (2017) Cooperative multi-agent control using deep reinforcement learning. In: International conference on autonomous agents and multiagent systems. Springer, pp 66–83
Oliehoek FA, Spaan MTJ, Vlassis N (2008) Optimal and approximate q-value functions for decentralized pomdps. J Artif Intell Res 32:289–353
Article MathSciNet MATH Google Scholar
Lowe R, Wu YI, Tamar A, Harb J, Abbeel P, Mordatch I (2017) Multi-agent actor-critic for mixed cooperative-competitive environments. arXiv:1706.02275
Foerster J, Farquhar G, Afouras T, Nardelli N, Whiteson S (2018) Counterfactual multi-agent policy gradients. In: Proceedings of the AAAI Conference on artificial intelligence, vol 32
Sunehag P, Lever G, Gruslys A, Czarnecki WM, Zambaldi V, Jaderberg M, Lanctot M, Sonnerat N, Leibo JZ , Tuyls K et al (2017) Value-decomposition networks for cooperative multi-agent learning. arXiv:1706.05296
Rashid T, Samvelyan M, Schroeder C, Farquhar G, Foerster J, Whiteson S (2018) Qmix: Monotonic value function factorisation for deep multi-agent reinforcement learning. In: International conference on machine learning. PMLR, pp 4295–4304
Son K, Kim D, Kang WJ, Hostallero DE, Yi Y (2019) Qtran: Learning to factorize with transformation for cooperative multi-agent reinforcement learning. In: International conference on machine learning. PMLR, pp 5887–5896
Samvelyan M, Rashid T, De Witt CS, Farquhar G, Nardelli N, Rudner TGJ, Hung Chia-Man, Torr PHS , Foerster J, Whiteson S (2019) The starcraft multi-agent challeng. arXiv:1902.04043
Zhao S, Grishman R (2005) Extracting relations with integrated information using kernel methods. In: Proceedings of the 43rd annual meeting of the association for computational linguistics (acl’05), pp 419–426
Wen C, Yao Xu, Wang Y, Tan X (2020) Smix (λ): Enhancing centralized value functions for cooperative multi-agent reinforcement learning. In: Proceedings of the AAAI Conference on artificial intelligence, vol 34, pp 7301–7308
Mahajan A, Rashid T, Samvelyan M, Whiteson S (2019) Maven: Multi-agent variational exploration. Advances in Neural Information Processing Systems 32
Wang J, Ren Z, Liu T, Yu Y, Zhang c (2020) Qplex: Duplex dueling multi-agent q-learning. arXiv:2008.01062
Du Y, Han L, Fang M, Lui J, Dai T, Tao D (2019) liir: Learning individual intrinsic reward in multi-agent reinforcement learning
Wang W, Yang T, Liu Y, Hao J, Hao X, Hu Y, Chen Y, Fan C, Gao Y (2019) Action semantics network: Considering the effects of actions in multiagent systems. arXiv:1907.11461
Vinyals O, Ewalds T, Bartunov S, Georgiev P, Vezhnevets AS, Yeo M, Makhzani A, Küttler H, Agapiou J , Schrittwieser J et al (2017) Starcraft ii: A new challenge for reinforcement learning. arXiv:1708.04782
Oliehoek FA, Amato C (2015) A concise introduction to decentralized pomdps
Littman ML (1994) Markov games as a framework for multi-agent reinforcement learning. In: Machine learning proceedings 1994. Elsevier, pp 157–163
Wiering MA, Van Otterlo M (2012) Reinforcement learning. Adaptation, learning, and optimization, 12(3)
Szepesvári C (2009) Synthesis lectures on artificial intelligence and machine learning. Synthesis lectures on artificial intelligence and machine learning

Download references

Acknowledgements

This work is supported by the Key laboratory fund of underwater acoustic countermeasure technology(SSDKKFJJ-2019-02-03).

Author information

Authors and Affiliations

College of Computer Science and Technology, Harbin Engineering University, Harbin, 150001, China
Hongbin Wang, Xiaodong Xie & Lianke Zhou

Authors

Hongbin Wang
View author publications
You can also search for this author inPubMed Google Scholar
Xiaodong Xie
View author publications
You can also search for this author inPubMed Google Scholar
Lianke Zhou
View author publications
You can also search for this author inPubMed Google Scholar

Corresponding author

Correspondence to Lianke Zhou.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Wang, H., Xie, X. & Zhou, L. Transform networks for cooperative multi-agent deep reinforcement learning. Appl Intell 53, 9261–9269 (2023). https://doi.org/10.1007/s10489-022-03924-3

Download citation

Accepted: 22 June 2022
Published: 08 August 2022
Issue Date: April 2023
DOI: https://doi.org/10.1007/s10489-022-03924-3

Keywords

Access this article

Log in via an institution

Subscribe and save

Springer+ Basic

$34.99 /Month

Get 10 units per month
Download Article/Chapter or eBook
1 Unit = 1 Article or 1 Chapter
Cancel anytime

Buy Now

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Transform networks for cooperative multi-agent deep reinforcement learning

Abstract

Access this article

Subscribe and save

Buy Now

Similar content being viewed by others

GHQ: grouped hybrid Q-learning for cooperative heterogeneous multi-agent reinforcement learning

MAIT: Multi-agent Local Observation Interaction to Improve the Decision-Making Ability

An Improved Weighted QMIX Based on Weight Function with Q-Value

Explore related subjects

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Additional information

Publisher’s note

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Subscribe and save

Buy Now