
RDHNet: addressing rotational and permutational symmetries in continuous multi-agent systems

  • Research Article
  • Published:
Frontiers of Computer Science

Abstract

Symmetry is prevalent in multi-agent systems. The presence of symmetry, coupled with the misuse of absolute coordinate systems, often produces a large amount of redundant representation space, which enlarges the search space for learning policies and reduces learning efficiency. Effectively exploiting symmetry and extracting symmetry-invariant representations can markedly enhance the learning efficiency and overall performance of multi-agent systems by compressing the model’s hypothesis space and improving sample efficiency. Rotational symmetry in multi-agent reinforcement learning has received little attention in previous research and is the primary focus of this paper. To address it, we propose a rotation-invariant network architecture for continuous-action-space tasks. The architecture uses relative coordinates between agents, eliminating dependence on absolute coordinate systems, and employs a hypernetwork to strengthen the model’s fitting capability, enabling it to model MDPs with more complex dynamics. It can be used both for predicting actions and for evaluating action values/utilities. Experimental results on benchmark tasks validate the impact of rotational symmetry on multi-agent decision systems and demonstrate the effectiveness of our method. The code of RDHNet is available at github.com/wang88256187/RDHNet.
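The core idea of the abstract, replacing absolute coordinates with relative, symmetry-invariant quantities, can be illustrated with a minimal sketch. This is only an illustration of the invariance property, not the RDHNet architecture itself; the function names and the choice of pairwise distances as the invariant feature are illustrative assumptions:

```python
import numpy as np

def global_rotate(positions, theta):
    """Rotate all agent positions by a common angle theta about the origin."""
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c, -s], [s, c]])
    return positions @ R.T

def relative_invariants(positions):
    """Rotation-invariant pairwise features: inter-agent distances."""
    diff = positions[:, None, :] - positions[None, :, :]
    return np.linalg.norm(diff, axis=-1)

rng = np.random.default_rng(0)
pos = rng.normal(size=(4, 2))        # 4 agents in 2D, absolute coordinates
rot = global_rotate(pos, theta=1.3)  # the same scene, globally rotated

# Absolute coordinates change under a global rotation...
assert not np.allclose(pos, rot)
# ...but relative distances, a symmetry-invariant representation of the
# kind the paper advocates, do not.
assert np.allclose(relative_invariants(pos), relative_invariants(rot))

# Permutation symmetry: a symmetric pooling of per-agent features
# (here, sorted row sums of the distance matrix) is unchanged when the
# agents are reordered.
perm = rng.permutation(4)
summary      = np.sort(relative_invariants(pos).sum(axis=1))
summary_perm = np.sort(relative_invariants(pos[perm]).sum(axis=1))
assert np.allclose(summary, summary_perm)
```

A policy or value network fed such invariants cannot distinguish rotated or re-indexed copies of the same scene, which is exactly the redundancy reduction the abstract describes.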



Acknowledgements

The work was supported by the National Natural Science Foundation of China (Grant Nos. 91948303-1, 62106278) and the National Key R&D Program of China (Grant No. 2021ZD0140301).

Author information

Authors and Affiliations

Authors

Corresponding authors

Correspondence to Minglong Li or Teng Li.

Ethics declarations

Competing interests The authors declare that they have no competing interests or financial conflicts to disclose.

Additional information

Dongzi WANG received the MS degree from the National University of Defense Technology, China in 2020. He is currently a doctoral student in Electronic Science and Technology at the National University of Defense Technology. His research interests include machine learning, reinforcement learning, and multi-agent reinforcement learning. His research focuses on improving the learning efficiency of multi-agent systems by enhancing knowledge-sharing mechanisms.

Lilan HUANG received the BS degree in Atmospheric Science from Sun Yat-sen University, China in 2018 and the MS degree in Computer Science and Technology from the National University of Defense Technology, China in 2021. She is currently a doctoral student in Software Engineering at the National University of Defense Technology. Since 2018, she has been working on data assimilation using machine learning, applying deep learning and reinforcement learning to enable data assimilation, improve the performance of hybrid data assimilation methods, and advance numerical weather forecasting.

Muning WEN is currently a third-year PhD student at Shanghai Jiao Tong University, China, under the supervision of Professor Weinan Zhang. He has extensive theoretical and practical experience in multi-agent systems. His research interests focus on reinforcement learning, multi-agent reinforcement learning, and reinforcement learning for LLM agents. He has published over twenty papers at prestigious conferences such as NeurIPS, ICML, and ICLR, and has served as a reviewer for these conferences since 2023.

Yuanxi PENG received the BS degree in computer science from Sichuan University, China in 1988, and the MS and PhD degrees in computer science from the National University of Defense Technology (NUDT), China in 1998 and 2001, respectively. He has been a professor at the College of Computer Science and Technology, NUDT since 2011. His research interests are in the areas of hyperspectral image processing, high-performance computing, multi- and many-core architectures, on-chip networks, cache coherence protocols, and architectural support for parallel programming.

Minglong LI received the MS degree in computer science in 2017 and the PhD degree in software engineering in 2020, both from the National University of Defense Technology (NUDT), China. He is an assistant research fellow in the College of Computer Science and Technology, NUDT. His research interests are in the areas of artificial intelligence (AI), robotics, and multi-agent systems.

Teng LI received the BS, MS, and PhD degrees from the National University of Defense Technology, China in 2013, 2015, and 2020, respectively. His research interests include machine learning, hyperspectral image processing, and representation learning. In recent years, he has also conducted research on integrating multi-agent reinforcement learning with graph neural networks.

Electronic Supplementary Material


About this article


Cite this article

Wang, D., Huang, L., Wen, M. et al. RDHNet: addressing rotational and permutational symmetries in continuous multi-agent systems. Front. Comput. Sci. 19, 1911365 (2025). https://doi.org/10.1007/s11704-025-41250-2
