Abstract
Symmetry is prevalent in multi-agent systems. The presence of symmetry, coupled with reliance on absolute coordinate systems, often produces a large amount of redundant representation space, significantly enlarging the search space for policy learning and reducing learning efficiency. Effectively exploiting symmetry and extracting symmetry-invariant representations can compress the model's hypothesis space and improve sample efficiency, thereby significantly enhancing the learning efficiency and overall performance of multi-agent systems. Rotational symmetry in multi-agent reinforcement learning has received little attention in previous research and is the primary focus of this paper. To address it, we propose a rotation-invariant network architecture for continuous action space tasks. The architecture uses relative coordinates between agents, eliminating dependence on an absolute coordinate system, and employs a hypernetwork to strengthen the model's fitting capability, enabling it to model MDPs with more complex dynamics. It can be used both for predicting actions and for evaluating action values/utilities. Experimental results on benchmark tasks validate the impact of rotational symmetry on multi-agent decision systems and demonstrate the effectiveness of our method. The code for RDHNet is available at github.com/wang88256187/RDHNet.
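The relative-coordinate idea in the abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's RDHNet formulation: the 2-D setting, the function name, and the nearest-neighbour anchoring convention are all assumptions introduced here for clarity. The point is only that features built from inter-agent distances and an agent-centric frame are unchanged when all absolute positions are rotated together.

```python
import numpy as np

def rotation_invariant_features(positions, ego_idx):
    """Illustrative sketch (not the paper's exact architecture): build
    rotation- and translation-invariant features for one agent from the
    absolute 2-D positions of all agents."""
    # Relative coordinates remove dependence on the absolute origin.
    rel = np.delete(positions, ego_idx, axis=0) - positions[ego_idx]
    # Pairwise distances are invariant to any global rotation.
    dists = np.linalg.norm(rel, axis=1)
    # Express directions in an ego-centric frame anchored at the nearest
    # neighbour (an assumed convention), so a global rotation of all
    # agents leaves the features unchanged.
    anchor = rel[np.argmin(dists)]
    theta = np.arctan2(anchor[1], anchor[0])
    c, s = np.cos(-theta), np.sin(-theta)
    R = np.array([[c, -s], [s, c]])
    rel_local = rel @ R.T
    return np.concatenate([dists, rel_local.ravel()])
```

Rotating every absolute position by the same angle rotates `rel` and the anchor together, so the recovered ego-centric coordinates, and hence the feature vector, are identical before and after the rotation.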
Acknowledgements
This work was supported by the National Natural Science Foundation of China (Grant Nos. 91948303-1, 62106278) and the National Key R&D Program of China (Grant No. 2021ZD0140301).
Author information
Ethics declarations
Competing interests The authors declare that they have no competing interests or financial conflicts to disclose.
Additional information
Dongzi WANG received the MS degree from the National University of Defense Technology, China in 2020. He is currently a doctoral student in Electronic Science and Technology at the National University of Defense Technology. His research interests include machine learning, reinforcement learning, and multi-agent reinforcement learning. His research focuses on improving the learning efficiency of multi-agent systems by enhancing knowledge-sharing mechanisms.
Lilan HUANG received the BS degree in Atmospheric Science from Sun Yat-sen University, China in 2018 and the MS degree in Computer Science and Technology from the National University of Defense Technology in 2021. She is currently a doctoral student in Software Engineering at the National University of Defense Technology, China. Since 2018, she has worked on data assimilation using machine learning, applying deep learning and reinforcement learning to improve the performance of hybrid data assimilation methods and advance numerical weather forecasting.
Muning WEN is currently a third-year PhD student at Shanghai Jiao Tong University, China under the supervision of Professor Weinan Zhang. He has extensive theoretical and practical experience in multi-agent systems. His research interests mainly focus on reinforcement learning, multi-agent reinforcement learning, and reinforcement learning for LLM agents. He has published over twenty papers at prestigious conferences such as NeurIPS, ICML, and ICLR, and has actively served as a reviewer for these conferences since 2023.
Yuanxi PENG received the BS degree in computer science from Sichuan University, China in 1988, and the MS and PhD degrees in computer science from the National University of Defense Technology (NUDT), China in 1998 and 2001, respectively. He has been a professor at the College of Computer Science and Technology, NUDT since 2011. His research interests are in the areas of hyperspectral image processing, high-performance computing, multi- and many-core architectures, on-chip networks, cache coherence protocols, and architectural support for parallel programming.
Minglong LI received the MS degree in computer science and the PhD degree in software engineering from the National University of Defense Technology (NUDT), China in 2017 and 2020, respectively. He is an assistant research fellow at the College of Computer Science and Technology, NUDT. His research interests are in the areas of artificial intelligence (AI), robotics, and multi-agent systems.
Teng LI received the BS, MS, and PhD degrees from the National University of Defense Technology, China in 2013, 2015, and 2020, respectively. His research interests include machine learning, hyperspectral image processing, and representation learning. In recent years, he has also conducted research on integrating multi-agent reinforcement learning with graph neural networks.
Cite this article
Wang, D., Huang, L., Wen, M. et al. RDHNet: addressing rotational and permutational symmetries in continuous multi-agent systems. Front. Comput. Sci. 19, 1911365 (2025). https://doi.org/10.1007/s11704-025-41250-2