
MADDPGViz: a visual analytics approach to understand multi-agent deep reinforcement learning

  • Regular Paper
  • Published in: Journal of Visualization

Abstract

Deep reinforcement learning (DRL), in which control policies are trained through deep neural networks, has recently received widespread attention. Several visual analytics methods have been proposed to reveal the internal mechanisms of DRL, but most of them focus on algorithms with a single agent. Because the environment is inherently non-stationary and the interactions among multiple agents are complex, understanding the learning process of multi-agent deep reinforcement learning is considerably more challenging. This paper presents MADDPGViz, a visual analytics system that exposes the training process of a multi-agent deep deterministic policy gradient (MADDPG) model from multiple perspectives. MADDPGViz lets users overview the model's training statistics and compare the large experience space of multiple agents across training episodes, revealing how the agents' centralized critics function. Users can further examine the dynamic interactions among agents and environmental objects within a selected episode to discover the cooperative strategies the agents have learned at a microscopic level. Case studies in two cooperative environments demonstrate the usability of the system.
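To ground the architecture the abstract refers to: MADDPG (Lowe et al. 2017) follows a centralized-training, decentralized-execution scheme, where each agent's critic is trained on the joint observations and actions of all agents, while its actor acts from its own observation alone. The PyTorch sketch below illustrates this split; it is a minimal illustrative example with assumed dimensions and class names (CentralizedCritic, DecentralizedActor), not the authors' implementation.

```python
import torch
import torch.nn as nn


class CentralizedCritic(nn.Module):
    """Q_i(o_1..o_N, a_1..a_N): scores the joint observations and actions of all agents."""

    def __init__(self, obs_dim: int, act_dim: int, n_agents: int, hidden: int = 64):
        super().__init__()
        joint_dim = n_agents * (obs_dim + act_dim)
        self.net = nn.Sequential(
            nn.Linear(joint_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),  # scalar Q-value for the joint state-action
        )

    def forward(self, joint_obs: torch.Tensor, joint_act: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([joint_obs, joint_act], dim=-1))


class DecentralizedActor(nn.Module):
    """pi_i(o_i): at execution time each agent acts from its own observation only."""

    def __init__(self, obs_dim: int, act_dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, act_dim), nn.Tanh(),  # continuous actions in [-1, 1]
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.net(obs)


# Smoke test: 3 agents, batched joint Q-values (all sizes are illustrative).
n_agents, obs_dim, act_dim, batch = 3, 8, 2, 16
critic = CentralizedCritic(obs_dim, act_dim, n_agents)
actors = [DecentralizedActor(obs_dim, act_dim) for _ in range(n_agents)]
obs = torch.randn(batch, n_agents, obs_dim)
acts = torch.stack([actors[i](obs[:, i]) for i in range(n_agents)], dim=1)
q = critic(obs.flatten(1), acts.flatten(1))  # -> shape (batch, 1)
```

Because each critic scores joint experience, comparing the agents' shared experience space, as MADDPGViz does, is a natural way to probe what these centralized critics have learned.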


Acknowledgements

This work was supported by the National Natural Science Foundation of China (No. 61903109) and the Zhejiang Provincial Natural Science Foundation of China (No. LTGS23F030004).

Author information

Corresponding author

Correspondence to Dewen Seng.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Shi, X., Zhang, J., Liang, Z. et al. MADDPGViz: a visual analytics approach to understand multi-agent deep reinforcement learning. J Vis 26, 1189–1205 (2023). https://doi.org/10.1007/s12650-023-00928-0
