
MADDPGViz: a visual analytics approach to understand multi-agent deep reinforcement learning

  • Regular Paper
  • Published in: Journal of Visualization

Abstract

Deep reinforcement learning (DRL), in which control policies are trained through deep neural networks, has recently received widespread attention. Several visual analytics methods have been proposed to reveal the internal mechanisms of DRL, but most of them focus on algorithms with a single agent. Because the environment is inherently non-stationary and the interactions among multiple agents are complex, understanding the learning process of multi-agent deep reinforcement learning is considerably more challenging. This paper presents MADDPGViz, a visual analytics system that exposes the training process of a multi-agent deep deterministic policy gradient (MADDPG) model from multiple perspectives. MADDPGViz lets users overview the model's training statistics and compare the large experience space of multiple agents across training episodes, revealing how the agents' centralized critics function. Users can further examine the dynamic interactions among agents and environmental objects within a selected episode to discover the cooperative strategies the agents have learned at a microscopic level. Case studies in two cooperative environments demonstrate the usability of the system.
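To ground the architecture the abstract refers to: MADDPG (Lowe et al. 2017) follows a centralized-training, decentralized-execution scheme, where each agent's critic is trained on the joint observations and actions of all agents, while its actor acts from its own observation alone. The PyTorch sketch below illustrates this split; it is a minimal illustrative example with assumed dimensions and class names (CentralizedCritic, DecentralizedActor), not the authors' implementation.

```python
import torch
import torch.nn as nn


class CentralizedCritic(nn.Module):
    """Q_i(o_1..o_N, a_1..a_N): scores the joint observations and actions of all agents."""

    def __init__(self, obs_dim: int, act_dim: int, n_agents: int, hidden: int = 64):
        super().__init__()
        joint_dim = n_agents * (obs_dim + act_dim)
        self.net = nn.Sequential(
            nn.Linear(joint_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),  # scalar Q-value for the joint state-action
        )

    def forward(self, joint_obs: torch.Tensor, joint_act: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([joint_obs, joint_act], dim=-1))


class DecentralizedActor(nn.Module):
    """pi_i(o_i): at execution time each agent acts from its own observation only."""

    def __init__(self, obs_dim: int, act_dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, act_dim), nn.Tanh(),  # continuous actions in [-1, 1]
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.net(obs)


# Smoke test: 3 agents, batched joint Q-values (all sizes are illustrative).
n_agents, obs_dim, act_dim, batch = 3, 8, 2, 16
critic = CentralizedCritic(obs_dim, act_dim, n_agents)
actors = [DecentralizedActor(obs_dim, act_dim) for _ in range(n_agents)]
obs = torch.randn(batch, n_agents, obs_dim)
acts = torch.stack([actors[i](obs[:, i]) for i in range(n_agents)], dim=1)
q = critic(obs.flatten(1), acts.flatten(1))  # -> shape (batch, 1)
```

Because each critic scores joint experience, comparing the agents' shared experience space, as MADDPGViz does, is a natural way to probe what these centralized critics have learned.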


Acknowledgements

This work was supported by the National Natural Science Foundation of China (No. 61903109) and the Zhejiang Provincial Natural Science Foundation of China (No. LTGS23F030004).

Author information

Corresponding author

Correspondence to Dewen Seng.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Shi, X., Zhang, J., Liang, Z. et al. MADDPGViz: a visual analytics approach to understand multi-agent deep reinforcement learning. J Vis 26, 1189–1205 (2023). https://doi.org/10.1007/s12650-023-00928-0
