Abstract
Multi-agent reinforcement learning has developed rapidly in recent years, notably the Minimax Deep Deterministic Policy Gradient (M3DDPG) algorithm, which improves agent robustness and addresses the problem that agents trained by deep reinforcement learning (DRL) are often brittle and sensitive to their training environment. However, agents in real environments may be unable to perceive certain important characteristics of the environment because of their limited perceptual capabilities, so they often fail to achieve the desired results. In this paper, we propose a novel algorithm, State Representation Learning for Minimax Deep Deterministic Policy Gradient (SRL_M3DDPG), which combines M3DDPG with a state representation learning neural network model that extracts the important characteristics of the raw data. We use this state representation model to optimize the actor and critic networks: both learn from the state representation model instead of from raw observations. Simulation experiments show that the algorithm improves the final results.
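The core data flow the abstract describes can be sketched as follows. This is a minimal illustrative sketch, not the authors' actual architecture: the class names, layer sizes, and the simple one-layer encoder are all assumptions made for illustration. It shows only the key structural point, that the actor and critic consume features from a state representation module rather than raw observations.

```python
import numpy as np

rng = np.random.default_rng(0)


def relu(x):
    return np.maximum(x, 0.0)


class StateEncoder:
    """Hypothetical state-representation module: maps a raw observation
    to a compact feature vector for the actor and critic."""

    def __init__(self, obs_dim, feat_dim):
        self.W = rng.normal(0.0, 0.1, (obs_dim, feat_dim))
        self.b = np.zeros(feat_dim)

    def encode(self, obs):
        return relu(obs @ self.W + self.b)


class Actor:
    """Deterministic policy head that operates on encoded features."""

    def __init__(self, feat_dim, act_dim):
        self.W = rng.normal(0.0, 0.1, (feat_dim, act_dim))

    def act(self, features):
        return np.tanh(features @ self.W)  # bounded continuous action


class Critic:
    """Q-value head on (features, action) -- likewise fed features,
    not the raw observation."""

    def __init__(self, feat_dim, act_dim):
        self.W = rng.normal(0.0, 0.1, (feat_dim + act_dim, 1))

    def q_value(self, features, action):
        return (np.concatenate([features, action]) @ self.W).item()


obs_dim, feat_dim, act_dim = 16, 4, 2  # illustrative sizes
encoder = StateEncoder(obs_dim, feat_dim)
actor = Actor(feat_dim, act_dim)
critic = Critic(feat_dim, act_dim)

obs = rng.normal(size=obs_dim)   # raw observation from the environment
phi = encoder.encode(obs)        # compact learned representation
action = actor.act(phi)          # actor sees features, not raw obs
q = critic.q_value(phi, action)  # critic likewise
print(phi.shape, action.shape, type(q))
```

In the full algorithm the encoder would be trained with a representation-learning loss, and the actor/critic updates would follow M3DDPG's minimax objective; this sketch only fixes the wiring between the three components.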
Acknowledgments
This work was supported in part by the National Key R&D Program of China (No. 2017YFA0700604).
Copyright information
© 2019 Springer Nature Switzerland AG
Cite this paper
Hu, D., Jiang, X., Wei, X., Wang, J. (2019). State Representation Learning for Minimax Deep Deterministic Policy Gradient. In: Douligeris, C., Karagiannis, D., Apostolou, D. (eds) Knowledge Science, Engineering and Management. KSEM 2019. Lecture Notes in Computer Science(), vol 11775. Springer, Cham. https://doi.org/10.1007/978-3-030-29551-6_43
Print ISBN: 978-3-030-29550-9
Online ISBN: 978-3-030-29551-6