Abstract
Multi-agent reinforcement learning has developed rapidly in recent years, notably the Minimax Deep Deterministic Policy Gradient (M3DDPG) algorithm, which improves agent robustness and addresses the problem that agents trained by deep reinforcement learning (DRL) are often brittle and sensitive to their training environment. However, agents in real environments may be unable to perceive certain important characteristics of the environment because of their limited perceptual capabilities, so they often fail to achieve the desired results. In this paper, we propose a novel algorithm, State Representation Learning for Minimax Deep Deterministic Policy Gradient (SRL_M3DDPG), which combines M3DDPG with a state representation learning neural network model that extracts the important characteristics of the raw data. We use this state representation model to optimize the actor and critic networks: both learn from the state representation model instead of from raw observations. Simulation experiments show that the algorithm improves the final results.
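The core data flow the abstract describes can be sketched as follows. This is a minimal illustrative sketch, not the authors' actual architecture: the class names, layer sizes, and the simple one-layer encoder are all assumptions made for illustration. It shows only the key structural point, that the actor and critic consume features from a state representation module rather than raw observations.

```python
import numpy as np

rng = np.random.default_rng(0)


def relu(x):
    return np.maximum(x, 0.0)


class StateEncoder:
    """Hypothetical state-representation module: maps a raw observation
    to a compact feature vector for the actor and critic."""

    def __init__(self, obs_dim, feat_dim):
        self.W = rng.normal(0.0, 0.1, (obs_dim, feat_dim))
        self.b = np.zeros(feat_dim)

    def encode(self, obs):
        return relu(obs @ self.W + self.b)


class Actor:
    """Deterministic policy head that operates on encoded features."""

    def __init__(self, feat_dim, act_dim):
        self.W = rng.normal(0.0, 0.1, (feat_dim, act_dim))

    def act(self, features):
        return np.tanh(features @ self.W)  # bounded continuous action


class Critic:
    """Q-value head on (features, action) -- likewise fed features,
    not the raw observation."""

    def __init__(self, feat_dim, act_dim):
        self.W = rng.normal(0.0, 0.1, (feat_dim + act_dim, 1))

    def q_value(self, features, action):
        return (np.concatenate([features, action]) @ self.W).item()


obs_dim, feat_dim, act_dim = 16, 4, 2  # illustrative sizes
encoder = StateEncoder(obs_dim, feat_dim)
actor = Actor(feat_dim, act_dim)
critic = Critic(feat_dim, act_dim)

obs = rng.normal(size=obs_dim)   # raw observation from the environment
phi = encoder.encode(obs)        # compact learned representation
action = actor.act(phi)          # actor sees features, not raw obs
q = critic.q_value(phi, action)  # critic likewise
print(phi.shape, action.shape, type(q))
```

In the full algorithm the encoder would be trained with a representation-learning loss, and the actor/critic updates would follow M3DDPG's minimax objective; this sketch only fixes the wiring between the three components.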
Acknowledgments
This work was supported in part by the National Key R&D Program of China (No. 2017YFA0700604).
Copyright information
© 2019 Springer Nature Switzerland AG
Cite this paper
Hu, D., Jiang, X., Wei, X., Wang, J. (2019). State Representation Learning for Minimax Deep Deterministic Policy Gradient. In: Douligeris, C., Karagiannis, D., Apostolou, D. (eds) Knowledge Science, Engineering and Management. KSEM 2019. Lecture Notes in Computer Science(), vol 11775. Springer, Cham. https://doi.org/10.1007/978-3-030-29551-6_43
Print ISBN: 978-3-030-29550-9
Online ISBN: 978-3-030-29551-6