State Representation Learning for Minimax Deep Deterministic Policy Gradient

  • Conference paper
Knowledge Science, Engineering and Management (KSEM 2019)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11775)

Abstract

Multi-agent reinforcement learning has developed rapidly in recent years. In particular, the Minimax Deep Deterministic Policy Gradient (M3DDPG) algorithm improves agent robustness and addresses the problem that agents trained by deep reinforcement learning (DRL) are often vulnerable and sensitive to their training environment. However, agents in real environments may be unable to perceive certain important characteristics of the environment because of their limited perceptual capabilities, and therefore often fail to achieve the desired results. In this paper, we propose a novel algorithm, State Representation Learning for Minimax Deep Deterministic Policy Gradient (SRL_M3DDPG), which combines M3DDPG with a state representation learning neural network model that extracts the important characteristics of the raw data. We optimize the actor and critic networks with this state representation model, so that they learn from its output instead of from the raw observations. Simulation experiments show that the algorithm improves the final result.
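The core architectural idea in the abstract, feeding the actor and critic a learned compact representation rather than the raw observation, can be illustrated with a minimal sketch. The dimensions, weight shapes, and single-layer encoder below are hypothetical placeholders, not the paper's actual network design:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: raw observation, learned features, action.
OBS_DIM, FEAT_DIM, ACT_DIM = 16, 4, 2

# Encoder: the state representation model that compresses raw observations.
W_enc = rng.normal(scale=0.1, size=(OBS_DIM, FEAT_DIM))

# Actor and critic consume the learned features, not the raw observation.
W_actor = rng.normal(scale=0.1, size=(FEAT_DIM, ACT_DIM))
W_critic = rng.normal(scale=0.1, size=(FEAT_DIM + ACT_DIM, 1))

def encode(obs):
    """Map a raw observation to a compact state representation."""
    return np.tanh(obs @ W_enc)

def actor(obs):
    """Deterministic policy acting on the encoded state."""
    return np.tanh(encode(obs) @ W_actor)

def critic(obs, act):
    """Q-value estimated from the encoded state and the action."""
    return np.concatenate([encode(obs), act]) @ W_critic

obs = rng.normal(size=OBS_DIM)
a = actor(obs)
q = critic(obs, a)
```

In the full algorithm the encoder would be trained with a state representation learning objective and the actor/critic updated with M3DDPG's minimax gradients; this sketch only shows the data flow that replaces raw observations with learned features.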

Acknowledgments

This work was supported in part by National Key R&D Program of China (No. 2017YFA0700604).

Author information

Corresponding author

Correspondence to Xuesong Jiang.


Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Hu, D., Jiang, X., Wei, X., Wang, J. (2019). State Representation Learning for Minimax Deep Deterministic Policy Gradient. In: Douligeris, C., Karagiannis, D., Apostolou, D. (eds) Knowledge Science, Engineering and Management. KSEM 2019. Lecture Notes in Computer Science, vol 11775. Springer, Cham. https://doi.org/10.1007/978-3-030-29551-6_43

  • DOI: https://doi.org/10.1007/978-3-030-29551-6_43

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-29550-9

  • Online ISBN: 978-3-030-29551-6

  • eBook Packages: Computer Science (R0)
