A Guided Evaluation Method for Robot Dynamic Manipulation

  • Conference paper

Intelligent Robotics and Applications (ICIRA 2020)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12595)

Included in the conference series: ICIRA: International Conference on Intelligent Robotics and Applications

Abstract

It is challenging for reinforcement learning (RL) to solve dynamic-goal robot tasks in sparse-reward settings. Dynamic Hindsight Experience Replay (DHER) addresses such problems, but the policy it learns is prone to degradation and its success rate is low, especially in complex environments. To help the agent learn purposefully in dynamic-goal tasks, avoid blind exploration, and improve the stability and robustness of the policy, we propose a guided evaluation method named GEDHER, which guides the agent's learning with evaluated expert demonstrations on top of DHER. In addition, we add Gaussian noise to action sampling to balance exploration and exploitation, preventing the policy from falling into a local optimum. Experimental results show that our method outperforms the original DHER in both stability and success rate.
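The abstract points to two concrete mechanisms: guiding a DDPG/DHER-style actor-critic with an auxiliary imitation term on evaluated expert demonstrations, and adding Gaussian noise at action-sampling time to balance exploration and exploitation. The following is a minimal sketch of both ideas; it is not the authors' implementation, and every name in it (`actor`, `critic`, `sample_action`, `actor_loss`, `bc_weight`, `noise_std`, the network sizes) is an illustrative assumption.

```python
# Minimal PyTorch sketch (not the authors' code) of the two mechanisms the
# abstract describes: (1) a behavior-cloning term on evaluated expert
# demonstrations layered on a DDPG/DHER-style actor-critic, and
# (2) Gaussian noise added when sampling actions.
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F

OBS_DIM, ACT_DIM = 10, 4  # hypothetical dimensions

# Illustrative networks; the paper's architectures are not reproduced here.
actor = nn.Sequential(nn.Linear(OBS_DIM, 64), nn.ReLU(),
                      nn.Linear(64, ACT_DIM), nn.Tanh())
critic_net = nn.Sequential(nn.Linear(OBS_DIM + ACT_DIM, 64), nn.ReLU(),
                           nn.Linear(64, 1))

def critic(obs, act):
    # Q(s, a): score an observation-action pair.
    return critic_net(torch.cat([obs, act], dim=-1))

def sample_action(obs, noise_std=0.1):
    """Exploitation (deterministic actor output) plus exploration
    (additive Gaussian noise), clipped to the valid action range."""
    with torch.no_grad():
        act = actor(torch.as_tensor(obs, dtype=torch.float32)).numpy()
    act = act + np.random.normal(0.0, noise_std, size=act.shape)
    return np.clip(act, -1.0, 1.0)

def actor_loss(batch, demo_batch, bc_weight=1.0):
    """DDPG actor objective plus a behavior-cloning term computed only on
    demonstrations that passed an evaluation filter, so learning is guided
    by vetted expert behavior rather than blind exploration."""
    policy_loss = -critic(batch["obs"], actor(batch["obs"])).mean()
    bc_loss = F.mse_loss(actor(demo_batch["obs"]), demo_batch["act"])
    return policy_loss + bc_weight * bc_loss

# Toy usage with random data, just to show the shapes involved.
batch = {"obs": torch.randn(32, OBS_DIM)}
demo = {"obs": torch.randn(8, OBS_DIM),
        "act": torch.randn(8, ACT_DIM).clamp(-1, 1)}
actor_loss(batch, demo).backward()
noisy_action = sample_action(np.zeros(OBS_DIM, dtype=np.float32))
```

How demonstrations are scored before entering the demonstration buffer, and how `bc_weight` and `noise_std` are set, are defined in the paper itself; the sketch only fixes the overall shape of the update.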


References

  1. Lillicrap, T.P., et al.: Continuous control with deep reinforcement learning. arXiv preprint arXiv:1509.02971 (2015)

  2. Nair, A., et al.: Overcoming exploration in reinforcement learning with demonstrations. In: ICRA (2018)

  3. Vecerik, M., et al.: Leveraging demonstrations for deep reinforcement learning on robotics problems with sparse rewards. arXiv preprint arXiv:1707.08817 (2017)

  4. Wang, Y., et al.: An experienced-based policy gradient method for smooth manipulation. In: IEEE-CYBER (2019)

  5. Fang, M., et al.: DHER: hindsight experience replay for dynamic goals. In: ICLR (2019)

  6. Mnih, V., Kavukcuoglu, K., Silver, D., et al.: Human-level control through deep reinforcement learning. Nature 518(7540), 529 (2015)

  7. Thie, P.R.: Markov Decision Processes. COMAP, Inc. (1983)

  8. Gao, Y., et al.: Reinforcement learning from imperfect demonstrations. In: International Conference on Machine Learning, Stockholm, Sweden. PMLR, vol. 80 (2018)

  9. Mnih, V., et al.: Asynchronous methods for deep reinforcement learning. In: ICML (2016)

  10. Ratliff, N., Bagnell, J.A., Srinivasa, S.S.: Imitation learning for locomotion and manipulation. In: 7th IEEE-RAS International Conference on Humanoid Robots (2007)

  11. Todorov, E., Erez, T., Tassa, Y.: MuJoCo: a physics engine for model-based control. In: IEEE/RSJ International Conference on Intelligent Robots and Systems (2012)

  12. Popov, I., et al.: Data-efficient deep reinforcement learning for dexterous manipulation. arXiv preprint arXiv:1704.03073 (2017)

  13. Gu, S., et al.: Deep reinforcement learning for robotic manipulation with asynchronous off-policy updates. arXiv preprint arXiv:1610.00633 (2016)

  14. Andrychowicz, M., et al.: Hindsight experience replay. In: Advances in Neural Information Processing Systems, pp. 5048–5058 (2017)

  15. Bakker, B., Schmidhuber, J.: Hierarchical reinforcement learning based on subgoal discovery and subpolicy specialization. In: Proceedings of the 8th Conference on Intelligent Autonomous Systems, pp. 438–445 (2004)

  16. Hester, T., et al.: Learning from demonstrations for real world reinforcement learning. arXiv preprint arXiv:1704.03732 (2017)

  17. Xu, K., Liu, H., Shen, H., Yang, T.: Structure design and kinematic analysis of a partially-decoupled 3T1R parallel manipulator. In: Yu, H., Liu, J., Liu, L., Ju, Z., Liu, Y., Zhou, D. (eds.) ICIRA 2019. LNCS (LNAI), vol. 11742, pp. 415–424. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-27535-8_37

  18. Heess, N., et al.: Learning continuous control policies by stochastic value gradients. In: Proceedings of the International Conference on Neural Information Processing Systems, pp. 2944–2952 (2015)


Funding

This work was supported in part by the Trico-Robot Plan of NSFC under grant No. 91748208, the National Major Project under grant No. 2018ZX01028-101, the Shaanxi Project under grant No. 2018ZDCXLGY0607, NSFC grant No. 61973246, and the program of the Ministry of Education.

Author information

Corresponding authors

Correspondence to Xuguang Lan, Lipeng Wan, Zhuo Liang or Haoyu Wang.


Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Feng, C., Lan, X., Wan, L., Liang, Z., Wang, H. (2020). A Guided Evaluation Method for Robot Dynamic Manipulation. In: Chan, C.S., et al. (eds.) Intelligent Robotics and Applications. ICIRA 2020. Lecture Notes in Computer Science (LNAI), vol. 12595. Springer, Cham. https://doi.org/10.1007/978-3-030-66645-3_14

  • DOI: https://doi.org/10.1007/978-3-030-66645-3_14

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-66644-6

  • Online ISBN: 978-3-030-66645-3

  • eBook Packages: Computer Science; Computer Science (R0)
