Robust and High-Precision End-to-End Control Policy for Multi-stage Manipulation Task with Behavioral Cloning

  • Conference paper
Cognitive Systems and Signal Processing (ICCSIP 2018)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1005)

Abstract

In this paper, we propose a multi-stage task learning method that trains an end-to-end policy to control a 6-DoF robot arm to perform pick-and-place operations with high precision. The policy is composed mainly of CNNs and LSTMs and maps raw images and joint angles directly to velocity commands. To obtain a robust, high-precision policy, we introduce several techniques that boost performance. Augmentation trajectories are designed to alleviate the compounding-error problem, and dataset resampling is used to address data imbalance. Moreover, a Huber loss on the auxiliary outputs is shown to be very effective in multi-objective optimization problems, especially in robot learning, where low sample complexity is desirable. To verify the effectiveness of our method, experiments are carried out in the Gazebo simulator with a UR5 arm and a Kinect v1 camera. Our visuomotor policy achieves a success rate of \(87\%\) on the pick-and-place task. The experimental results demonstrate that, with the techniques described, behavioral cloning can effectively learn good visuomotor policies for long-horizon tasks.
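As a rough illustration of two techniques named in the abstract, the sketch below shows the standard Huber loss and one common form of dataset resampling (inverse-frequency sample weights). It is not taken from the paper: the function names, the threshold `delta`, the stage labels, and the choice of inverse-frequency weighting are all assumptions made here for illustration, and the authors' actual formulation may differ.

```python
from collections import Counter


def huber_loss(residual: float, delta: float = 1.0) -> float:
    """Standard Huber loss: quadratic for |residual| <= delta, linear beyond it.

    `delta` is a hypothetical default; the abstract does not state a threshold.
    """
    abs_r = abs(residual)
    if abs_r <= delta:
        return 0.5 * residual * residual
    return delta * (abs_r - 0.5 * delta)


def mean_huber(predictions, targets, delta: float = 1.0) -> float:
    """Average Huber loss over paired predictions and targets."""
    if len(predictions) != len(targets) or not predictions:
        raise ValueError("predictions and targets must be non-empty and equal in length")
    total = sum(huber_loss(p - t, delta) for p, t in zip(predictions, targets))
    return total / len(predictions)


def inverse_frequency_weights(labels):
    """Per-sample sampling weights that give every class the same total weight.

    This is an assumed resampling scheme, not necessarily the one in the paper.
    The returned weights sum to 1.
    """
    counts = Counter(labels)
    num_classes = len(counts)
    return [1.0 / (num_classes * counts[label]) for label in labels]


if __name__ == "__main__":
    # Small residuals are penalised quadratically, large ones only linearly.
    print(huber_loss(0.5), huber_loss(3.0))
    # Hypothetical stage labels: the rare "grasp" sample gets a larger weight.
    print(inverse_frequency_weights(["move", "move", "move", "grasp"]))
```

Because the Huber loss grows only linearly for large residuals, a few badly predicted auxiliary targets contribute bounded gradients, which is one plausible reason it would suit a multi-objective setting. The weights from `inverse_frequency_weights` could be passed to any weighted sampler, such as `random.choices(samples, weights=...)`.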

Acknowledgment

This work was supported by the National Natural Science Foundation of China under Grant Nos. 51675501 and 51275500, the State Key Laboratory of Robotics and System under Grant No. SKLRS-2018-KF-07, and the Youth Innovation Promotion Association CAS under Grant No. 2012321. The authors would like to thank the Information Science Laboratory Center of USTC for hardware and software services.

Author information

Corresponding author

Correspondence to Weiwei Shang.

Copyright information

© 2019 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Ge, W., Shang, W., Song, F., Sui, H., Cong, S. (2019). Robust and High-Precision End-to-End Control Policy for Multi-stage Manipulation Task with Behavioral Cloning. In: Sun, F., Liu, H., Hu, D. (eds) Cognitive Systems and Signal Processing. ICCSIP 2018. Communications in Computer and Information Science, vol 1005. Springer, Singapore. https://doi.org/10.1007/978-981-13-7983-3_42

  • DOI: https://doi.org/10.1007/978-981-13-7983-3_42

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-13-7982-6

  • Online ISBN: 978-981-13-7983-3

  • eBook Packages: Computer Science, Computer Science (R0)
