Robust and High-Precision End-to-End Control Policy for Multi-stage Manipulation Task with Behavioral Cloning

Ge, Wei; Shang, Weiwei; Song, Fangjing; Sui, Hongjian; Cong, Shuang

doi:10.1007/978-981-13-7983-3_42

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1005)

Included in the following conference series:

International Conference on Cognitive Systems and Signal Processing

Abstract

In this paper, we propose a multi-stage task learning method that trains an end-to-end policy to control a 6-DoF robot arm to accomplish a pick-and-place operation with high precision. The policy is mainly composed of CNNs and LSTMs, and directly maps raw images and joint angles to velocity commands. To obtain a robust, high-precision policy, we introduce several techniques to boost performance. Augmentation trajectories are designed to alleviate the compounding-error problem, and dataset resampling is used to address the imbalanced-data issue. Moreover, a Huber loss on the auxiliary outputs is shown to be very effective in multi-objective optimization problems, especially in robot learning, where reducing sample complexity is desirable. To verify the effectiveness of our method, experiments are carried out in the Gazebo simulator with a UR5 arm and a Kinect v1 camera. Our visuomotor policy achieves a success rate of 87% on the pick-and-place task. The results demonstrate that, with the techniques described here, behavioral cloning can effectively learn good visuomotor policies for long-horizon tasks.
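For readers unfamiliar with the Huber loss mentioned in the abstract, the following is a minimal sketch of its standard definition: quadratic for small residuals and linear for large ones, which makes it less sensitive to outliers than a squared error. The function names, the default threshold `delta = 1.0`, and the averaging over samples are illustrative assumptions; the abstract does not state which threshold or weighting the authors used for the auxiliary outputs.

```python
from typing import Sequence


def huber_loss(residual: float, delta: float = 1.0) -> float:
    """Standard Huber loss for one residual (prediction minus target).

    delta is a hypothetical threshold chosen for illustration only.
    """
    abs_r = abs(residual)
    if abs_r <= delta:
        # Quadratic region: behaves like half the squared error.
        return 0.5 * residual * residual
    # Linear region: grows like the absolute error, offset for continuity.
    return delta * (abs_r - 0.5 * delta)


def mean_huber_loss(
    predictions: Sequence[float],
    targets: Sequence[float],
    delta: float = 1.0,
) -> float:
    """Average Huber loss over paired predictions and targets."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if not predictions:
        raise ValueError("at least one sample is required")
    total = sum(huber_loss(p - t, delta) for p, t in zip(predictions, targets))
    return total / len(predictions)
```

With `delta = 1.0`, a residual of 0.5 falls in the quadratic region and a residual of 3.0 falls in the linear region, so a single large auxiliary-output error contributes far less than it would under a squared loss.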

Acknowledgment

This work was supported by the National Natural Science Foundation of China under Grant Nos. 51675501 and 51275500, the State Key Laboratory of Robotics and System under Grant No. SKLRS-2018-KF-07, and the Youth Innovation Promotion Association CAS under Grant No. 2012321. The authors would like to thank the Information Science Laboratory Center of USTC for the hardware and software services.

Author information

Authors and Affiliations

Department of Automation, University of Science and Technology of China, Hefei, China
Wei Ge, Weiwei Shang, Fangjing Song, Hongjian Sui & Shuang Cong

Corresponding author

Correspondence to Weiwei Shang.

Editor information

Editors and Affiliations

Department of Computer Science and Technology, Tsinghua University, Beijing, China
Fuchun Sun
Department of Computer Science and Technology, Tsinghua University, Beijing, China
Huaping Liu
College of Mechatronics and Automation, National University of Defense Technology, Changsha, Hunan, China
Dewen Hu

About this paper

Cite this paper

Ge, W., Shang, W., Song, F., Sui, H., Cong, S. (2019). Robust and High-Precision End-to-End Control Policy for Multi-stage Manipulation Task with Behavioral Cloning. In: Sun, F., Liu, H., Hu, D. (eds) Cognitive Systems and Signal Processing. ICCSIP 2018. Communications in Computer and Information Science, vol 1005. Springer, Singapore. https://doi.org/10.1007/978-981-13-7983-3_42

DOI: https://doi.org/10.1007/978-981-13-7983-3_42
Published: 28 April 2019
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-7982-6
Online ISBN: 978-981-13-7983-3
eBook Packages: Computer Science, Computer Science (R0)
