Experiments with Hierarchical Reinforcement Learning of Multiple Grasping Policies

Osa, Takayuki; Peters, Jan; Neumann, Gerhard

doi:10.1007/978-3-319-50115-4_15

Part of the book series: Springer Proceedings in Advanced Robotics (SPAR, volume 1)

Included in the following conference series:

International Symposium on Experimental Robotics

Abstract

Robotic grasping has attracted considerable interest, but it remains a challenging task. Data-driven approaches are a promising solution to the robotic grasping problem: they leverage a grasp dataset and generalize grasps to various objects. However, these methods often depend on the quality of the given datasets, and datasets of sufficient quality are not trivial to obtain. Although reinforcement learning approaches have recently been used to collect grasp datasets autonomously, the existing algorithms are often limited to specific grasp types. In this paper, we present a framework for hierarchical reinforcement learning of grasping policies. In our framework, the lower level of the hierarchy learns multiple grasp types, and the upper level learns a policy that selects among the learned grasp types according to a point cloud of a new object. Through experiments, we validate that our approach learns grasping by constructing the grasp dataset autonomously. The experimental results show that our approach learns multiple grasping policies and generalizes the learned grasps by using local point cloud information.
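
The two-level structure described in the abstract can be illustrated with a toy sketch. This is not the authors' algorithm; it only shows the general shape of a hierarchy in which an upper level picks a grasp type for a given context and a lower level refines the parameters of that grasp type from observed rewards. Everything specific here is an assumption made for illustration: the UCB1 selection rule, the reward-weighted averaging update, the discrete context bins standing in for point-cloud features, the single scalar grasp parameter, and the hypothetical `toy_reward` function.

```python
import math
import random


class GraspType:
    """Lower level (assumed form): 1-D Gaussian search distribution over one grasp parameter."""

    def __init__(self, mean, std):
        self.mean, self.std = mean, std

    def sample(self, rng):
        return rng.gauss(self.mean, self.std)

    def update(self, params, rewards, temperature=0.1, min_std=0.05):
        # Reward-weighted averaging: samples with higher reward get larger weights.
        best = max(rewards)
        weights = [math.exp((r - best) / temperature) for r in rewards]
        total = sum(weights)
        self.mean = sum(w * p for w, p in zip(weights, params)) / total
        var = sum(w * (p - self.mean) ** 2 for w, p in zip(weights, params)) / total
        self.std = max(math.sqrt(var), min_std)  # floor keeps some exploration


class UpperLevel:
    """Upper level (assumed form): UCB1 over grasp types, tracked per context bin."""

    def __init__(self, n_types, n_bins):
        self.counts = [[0] * n_types for _ in range(n_bins)]
        self.means = [[0.0] * n_types for _ in range(n_bins)]

    def select(self, bin_idx):
        counts = self.counts[bin_idx]
        for k, c in enumerate(counts):
            if c == 0:
                return k  # try every grasp type once before using the bound
        total = sum(counts)
        scores = [m + math.sqrt(2 * math.log(total) / c)
                  for m, c in zip(self.means[bin_idx], counts)]
        return scores.index(max(scores))

    def update(self, bin_idx, k, reward):
        self.counts[bin_idx][k] += 1
        c = self.counts[bin_idx][k]
        self.means[bin_idx][k] += (reward - self.means[bin_idx][k]) / c


def toy_reward(bin_idx, k, param):
    # Hypothetical reward: grasp type k suits context bin k; best parameter is 0.5.
    match = 1.0 if k == bin_idx else 0.3
    return match * math.exp(-(param - 0.5) ** 2 / 0.05)


def train(n_rounds=300, batch=10, seed=0):
    rng = random.Random(seed)
    types = [GraspType(0.0, 0.5), GraspType(1.0, 0.5)]
    upper = UpperLevel(n_types=2, n_bins=2)
    for _ in range(n_rounds):
        bin_idx = rng.randrange(2)           # stand-in for a point-cloud feature
        k = upper.select(bin_idx)            # upper level chooses a grasp type
        params = [types[k].sample(rng) for _ in range(batch)]
        rewards = [toy_reward(bin_idx, k, p) for p in params]
        types[k].update(params, rewards)     # lower level refines that grasp type
        upper.update(bin_idx, k, sum(rewards) / batch)
    return types, upper
```

Calling `train()` returns the learned grasp-type distributions and the upper-level statistics; `upper.counts` shows how often each grasp type was chosen in each context bin. A real system would replace the context bins with features computed from the point cloud and the toy reward with the outcome of executed grasps.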

Author information

Authors and Affiliations

Technische Universität Darmstadt, Hochschulstr. 10, 64289, Darmstadt, Germany
Takayuki Osa, Jan Peters & Gerhard Neumann

Corresponding author

Correspondence to Takayuki Osa.

Editor information

Editors and Affiliations

Graduate School of Information Science and Technology, Department of Mechano-Informatics, The University of Tokyo, Tokyo, Japan
Yoshihiko Nakamura
Computer Science Department, Stanford University, Stanford, California, USA
Oussama Khatib
Department of Electrical & Computer Engineering, University of Waterloo, Waterloo, Ontario, Canada
Dana Kulić
Department of Mechanical Systems Engineering, Tokyo University of Agriculture and Technology, Tokyo, Japan
Gentiane Venture

Cite this paper

Osa, T., Peters, J., Neumann, G. (2017). Experiments with Hierarchical Reinforcement Learning of Multiple Grasping Policies. In: Kulić, D., Nakamura, Y., Khatib, O., Venture, G. (eds) 2016 International Symposium on Experimental Robotics. ISER 2016. Springer Proceedings in Advanced Robotics, vol 1. Springer, Cham. https://doi.org/10.1007/978-3-319-50115-4_15

DOI: https://doi.org/10.1007/978-3-319-50115-4_15
Published: 21 March 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-50114-7
Online ISBN: 978-3-319-50115-4
eBook Packages: Engineering, Engineering (R0)
