
Variational Augmented the Heuristic Funnel-Transitions Model for Dexterous Robot Manipulation

  • Conference paper
Intelligent Robotics and Applications (ICIRA 2020)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12595)


Abstract

Learning from demonstrations is a heuristic technique that captures only the intended dynamics of robot manipulation, so the learned behavior may fail when the task encounters unexpected anomalies. In this paper, we present a method that uses Variational Auto-encoders (VAEs) to enrich the diversity of the multimodal signals collected from few-shot demonstrations, providing enough observations to cluster the many funnel representations of a complex, multi-step task with anomalies. Funnel-based reinforcement learning is then applied to obtain a policy from the resulting synthetic funnel-transition model. Experimental verification is based on an open-source force/torque dataset and on our previous kitting experiment setup, which is equipped with a well-constructed framework for multimodal signal collection, anomaly detection, and anomaly classification. Compared with a baseline of traditional funnel policy learning (without augmented signals), combining the VAE-augmented signals to compute the funnel-transitions model raises the success rate of the kitting experiments significantly, from 70% to 90%. To the best of our knowledge, our scheme is the first attempt to improve robot manipulation from only a few demonstrations in a way that not only handles normal manipulation but also adapts well to unexpected anomalies outside the demonstrations. Our method can be extended to environments where sufficient transitions are difficult to collect online and anomalies are unpredictable, for example, learning long-horizon household skills.
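As a rough illustration of the augmentation step, the sketch below trains a small VAE on fixed-length multimodal signal vectors and then samples perturbed reconstructions from the learned latent space. This is a minimal sketch under stated assumptions: the vector encoding of the signals, all dimensions and names (SignalVAE, augment), and the PyTorch implementation are illustrative, not details taken from the paper.

import torch
import torch.nn as nn

class SignalVAE(nn.Module):
    """Small VAE over fixed-length multimodal signal vectors (hypothetical sizes)."""
    def __init__(self, signal_dim=32, latent_dim=8):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(signal_dim, 64), nn.ReLU())
        self.to_mu = nn.Linear(64, latent_dim)
        self.to_logvar = nn.Linear(64, latent_dim)
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 64), nn.ReLU(), nn.Linear(64, signal_dim))

    def forward(self, x):
        h = self.encoder(x)
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization trick
        return self.decoder(z), mu, logvar

def vae_loss(recon, x, mu, logvar):
    # Reconstruction error plus KL divergence to the unit-Gaussian prior
    # (the negative ELBO for a Gaussian decoder).
    recon_err = ((recon - x) ** 2).sum(dim=1)
    kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(dim=1)
    return (recon_err + kl).mean()

@torch.no_grad()
def augment(vae, demos, n_samples=100):
    # Encode the few real demonstrations, then resample around their latent
    # codes to synthesize additional plausible observations.
    h = vae.encoder(demos)
    mu, logvar = vae.to_mu(h), vae.to_logvar(h)
    idx = torch.randint(len(demos), (n_samples,))
    z = mu[idx] + torch.randn(n_samples, mu.size(1)) * torch.exp(0.5 * logvar[idx])
    return vae.decoder(z)

In this reading of the pipeline, the synthetic signals returned by augment would be pooled with the real demonstrations before clustering funnel representations and estimating the funnel-transitions model used for policy learning.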

This work is partially supported by the Key R&D Programmes of Guangdong Province (Grant No. 2019B090915001), the Frontier and Key Technology Innovation Special Funds of Guangdong Province (Grant No. 2017B050506008), and the National Natural Science Foundation of China (Grant No. 51975126, 51905105).


Notes

  1. An experiment is considered successful if the agent can complete the kitting task from skill 3 to skill 9.


Author information


Corresponding author

Correspondence to Yisheng Guan.



Copyright information

© 2020 Springer Nature Switzerland AG

About this paper


Cite this paper

Huang, J., Lin, Y., Wu, H., Guan, Y. (2020). Variational Augmented the Heuristic Funnel-Transitions Model for Dexterous Robot Manipulation. In: Chan, C.S., et al. (eds.) Intelligent Robotics and Applications. ICIRA 2020. Lecture Notes in Computer Science, vol. 12595. Springer, Cham. https://doi.org/10.1007/978-3-030-66645-3_13


  • DOI: https://doi.org/10.1007/978-3-030-66645-3_13


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-66644-6

  • Online ISBN: 978-3-030-66645-3

  • eBook Packages: Computer Science, Computer Science (R0)
