Monocular 3D Hand Pose Estimation for Teleoperating Low-Cost Actuators

  • Conference paper
Advances in Physical Agents II (WAF 2020)

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1285)

Abstract

So far, interacting with a virtual agent that simulates hands, or teleoperating a robotic hand, has required a device such as a joystick, gamepad, or VR controller. These devices map a button press to a predefined action, usually without fine control over the entity being teleoperated, which makes them unsuitable and unpleasant to use. In this work, we study different setups for 3D hand pose estimation from color images of hands, with the aim of controlling a low-cost robotic hand. Our experiments show that this is possible with a deviation of less than 5 mm per joint.
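The abstract does not say how the per-joint deviation is computed. As a minimal sketch, assuming it refers to the Euclidean distance between each predicted 3D joint position and its ground-truth position (both in millimetres), the metric could be evaluated as follows. The function names and the three-joint sample are hypothetical and are not taken from the paper.

```python
import math


def per_joint_error_mm(predicted, ground_truth):
    """Return the Euclidean distance (mm) between each predicted joint and its reference."""
    if len(predicted) != len(ground_truth):
        raise ValueError("joint lists must have the same length")
    return [math.dist(p, g) for p, g in zip(predicted, ground_truth)]


def mean_per_joint_error_mm(predicted, ground_truth):
    """Average the per-joint distances over all joints of one hand."""
    errors = per_joint_error_mm(predicted, ground_truth)
    return sum(errors) / len(errors)


# Hypothetical three-joint sample, coordinates in mm (illustrative values only).
ground_truth = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (20.0, 0.0, 0.0)]
predicted = [(3.0, 4.0, 0.0), (10.0, 0.0, 0.0), (20.0, 0.0, 2.0)]

errors = per_joint_error_mm(predicted, ground_truth)
mean_error = mean_per_joint_error_mm(predicted, ground_truth)
print(errors, mean_error)
```

Under this assumption, a result of "less than 5 mm per joint" would mean that the averaged distance stays below 5 for the evaluated hands.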

Notes

  1. http://inmoov.fr/hand-and-forarm/

  2. https://www.shadowrobot.com/products/dexterous-hand/

Acknowledgments

This work has been funded by the Spanish Government PID2019-104818RB-I00 grant, supported with FEDER funds. It has also been supported by Spanish grants for PhD studies ACIF/2017/243 and FPU16/00887. Experiments were made possible by a generous hardware donation from NVIDIA.

Author information

Corresponding author

Correspondence to Francisco Gomez-Donoso.

Copyright information

© 2021 The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Gomez-Donoso, F., Escalona, F., Bañuls, A., Abellan, D., Cazorla, M. (2021). Monocular 3D Hand Pose Estimation for Teleoperating Low-Cost Actuators. In: Bergasa, L.M., Ocaña, M., Barea, R., López-Guillén, E., Revenga, P. (eds) Advances in Physical Agents II. WAF 2020. Advances in Intelligent Systems and Computing, vol 1285. Springer, Cham. https://doi.org/10.1007/978-3-030-62579-5_24
