Abstract
Until now, interacting with a virtual agent that simulates hands, or teleoperating a robotic hand, has required a device such as a joystick, gamepad or VR controller. These devices map a button press to a predefined action, usually without fine control over the teleoperated entity, which makes them unsuitable and unpleasant to use. In this work, we study different setups for 3D hand pose estimation from color images of hands, with the aim of controlling a low-cost robotic hand. Our experiments demonstrate that this is possible with a deviation of less than 5 mm per joint.
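To make the reported figure concrete, the sketch below shows one common way a per-joint deviation can be computed: the mean Euclidean distance between predicted and ground-truth 3D joint positions. The abstract does not say how the deviation was measured, so this metric, the function name `mean_per_joint_error_mm`, and the toy coordinates are all illustrative assumptions, not the authors' evaluation code.

```python
import math


def mean_per_joint_error_mm(predicted, ground_truth):
    """Mean Euclidean distance (mm) between matching 3D joints.

    Assumption: both poses are sequences of (x, y, z) tuples in
    millimetres, listed in the same joint order.
    """
    if len(predicted) != len(ground_truth) or not predicted:
        raise ValueError("poses must be non-empty and of equal length")
    distances = [math.dist(p, g) for p, g in zip(predicted, ground_truth)]
    return sum(distances) / len(distances)


# Toy data: three hypothetical joints, coordinates in millimetres.
ground_truth = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (20.0, 0.0, 0.0)]
predicted = [(3.0, 4.0, 0.0), (10.0, 0.0, 0.0), (20.0, 0.0, 1.0)]

# Per-joint distances are 5 mm, 0 mm and 1 mm, so the mean is 2 mm.
error = mean_per_joint_error_mm(predicted, ground_truth)
print(f"mean per-joint error: {error:.1f} mm")
```

Under this assumed metric, a pose estimator would meet the bound stated in the abstract if the value stays below 5 mm when averaged over the test images.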
Acknowledgments
This work has been funded by the Spanish Government grant PID2019-104818RB-I00, supported with FEDER funds. It has also been supported by Spanish grants for PhD studies ACIF/2017/243 and FPU16/00887. Experiments were made possible by a generous hardware donation from NVIDIA.
Copyright information
© 2021 The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Gomez-Donoso, F., Escalona, F., Bañuls, A., Abellan, D., Cazorla, M. (2021). Monocular 3D Hand Pose Estimation for Teleoperating Low-Cost Actuators. In: Bergasa, L.M., Ocaña, M., Barea, R., López-Guillén, E., Revenga, P. (eds) Advances in Physical Agents II. WAF 2020. Advances in Intelligent Systems and Computing, vol 1285. Springer, Cham. https://doi.org/10.1007/978-3-030-62579-5_24
DOI: https://doi.org/10.1007/978-3-030-62579-5_24
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-62578-8
Online ISBN: 978-3-030-62579-5
eBook Packages: Intelligent Technologies and Robotics (R0)