3D hand reconstruction from a single image based on biomechanical constraints

  • Original article
  • Published in The Visual Computer

Abstract

This paper investigates the estimation of motion parameters from 3D hand joint positions. We formulate the task as an inverse kinematics problem with biomechanical constraints and propose a fast and robust iterative approach to solve the constrained optimization. We design a coordinate descent algorithm that decomposes the problem into a sequence of decisions on the transformation around each kinematic node (i.e., joint), where the decision for each node is equivalent to a point matching problem. Solving the whole optimization then amounts to visiting all nodes of the kinematic tree one by one, from the root to the leaves. This not only accelerates the process but also improves the accuracy of the inverse kinematics solution. Experiments show that our approach yields results comparable to, and in some cases better than, those of state-of-the-art methods.
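
The per-node decomposition described in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that the abstract does not confirm: each node's point matching step is solved here with the standard Kabsch (SVD-based) rotation fit, only a joint's direct children are used as matching points, a single root-to-leaf pass is performed, and the biomechanical constraints are omitted entirely. The function names (`kabsch_rotation`, `forward_kinematics`, `solve_root_to_leaves`) and the skeleton layout (`parents`, `offsets`) are hypothetical and chosen only for illustration.

```python
import numpy as np

def kabsch_rotation(source, target):
    """Rotation R minimizing sum_i ||R @ source[i] - target[i]||^2 (no translation)."""
    H = source.T @ target                      # 3x3 cross-covariance matrix
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so the result is a proper rotation (det = +1).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    return Vt.T @ np.diag([1.0, 1.0, d]) @ U.T

def forward_kinematics(parents, offsets, rotations, root_pos):
    """Joint positions from local rotations; parents[i] < i, the root has parent -1."""
    n = len(parents)
    glob = [None] * n                          # accumulated (global) rotations
    pos = np.zeros((n, 3))
    for j in range(n):
        p = parents[j]
        if p < 0:
            pos[j] = root_pos
            glob[j] = rotations[j]
        else:
            pos[j] = pos[p] + glob[p] @ offsets[j]
            glob[j] = glob[p] @ rotations[j]
    return pos

def solve_root_to_leaves(parents, offsets, targets):
    """One root-to-leaf pass: fit each joint's local rotation to its children's targets.

    Assumption of this sketch: no joint limits are enforced.
    """
    n = len(parents)
    children = [[c for c in range(n) if parents[c] == j] for j in range(n)]
    rotations = [np.eye(3) for _ in range(n)]
    glob = [np.eye(3) for _ in range(n)]
    for j in range(n):                         # parents are visited before children
        parent_glob = glob[parents[j]] if parents[j] >= 0 else np.eye(3)
        if children[j]:
            # Rest-pose bone vectors of the children of joint j.
            src = np.array([offsets[c] for c in children[j]])
            # Target bone vectors, expressed in the already fixed parent frame.
            dst = np.array([parent_glob.T @ (targets[c] - targets[j])
                            for c in children[j]])
            rotations[j] = kabsch_rotation(src, dst)
        glob[j] = parent_glob @ rotations[j]
    return rotations
```

Given target joint positions that are reachable by the skeleton, calling `solve_root_to_leaves(parents, offsets, targets)` and feeding the result back into `forward_kinematics` should reproduce the targets. A joint with a single child leaves the twist about its bone undetermined, which is one place where additional constraints would plausibly enter.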

Funding

This work was partially supported by NSFC (61972160, 62072191), Guangdong Basic and Applied Basic Research Foundation (2021A1515012301, 2019A1515010833, 2019A1515010860), and the Fundamental Research Funds for the Central Universities (2020ZYGXZR089, D2190670).

Author information

Corresponding author

Correspondence to Aihua Mao.

Ethics declarations

Conflicts of interest

The authors declare that they have no conflicts of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary material 1 (mp4 82012 KB)

About this article

Cite this article

Li, G., Wu, Z., Liu, Y. et al. 3D hand reconstruction from a single image based on biomechanical constraints. Vis Comput 37, 2699–2711 (2021). https://doi.org/10.1007/s00371-021-02250-y

  • DOI: https://doi.org/10.1007/s00371-021-02250-y
