
Detail-preserved real-time hand motion regression from depth

  • Original Article
  • Published in The Visual Computer (2018)


Abstract

This paper addresses the challenging problem of efficient and accurate hand tracking from depth sequences while simultaneously deforming a high-resolution 3D hand model with geometric details. We propose an integrated regression framework that infers the articulated hand pose and regresses high-frequency details from sparse high-resolution 3D hand model examples. Our method consists of four components: skeleton embedding, hand joint regression, skeleton alignment, and high-resolution detail integration. Skeleton embedding is optimized via a wrinkle-based skeleton refinement method to handle faithful hand models with fine geometric details. Hand joint regression uses a deep convolutional network that predicts 3D hand joint locations from a single depth map; a skeleton alignment stage then recovers the fully articulated hand pose. Deformable fine-scale details are estimated from a nonlinear mapping between the hand joints and per-vertex displacements. Experiments on two challenging datasets show that our approach achieves accurate, robust, and real-time hand tracking while preserving most high-frequency details when deforming a virtual hand.
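
As a reading aid, below is a minimal Python sketch of one tracking frame following the four-stage pipeline described above. The component names (joint_cnn, hand_model, DetailMapper) and the two-layer mapping are illustrative assumptions for exposition, not the authors' implementation.

    # Hypothetical sketch of the per-frame pipeline described in the abstract.
    # All component names are illustrative assumptions.
    import numpy as np

    class DetailMapper:
        """Nonlinear mapping from joint configuration to per-vertex
        displacements, sketched as a small two-layer MLP assumed to be
        learned from the sparse high-resolution example poses."""

        def __init__(self, w1, b1, w2, b2, num_vertices):
            self.w1, self.b1, self.w2, self.b2 = w1, b1, w2, b2
            self.num_vertices = num_vertices

        def predict(self, joints_3d):
            x = joints_3d.reshape(-1)           # flatten (J, 3) -> (3J,)
            h = np.tanh(x @ self.w1 + self.b1)  # nonlinear hidden layer
            d = h @ self.w2 + self.b2           # (3V,) displacement vector
            return d.reshape(self.num_vertices, 3)

    def track_frame(depth_map, joint_cnn, hand_model, detail_mapper):
        # 1) Hand joint regression: a deep CNN predicts 3D joint
        #    locations from a single depth map.
        joints_3d = joint_cnn.predict(depth_map)          # (J, 3)

        # 2) Skeleton alignment: fit the embedded skeleton to the
        #    predicted joints to recover a fully articulated pose.
        pose = hand_model.align_skeleton(joints_3d)

        # 3) High-resolution detail integration: regress per-vertex
        #    displacements from the joint configuration.
        displacements = detail_mapper.predict(joints_3d)  # (V, 3)

        # 4) Deform the high-resolution model: skin the base mesh with
        #    the aligned pose, then add the regressed fine-scale details
        #    (one plausible composition; the paper's exact formulation
        #    may differ).
        return hand_model.skin(pose) + displacements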




Acknowledgements

We would like to thank the anonymous reviewers for their valuable suggestions. This work was supported by the National Key Technologies R&D Program of China (No. 2017YFB1002702).

Author information

Corresponding author

Correspondence to Yong Hu.

Ethics declarations

Conflict of interest

All authors declare that they have no conflict of interest.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (MP4 8485 KB)


About this article


Cite this article

Fan, Q., Shen, X. & Hu, Y. Detail-preserved real-time hand motion regression from depth. Vis Comput 34, 1145–1154 (2018). https://doi.org/10.1007/s00371-018-1546-2

