Research article · DOI: 10.1145/2702123.2702179

Accurate, Robust, and Flexible Real-time Hand Tracking

Published: 18 April 2015

Abstract

We present a new real-time hand tracking system based on a single depth camera. The system can accurately reconstruct complex hand poses across a variety of subjects. It also allows for robust tracking, rapidly recovering from any temporary failures. Most uniquely, our tracker is highly flexible, dramatically improving upon previous approaches which have focused on front-facing close-range scenarios. This flexibility opens up new possibilities for human-computer interaction with examples including tracking at distances from tens of centimeters through to several meters (for controlling the TV at a distance), supporting tracking using a moving depth camera (for mobile scenarios), and arbitrary camera placements (for VR headsets). These features are achieved through a new pipeline that combines a multi-layered discriminative reinitialization strategy for per-frame pose estimation, followed by a generative model-fitting stage. We provide extensive technical details and a detailed qualitative and quantitative analysis.
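The per-frame pipeline the abstract describes (a discriminative reinitialization stage proposing candidate poses, followed by a generative model-fitting stage that refines them) can be sketched roughly as follows. This is a toy illustration, not the paper's implementation: the stub reinitializer, the synthetic energy, and the coordinate-descent refinement are all placeholder assumptions standing in for the paper's learned predictors and model-fitting optimizer.

```python
import random

def discriminative_reinitializer(depth_frame, n_candidates=5):
    # Stub: a real system would predict pose hypotheses from the depth
    # image with learned models. Here we just sample random pose vectors.
    random.seed(0)
    return [[random.uniform(-1, 1) for _ in range(3)] for _ in range(n_candidates)]

def model_fit_energy(pose, depth_frame):
    # Stub energy: squared distance to a fake "true" pose stored in the
    # frame. A real generative stage would render the hand model and
    # score it against the observed depth data.
    true_pose = depth_frame["true_pose"]
    return sum((p - t) ** 2 for p, t in zip(pose, true_pose))

def refine(pose, depth_frame, iters=50, step=0.1):
    # Simple coordinate descent standing in for the generative
    # model-fitting optimizer.
    best, best_e = list(pose), model_fit_energy(pose, depth_frame)
    for _ in range(iters):
        for i in range(len(best)):
            for delta in (-step, step):
                cand = list(best)
                cand[i] += delta
                e = model_fit_energy(cand, depth_frame)
                if e < best_e:
                    best, best_e = cand, e
    return best, best_e

def track_frame(depth_frame, prev_pose=None):
    # Per-frame pipeline: discriminative reinitialization proposes
    # candidates, the previous frame's pose (if any) is kept as an extra
    # candidate, and the model-fitting stage refines and selects the best.
    candidates = discriminative_reinitializer(depth_frame)
    if prev_pose is not None:
        candidates.append(prev_pose)
    refined = [refine(c, depth_frame) for c in candidates]
    return min(refined, key=lambda r: r[1])

frame = {"true_pose": [0.3, -0.2, 0.5]}
pose, energy = track_frame(frame)
```

Because fresh candidates are proposed every frame rather than relying solely on the previous pose, a failure in one frame does not propagate, which is the structural reason such a pipeline can recover rapidly from temporary tracking failures.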

Supplementary Material

  • ZIP file (pn0362-file4.zip): Accurate, Robust, and Flexible Real-time Hand Tracking
  • MP4 file (pn0362-file3.mp4): supplemental video (suppl.mov)
  • MP4 file (p3633-sharp.mp4)





    Published In

    CHI '15: Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems
    April 2015
    4290 pages
    ISBN:9781450331456
    DOI:10.1145/2702123


    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. computer vision
    2. depth camera
    3. hand tracking

    Qualifiers

    • Research-article

    Conference

    CHI '15: CHI Conference on Human Factors in Computing Systems
    April 18 - 23, 2015
    Seoul, Republic of Korea

    Acceptance Rates

    CHI '15 paper acceptance rate: 486 of 2,120 submissions (23%)
    Overall acceptance rate: 6,199 of 26,314 submissions (24%)



    Cited By

    • (2025) XR-based interactive visualization platform for real-time exploring dynamic earth science data. Environmental Modelling & Software 183:C. DOI: 10.1016/j.envsoft.2024.106193 (1 Jan 2025)
    • (2024) Enhancing User Experience in Virtual Museums: Impact of Finger Vibrotactile Feedback. Applied Sciences 14(15):6593. DOI: 10.3390/app14156593 (28 Jul 2024)
    • (2024) A Review on 3D Hand Pose and Shape Reconstruction from Color Images. In 2024 43rd Chinese Control Conference (CCC), 7414-7419. DOI: 10.23919/CCC63176.2024.10661647 (28 Jul 2024)
    • (2024) Exploiting Spatial-Temporal Context for Interacting Hand Reconstruction on Monocular RGB Video. ACM Transactions on Multimedia Computing, Communications, and Applications 20(6):1-18. DOI: 10.1145/3639707 (8 Mar 2024)
    • (2024) Hand Gesture Recognition for Blind Users by Tracking 3D Gesture Trajectory. In Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, 1-15. DOI: 10.1145/3613904.3642602 (11 May 2024)
    • (2024) Gesture Recognition Using Visible Light on Mobile Devices. IEEE/ACM Transactions on Networking 32(4):2920-2935. DOI: 10.1109/TNET.2024.3369996 (Aug 2024)
    • (2024) CST Framework: A Robust and Portable Finger Motion Tracking Framework. IEEE Transactions on Human-Machine Systems 54(3):282-291. DOI: 10.1109/THMS.2024.3385105 (Jun 2024)
    • (2024) Grasping Objects in Immersive Virtual Reality Environments: Challenges and Current Techniques. In 2024 10th International Conference on Virtual Reality (ICVR), 190-197. DOI: 10.1109/ICVR62393.2024.10867935 (24 Jul 2024)
    • (2024) 3D Hand Mesh Reconstruction Based on A Monocular RGB Camera. In 2024 5th International Conference on Computer Engineering and Intelligent Control (ICCEIC), 238-245. DOI: 10.1109/ICCEIC64099.2024.10775583 (11 Oct 2024)
    • (2024) HOIST-Former: Hand-Held Objects Identification, Segmentation, and Tracking in the Wild. In 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2351-2361. DOI: 10.1109/CVPR52733.2024.00228 (16 Jun 2024)
