Abstract
Real-time hand posture capture has long been a difficult goal in computer vision. Extracting hand skeleton parameters would be an important milestone for sign language recognition, since it would make classification of hand shapes and gestures possible. The recent introduction of the Kinect depth sensor has accelerated research in human body pose capture. This chapter describes a real-time hand pose estimation method employing an object recognition by parts approach, and the use of this method for hand shape classification. First, a realistic 3D hand model is used to represent the hand with 21 different parts. Then, a random decision forest (RDF) is trained on synthetic depth images generated by animating the hand model; the trained forest performs per-pixel classification, assigning each pixel to a hand part. The classification results are fed into a local mode-finding algorithm to estimate the joint locations of the hand skeleton. The system processes depth images retrieved from Kinect in real time and does not rely on temporal information. As a simple application of the system, we also describe a support vector machine (SVM)-based recognition module for the ten digits of American Sign Language (ASL) built on the estimated skeletons, which attains a recognition rate of 99.9 % on live depth images in real time.
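To make the pipeline outlined above concrete, the following is a minimal Python sketch using scikit-learn components as stand-ins for the chapter's building blocks: RandomForestClassifier for the RDF, MeanShift for the local mode finding, and SVC for the SVM. The random arrays, feature dimensions, and variable names below are placeholders for illustration only; the chapter's actual depth-difference features, synthetic training images, and joint-estimation details are not reproduced here.

# Sketch of the abstract's three stages with scikit-learn stand-ins.
# All data below is random placeholder data, not real depth imagery.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.cluster import MeanShift
from sklearn.svm import SVC

rng = np.random.default_rng(0)

N_PARTS = 21        # hand parts, as in the chapter
N_PIXELS = 5000     # placeholder number of training pixels
N_FEATURES = 32     # placeholder per-pixel feature dimension

# 1) Train a forest to label each depth pixel with a hand part.
pixel_features = rng.normal(size=(N_PIXELS, N_FEATURES))
pixel_labels = rng.integers(0, N_PARTS, size=N_PIXELS)
part_classifier = RandomForestClassifier(n_estimators=3, max_depth=20)
part_classifier.fit(pixel_features, pixel_labels)

# 2) For one test frame: classify pixels, then run mode finding on the
#    3D coordinates of each part's pixels to propose a joint position.
test_features = rng.normal(size=(1000, N_FEATURES))
test_xyz = rng.normal(size=(1000, 3))   # placeholder 3D pixel coordinates
predicted_parts = part_classifier.predict(test_features)

joint_positions = {}
for part in range(N_PARTS):
    pts = test_xyz[predicted_parts == part]
    if len(pts) == 0:
        continue
    modes = MeanShift(bandwidth=1.0).fit(pts).cluster_centers_
    # Use the first returned mode as a joint proposal (a stand-in for a
    # weighted mode-selection step).
    joint_positions[part] = modes[0]

# 3) Classify the hand shape (e.g., an ASL digit) from the skeleton with an SVM.
skeleton = np.concatenate(
    [joint_positions.get(p, np.zeros(3)) for p in range(N_PARTS)])
digit_features = rng.normal(size=(100, skeleton.size))  # placeholder training set
digit_labels = rng.integers(0, 10, size=100)
digit_classifier = SVC(kernel="rbf").fit(digit_features, digit_labels)
print(digit_classifier.predict(skeleton.reshape(1, -1)))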
Copyright information
© 2013 Springer-Verlag London
Cite this chapter
Keskin, C., Kıraç, F., Kara, Y.E., Akarun, L. (2013). Real Time Hand Pose Estimation Using Depth Sensors. In: Fossati, A., Gall, J., Grabner, H., Ren, X., Konolige, K. (eds) Consumer Depth Cameras for Computer Vision. Advances in Computer Vision and Pattern Recognition. Springer, London. https://doi.org/10.1007/978-1-4471-4640-7_7
DOI: https://doi.org/10.1007/978-1-4471-4640-7_7
Publisher Name: Springer, London
Print ISBN: 978-1-4471-4639-1
Online ISBN: 978-1-4471-4640-7
eBook Packages: Computer Science (R0)