Abstract
This paper describes a method that, given an input image of a person signing a gesture in a cluttered scene, locates the gesturing arm, automatically detects and segments the hand, and finally produces a ranked list of candidate shape classes, 3D pose orientations, and full hand configuration parameters. The clutter-tolerant hand segmentation algorithm relies on depth data from a single image captured with a commercially available depth sensor, the Kinect™. Shape and 3D pose estimation is formulated as an image database retrieval problem: given a segmented hand, the best matches are retrieved from a large database of synthetically generated hand images. In contrast to previous approaches, this clutter-tolerant method combines all of the following properties: it is user-independent, it automatically detects and segments the hand from a single image (no multi-view or motion cues are employed), and it estimates not only the 3D pose orientation but also the full hand articulation parameters. The performance of this approach is evaluated quantitatively and qualitatively on a dataset of real images of American Sign Language (ASL) handshapes.
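The retrieval formulation can be illustrated with a minimal sketch. The abstract does not name the similarity measure or the database layout, so everything below is an assumption made for illustration: a symmetric chamfer distance between 2D edge point sets, a toy three-entry database, and the hypothetical names `chamfer`, `rank_matches`, `label`, `pose`, and `edges`. It shows only the idea of scoring a segmented hand against synthetic entries and returning a ranked list.

```python
from math import hypot


def directed_chamfer(a, b):
    """Mean distance from each point in `a` to its nearest point in `b`."""
    return sum(min(hypot(ax - bx, ay - by) for bx, by in b) for ax, ay in a) / len(a)


def chamfer(a, b):
    """Symmetric chamfer distance (assumed measure, not confirmed by the abstract)."""
    return directed_chamfer(a, b) + directed_chamfer(b, a)


def rank_matches(query_edges, database, top_k=3):
    """Return the top_k database entries as (score, label, pose), best first."""
    scored = [
        (chamfer(query_edges, entry["edges"]), entry["label"], entry["pose"])
        for entry in database
    ]
    scored.sort(key=lambda item: item[0])  # smaller distance = better match
    return scored[:top_k]


# Toy stand-in for a database of synthetic hand images (hypothetical values):
# each entry carries a shape class label, a 3D orientation, and edge points.
database = [
    {"label": "A", "pose": (0, 0, 0), "edges": [(0, 0), (1, 0), (2, 0)]},
    {"label": "B", "pose": (0, 90, 0), "edges": [(0, 0), (0, 1), (0, 2)]},
    {"label": "A", "pose": (0, 0, 30), "edges": [(0, 0), (1, 1), (2, 2)]},
]

# Edge points of a (hypothetical) segmented hand.
query = [(0, 0), (1, 0), (2, 0.2)]
ranked = rank_matches(query, database)
for score, label, pose in ranked:
    print(f"{label} {pose} {score:.3f}")
```

Because each database entry is synthetic, its shape class and pose parameters are known in advance, so a ranked list of matches directly yields a ranked list of shape and pose hypotheses. The brute-force nearest-point search here is quadratic in the number of points; a practical system would likely precompute distance transforms instead.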
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Doliotis, P., Athitsos, V., Kosmopoulos, D., Perantonis, S. (2012). Hand Shape and 3D Pose Estimation Using Depth Data from a Single Cluttered Frame. In: Bebis, G., et al. Advances in Visual Computing. ISVC 2012. Lecture Notes in Computer Science, vol 7431. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33179-4_15
DOI: https://doi.org/10.1007/978-3-642-33179-4_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-33178-7
Online ISBN: 978-3-642-33179-4
eBook Packages: Computer Science, Computer Science (R0)