Abstract
An approach to recognizing hand gestures from a monocular temporal sequence of images is presented. Of particular concern is the representation and recognition of hand movements that are used in single-handed American Sign Language (ASL). The approach exploits previous linguistic analysis of manual languages that decomposes dynamic gestures into their static and dynamic components. The first level of decomposition is in terms of three sets of primitives: hand shape, location, and movement. Further levels of decomposition involve the lexical and sentence levels and are part of our plan for future work. We propose and demonstrate that, given a monocular gesture sequence, kinematic features that provide distinctive signatures for 14 primitive movements of ASL can be recovered from the apparent motion. The approach has been implemented in software and evaluated on a database of 592 gesture sequences, with an overall recognition rate of 86.00% for fully automated processing and 97.13% for manually initialized processing.
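The abstract does not say which kinematic features are recovered or how. As a purely illustrative sketch, and not the authors' implementation, one common way to summarize apparent motion is to fit a first-order (affine) model to a flow field and read off its translation, divergence, curl, and deformation. The function names `solve3`, `fit_affine`, and `kinematic_features`, the affine model itself, and the synthetic flow fields below are all assumptions made for this example.

```python
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float]


def solve3(A: List[List[float]], b: List[float]) -> List[float]:
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    n = 3
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        rest = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - rest) / M[i][i]
    return x


def fit_affine(points: Sequence[Point], values: Sequence[float]) -> List[float]:
    """Least-squares fit of values ~ c0 + c1*x + c2*y via the normal equations."""
    A = [[0.0] * 3 for _ in range(3)]
    b = [0.0] * 3
    for (x, y), f in zip(points, values):
        basis = (1.0, x, y)
        for i in range(3):
            b[i] += basis[i] * f
            for j in range(3):
                A[i][j] += basis[i] * basis[j]
    return solve3(A, b)


def kinematic_features(points: Sequence[Point],
                       flow: Sequence[Point]) -> Dict[str, object]:
    """Summarize a flow field (u, v) sampled at `points` by first-order quantities.

    Assumed model (hypothetical, for illustration only):
        u = u0 + ux*x + uy*y,   v = v0 + vx*x + vy*y
    """
    u0, ux, uy = fit_affine(points, [f[0] for f in flow])
    v0, vx, vy = fit_affine(points, [f[1] for f in flow])
    return {
        "translation": (u0, v0),
        "divergence": ux + vy,             # isotropic expansion or contraction
        "curl": vx - uy,                   # rotation in the image plane
        "deformation": (ux - vy, uy + vx)  # shear components
    }


if __name__ == "__main__":
    # Synthetic 5x5 grid of sample points centred on the origin.
    pts = [(float(x), float(y)) for x in range(-2, 3) for y in range(-2, 3)]
    expanding = [(0.1 * x, 0.1 * y) for x, y in pts]   # flow pointing outward
    rotating = [(-0.1 * y, 0.1 * x) for x, y in pts]   # counter-clockwise flow
    print(kinematic_features(pts, expanding))
    print(kinematic_features(pts, rotating))
```

In this sketch an outward-pointing flow yields a positive divergence and near-zero curl, while a rotating flow yields the opposite. This illustrates how a handful of such numbers can separate qualitatively different movements.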
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Derpanis, K.G., Wildes, R.P., Tsotsos, J.K. (2004). Hand Gesture Recognition within a Linguistics-Based Framework. In: Pajdla, T., Matas, J. (eds) Computer Vision - ECCV 2004. ECCV 2004. Lecture Notes in Computer Science, vol 3021. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24670-1_22
DOI: https://doi.org/10.1007/978-3-540-24670-1_22
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-21984-2
Online ISBN: 978-3-540-24670-1
eBook Packages: Springer Book Archive