Abstract
We present a glove-based hand gesture recognition system that uses hidden Markov models (HMMs) to recognize the unconstrained 3D trajectory gestures of operators in a remote work environment. A Polhemus sensor attached to a PinchGlove provides a sequence of 3D positions along the hand trajectory. Using the 3D data directly lets operators gesture more naturally, since it avoids some of the constraints usually imposed to prevent performance degradation when trajectory data are projected onto a specific 2D plane. We use two kinds of HMMs, distinguished by the basic unit they model: gesture-based HMMs and stroke-based HMMs. Decomposing gestures into more primitive strokes is attractive because the stroke-based HMMs can in turn be concatenated to construct a new set of gesture-based HMMs. Any deterioration in performance and reliability arising from the decomposition can be remedied by a fine-tuned relearning process for such composite HMMs. We also propose an efficient method of estimating a variable reliability threshold for an HMM, which is found to be useful in rejecting unreliable patterns. In recognition experiments on 16 types of gestures defined for remote work, the fine-tuned composite HMM achieves the best recognition rate, 96.88%, and also the highest reliability.
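As a rough illustration of the composite idea, the sketch below chains two left-to-right discrete stroke HMMs into one gesture HMM and scores a symbol sequence with the forward algorithm. It is not the authors' implementation: the dictionary layout, the extra "exit" column in each transition row, the function names `concatenate` and `likelihood`, and all probabilities are assumptions made for this example. The fine-tuned relearning step and the variable rejection threshold are not shown.

```python
def concatenate(strokes):
    """Chain left-to-right stroke HMMs into one composite HMM.

    Each stroke is a dict (hypothetical layout) with
      "trans": n rows of n + 1 probabilities; the last column is the exit mass,
      "emit":  n rows of per-symbol emission probabilities.
    """
    total = sum(len(s["trans"]) for s in strokes)
    trans = [[0.0] * (total + 1) for _ in range(total)]
    emit = []
    offset = 0
    for s in strokes:
        n = len(s["trans"])
        for i, row in enumerate(s["trans"]):
            for j in range(n):
                trans[offset + i][offset + j] = row[j]
            # Exit mass flows into the first state of the next stroke,
            # or into the composite exit column for the final stroke.
            trans[offset + i][offset + n] = row[n]
        emit.extend(list(r) for r in s["emit"])
        offset += n
    return {"trans": trans, "emit": emit}


def likelihood(hmm, obs):
    """Forward algorithm: P(obs | hmm), starting in state 0, ending via exit."""
    trans, emit = hmm["trans"], hmm["emit"]
    n = len(emit)
    alpha = [0.0] * n
    alpha[0] = emit[0][obs[0]]
    for o in obs[1:]:
        alpha = [
            sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][o]
            for j in range(n)
        ]
    return sum(alpha[i] * trans[i][n] for i in range(n))


# Two toy strokes over a 2-symbol codebook (values are made up).
stroke_trans = [[0.6, 0.4, 0.0], [0.0, 0.6, 0.4]]
right = {"trans": stroke_trans, "emit": [[0.9, 0.1], [0.9, 0.1]]}
up = {"trans": stroke_trans, "emit": [[0.1, 0.9], [0.1, 0.9]]}

gesture = concatenate([right, up])      # "right then up" composite model
match = likelihood(gesture, [0, 0, 1, 1])
mismatch = likelihood(gesture, [1, 1, 0, 0])
print(match > mismatch)
```

In this toy setup the sequence that follows the stroke order scores higher than the reversed one. A rejection rule in the spirit of the abstract would compare such a score against a per-model threshold, but how that threshold is estimated is not specified here.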
Cite this article
Kim, IC., Chien, SI. Analysis of 3D Hand Trajectory Gestures Using Stroke-Based Composite Hidden Markov Models. Applied Intelligence 15, 131–143 (2001). https://doi.org/10.1023/A:1011231305559
DOI: https://doi.org/10.1023/A:1011231305559