Abstract
Traditionally, gesture-based interaction in virtual environments is composed of either static, posture-based gesture primitives or temporally analyzed dynamic primitives. However, it would be ideal to incorporate both static and dynamic gestures to fully utilize the potential of gesture-based interaction. To that end, we propose a probabilistic framework that incorporates both static and dynamic gesture primitives. We call these primitives Gesture Words (GWords). Using a probabilistic graphical model (PGM), we integrate these heterogeneous GWords and a high-level language model in a coherent fashion. Composite gestures are represented as stochastic paths through the PGM. A gesture is analyzed by finding the path that maximizes the likelihood on the PGM with respect to the video sequence. To facilitate online computation, we propose a greedy algorithm for performing inference on the PGM. The parameters of the PGM can be learned via three different methods: supervised, unsupervised, and hybrid. We have implemented the framework for a gesture set of ten GWords and six composite gestures. The experimental results show that the PGM can accurately recognize composite gestures.
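The core idea of the abstract — composite gestures as stochastic paths through a graph of GWords, decoded greedily rather than by full likelihood maximization — can be illustrated with a minimal sketch. The gesture vocabulary, transition probabilities, and observation scores below are hypothetical stand-ins, not the model from the paper; the real framework scores observations per video segment with learned GWord detectors.

```python
import math

# Hypothetical GWord transition model (a toy high-level "language model").
# States and probabilities are illustrative only.
TRANSITIONS = {
    "start":   {"point": 0.6, "grab": 0.4},
    "point":   {"drag": 0.7, "release": 0.3},
    "grab":    {"drag": 0.5, "release": 0.5},
    "drag":    {"release": 1.0},
    "release": {},  # terminal GWord
}

def greedy_gesture_path(obs_loglik, start="start"):
    """Greedily extend the most likely path through the GWord graph.

    obs_loglik maps each GWord to an observation log-likelihood
    (a stand-in for per-segment detector scores). At each step we pick
    the successor maximizing transition probability times observation
    likelihood, instead of searching all paths -- this is the online,
    greedy alternative to exact inference.
    """
    path, total = [start], 0.0
    state = start
    while TRANSITIONS[state]:
        best, best_score = None, -math.inf
        for nxt, p in TRANSITIONS[state].items():
            score = math.log(p) + obs_loglik.get(nxt, -math.inf)
            if score > best_score:
                best, best_score = nxt, score
        path.append(best)
        total += best_score
        state = best
    return path, total

# Example: observations favor "point" then "drag" then "release".
path, loglik = greedy_gesture_path(
    {"point": -0.2, "grab": -1.5, "drag": -0.3, "release": -0.1}
)
# path -> ["start", "point", "drag", "release"]
```

The greedy choice commits to one GWord per step, which keeps the computation constant-time per video segment but can miss the globally optimal path that exact (Viterbi-style) decoding would find.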
Acknowledgements
We thank Darius Burschka for his help with the Visual Interaction Cues project. This work was funded in part by a Link Foundation Fellowship in Simulation and Training and by the National Science Foundation under Grant No. 0112882.
Additional information
An erratum to this article is available at http://dx.doi.org/10.1007/s10055-005-0007-1.
Corso, J.J., Ye, G. & Hager, G.D. Analysis of composite gestures with a coherent probabilistic graphical model. Virtual Reality 8, 242–252 (2005). https://doi.org/10.1007/s10055-005-0157-1