Abstract
In American Sign Language (ASL), the structure of a signed sentence is conveyed by grammatical markers, which are expressed through facial feature movements and head motions. Without recovering these markers, a sign language recognition system cannot fully reconstruct a signed sentence, yet the problem has been largely neglected in the literature. In this paper, we propose a 2-layer Conditional Random Field model for recognizing continuously signed grammatical markers in ASL. Recognition requires identifying both facial feature movements and head motions while dealing with the uncertainty introduced by movement epenthesis and other effects. Our data consist of videos of signers’ faces, recorded while they signed simple sentences containing multiple grammatical markers. In our experiments, the proposed classifier achieved a precision of 93.76% and a recall of 85.54%.
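For readers unfamiliar with the two reported measures, the following is a minimal sketch of how precision and recall are conventionally computed from detection counts. The abstract does not say how detected markers were matched against the ground truth, so the function name `precision_recall` and the counts below are hypothetical and serve only as an illustration.

```python
def precision_recall(true_pos, false_pos, false_neg):
    """Return (precision, recall) computed from detection counts.

    precision = correct detections / all detections
    recall    = correct detections / all markers actually present
    """
    detected = true_pos + false_pos
    actual = true_pos + false_neg
    precision = true_pos / detected if detected else 0.0
    recall = true_pos / actual if actual else 0.0
    return precision, recall


# Hypothetical counts chosen for illustration only; they are not from the paper.
p, r = precision_recall(true_pos=8, false_pos=2, false_neg=4)
print(f"precision={p:.2%} recall={r:.2%}")
```

A precision higher than recall, as reported in the abstract, would mean that the markers the classifier outputs are usually correct, while some markers present in the signing go undetected.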
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Nguyen, T.D., Ranganath, S. (2011). Recognizing Continuous Grammatical Marker Facial Gestures in Sign Language Video. In: Kimmel, R., Klette, R., Sugimoto, A. (eds) Computer Vision – ACCV 2010. ACCV 2010. Lecture Notes in Computer Science, vol 6495. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-19282-1_53
DOI: https://doi.org/10.1007/978-3-642-19282-1_53
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-19281-4
Online ISBN: 978-3-642-19282-1
eBook Packages: Computer Science (R0)