Abstract
Real-time recognition of visemes, the visual face appearances that correspond to phonemes and their speech contexts, is presented. We distinguish six major classes of visemes. Features are extracted in the form of normalized image texture. The normalization procedure uses barycentric coordinates in a mesh of triangles superimposed onto a reference facial image. The mesh itself is defined using a subset of FAP points conforming to the MPEG-4 standard. The classifiers were designed using PCA subspace and LDA methods. The LDA classifier outperforms the subspace technique: its recognition rate exceeds that of the best PCA subspace classifier by more than 13 percentage points (97% versus 84%), it is more than 10 times faster (0.5 ms versus 7 ms), and its running time is negligible compared with the mouth image normalization time (0.5 ms versus 5 ms).
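The barycentric normalization step can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `barycentric` and `map_point` and the triangle coordinates are hypothetical, and the abstract does not specify the mesh layout or the texture sampling. The sketch only shows the general idea that a pixel in a reference triangle is expressed in barycentric coordinates and then located in the corresponding triangle of the input face image.

```python
def barycentric(p, a, b, c):
    """Return barycentric coordinates (u, v, w) of point p in triangle (a, b, c)."""
    (px, py), (ax, ay), (bx, by), (cx, cy) = p, a, b, c
    det = (by - cy) * (ax - cx) + (cx - bx) * (ay - cy)
    if det == 0:
        raise ValueError("degenerate triangle")
    u = ((by - cy) * (px - cx) + (cx - bx) * (py - cy)) / det
    v = ((cy - ay) * (px - cx) + (ax - cx) * (py - cy)) / det
    return u, v, 1.0 - u - v


def map_point(p, ref_tri, src_tri):
    """Map p from the reference triangle to the matching point in the source triangle."""
    u, v, w = barycentric(p, *ref_tri)
    (ax, ay), (bx, by), (cx, cy) = src_tri
    return (u * ax + v * bx + w * cx, u * ay + v * by + w * cy)


# Hypothetical triangles: one in the reference image, one in the input image.
ref_tri = ((0.0, 0.0), (1.0, 0.0), (0.0, 1.0))
src_tri = ((10.0, 10.0), (20.0, 10.0), (10.0, 30.0))
mapped = map_point((0.25, 0.25), ref_tri, src_tri)
```

In a full normalization pass, each pixel of the reference mouth region would be mapped this way and the input image sampled at the resulting location; how the sampling is done is an assumption left open here.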
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Leszczynski, M., Skarbek, W., Badura, S. (2005). Fast Viseme Recognition for Talking Head Application. In: Kamel, M., Campilho, A. (eds) Image Analysis and Recognition. ICIAR 2005. Lecture Notes in Computer Science, vol 3656. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11559573_64
DOI: https://doi.org/10.1007/11559573_64
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-29069-8
Online ISBN: 978-3-540-31938-2
eBook Packages: Computer Science (R0)