Lip Movement Recognition

Aleksic, Petar S.

doi:10.1007/978-0-387-73003-5_245


Synonyms

Audio–visual-dynamic speaker recognition; Visual-dynamic speaker recognition

Definition

Lip movement recognition is a speaker recognition technique in which the identity of a speaker is determined or verified by exploiting the information contained in the dynamics of visual features extracted from the mouth region. The visual features usually consist of appropriate representations of the mouth appearance and/or shape. This dynamic visual information can also be combined with acoustic information to improve the performance of audio-only speaker recognition systems and increase their resilience to spoofing, giving rise to audio–visual-dynamic speaker recognition systems.
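
To make the idea of feature dynamics concrete, here is a minimal sketch rather than the method described in this entry. It assumes each video frame has already been reduced to a small static feature vector (here, hypothetical mouth width and height values) and appends first-order temporal differences, often called delta features, which are one common way to represent how features change over time. The function name `add_delta_features` and the sample measurements are illustrative assumptions.

```python
from typing import List, Sequence


def add_delta_features(frames: Sequence[Sequence[float]]) -> List[List[float]]:
    """Append first-order temporal differences to each per-frame feature vector.

    frames[t] is the static feature vector for video frame t (assumed here
    to hold mouth width and height). The delta at frame t is the central
    difference (frames[t+1] - frames[t-1]) / 2; at the sequence boundaries
    the first or last frame stands in for the missing neighbour.
    """
    n = len(frames)
    out: List[List[float]] = []
    for t in range(n):
        prev = frames[max(t - 1, 0)]
        nxt = frames[min(t + 1, n - 1)]
        delta = [(b - a) / 2.0 for a, b in zip(prev, nxt)]
        # Static features first, then their dynamics.
        out.append(list(frames[t]) + delta)
    return out


# Hypothetical (width, height) mouth measurements over four frames.
mouth_shape = [[40.0, 10.0], [42.0, 14.0], [44.0, 18.0], [44.0, 12.0]]
dynamic_features = add_delta_features(mouth_shape)
for row in dynamic_features:
    print(row)
```

A recognizer would then model sequences of such vectors for each speaker; the choice of model and of the underlying visual features is outside the scope of this sketch.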

Introduction

Speech carries information about identity, emotion, and location, as well as linguistic information, and plays a significant role in the development of human–computer interaction (HCI) systems, including speaker recognition systems. However, audio-only systems can perform...



Author information

Authors and Affiliations

Google Inc., New York, NY, USA
Petar S. Aleksic


Editor information

Editors and Affiliations

Center for Biometrics and Security Research, Chinese Academy of Sciences, Beijing, China
Stan Z. Li (Professor)
Departments of Computer Science & Engineering, Michigan State University, East Lansing, MI, USA
Anil Jain (Professor)

About this entry

Cite this entry

Aleksic, P.S. (2009). Lip Movement Recognition. In: Li, S.Z., Jain, A. (eds) Encyclopedia of Biometrics. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-73003-5_245

DOI: https://doi.org/10.1007/978-0-387-73003-5_245
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-73002-8
Online ISBN: 978-0-387-73003-5
eBook Packages: Computer Science; Reference Module Computer Science and Engineering
