Lip Movement Recognition

  • Reference work entry
Encyclopedia of Biometrics

Synonyms

Audio–visual-dynamic speaker recognition; Visual-dynamic speaker recognition

Definition

Lip movement recognition is a speaker recognition technique in which the identity of a speaker is determined or verified by exploiting the information contained in the dynamics of visual features extracted from the mouth region, that is, in how those features change over time. The visual features usually consist of appropriate representations of the mouth appearance and/or shape. This dynamic visual information can also be combined with acoustic information to improve the performance of audio-only speaker recognition systems and to increase their resilience to spoofing, giving rise to audio–visual-dynamic speaker recognition systems.
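To make the idea of "dynamics of visual features" concrete, the following is a minimal sketch and not the method of this entry. It assumes each video frame has already been reduced to a small vector of mouth-shape measurements (the width/height values below are made up), and it computes regression-based delta features of the kind commonly used in speech processing. The function names `delta_features` and `fuse_scores`, the window size, and the fusion weight are all hypothetical choices for illustration; the weighted-sum fusion is just one simple way audio and visual scores might be combined.

```python
from typing import List, Sequence


def delta_features(frames: Sequence[Sequence[float]], window: int = 2) -> List[List[float]]:
    """Regression-based deltas over a sequence of per-frame feature vectors.

    d_t = sum_{n=1..N} n * (c_{t+n} - c_{t-n}) / (2 * sum_{n=1..N} n^2)
    Frames beyond either end of the sequence are replaced by the edge frame.
    """
    if window < 1:
        raise ValueError("window must be >= 1")
    n_frames = len(frames)
    if n_frames == 0:
        return []
    denom = 2 * sum(n * n for n in range(1, window + 1))
    dim = len(frames[0])
    deltas = []
    for t in range(n_frames):
        row = []
        for k in range(dim):
            acc = 0.0
            for n in range(1, window + 1):
                ahead = frames[min(t + n, n_frames - 1)][k]   # clamp at last frame
                behind = frames[max(t - n, 0)][k]             # clamp at first frame
                acc += n * (ahead - behind)
            row.append(acc / denom)
        deltas.append(row)
    return deltas


def fuse_scores(audio_score: float, visual_score: float, audio_weight: float = 0.7) -> float:
    """Assumed late fusion: a weighted sum of per-modality match scores."""
    if not 0.0 <= audio_weight <= 1.0:
        raise ValueError("audio_weight must lie in [0, 1]")
    return audio_weight * audio_score + (1.0 - audio_weight) * visual_score


# Hypothetical per-frame [mouth width, mouth height] measurements.
mouth_shape = [[0.0, 0.0], [1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
dynamics = delta_features(mouth_shape, window=2)
```

In a recognizer, the delta vectors would typically be appended to the static features before modeling, so that the classifier sees both the mouth shape and how it is moving. The fusion weight would normally be tuned on held-out data rather than fixed.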

Introduction

Speech carries information about identity, emotion, and location, as well as linguistic information, and plays a significant role in the development of human-computer interaction (HCI) systems, including speaker recognition systems. However, audio-only systems can perform...

© 2009 Springer Science+Business Media, LLC

About this entry

Cite this entry

Aleksic, P.S. (2009). Lip Movement Recognition. In: Li, S.Z., Jain, A. (eds) Encyclopedia of Biometrics. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-73003-5_245
