
Combining Global and Local Classifiers for Lipreading

  • Conference paper
Affective Computing and Intelligent Interaction (ACII 2007)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 4738)

Abstract

Lipreading has become a hot research topic in recent years, since the visual information extracted from lip movement has been shown to improve the performance of automatic speech recognition (ASR) systems, especially in noisy environments [1]-[3], [5]. Two important issues arise in lipreading: 1) how to extract the most efficient features from lip image sequences, and 2) how to build lipreading models. This paper focuses mainly on choosing more efficient features for lipreading.
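As the title suggests, the paper combines a global classifier (operating on whole lip-region features) with a local classifier (operating on finer-grained features). A common way to realize such a combination is weighted late fusion of the two classifiers' per-class scores; the sketch below illustrates that general idea, with the fusion weight `alpha` and the score dictionaries being illustrative assumptions, not the paper's exact method.

```python
def fuse_scores(global_scores, local_scores, alpha=0.6):
    """Weighted late fusion of per-class scores from two classifiers.

    alpha weights the global classifier; (1 - alpha) weights the local one.
    Both inputs map class labels (e.g., visemes or words) to normalized scores.
    """
    classes = set(global_scores) | set(local_scores)
    return {c: alpha * global_scores.get(c, 0.0)
               + (1 - alpha) * local_scores.get(c, 0.0)
            for c in classes}


def predict(global_scores, local_scores, alpha=0.6):
    """Return the class with the highest fused score."""
    fused = fuse_scores(global_scores, local_scores, alpha)
    return max(fused, key=fused.get)


# Example: the two classifiers disagree; the fusion weight decides.
g = {"ba": 0.7, "fa": 0.3}   # global classifier favours "ba"
l = {"ba": 0.4, "fa": 0.6}   # local classifier favours "fa"
print(predict(g, l, alpha=0.6))  # prints "ba": the global view dominates
```

With `alpha` lowered below 0.5 the local classifier dominates instead, so the weight effectively encodes how much each level of description is trusted.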


References

  1. Morishima, S., Ogata, S., Murai, K., Nakamura, S.: Audio-visual speech translation with automatic lip synchronization and face tracking based on 3D head model. In: Proc. IEEE Int. Conf. Acoustics, Speech, and Signal Processing, vol. 2, pp. 2117–2120 (2002)


  2. Potamianos, G., Graf, H.P., Cosatto, E.: An image transform approach for HMM based automatic lipreading. In: Proc. Int. Conf. Image Process, Chicago, pp. 173–177 (1998)


  3. Dupont, S., Luettin, J.: Audio-visual speech modeling for continuous speech recognition. IEEE Trans. On Multimedia 2, 141–151 (2000)


  4. Shen, L., Bai, L.: Gabor feature based face recognition using kernel methods. In: Proc. IEEE Int. Conf. Automatic Face and Gesture Recognition (AFGR), pp. 170–176 (2004)


  5. Matthews, I., et al.: Extraction of Visual Features for Lipreading. IEEE Trans. on Pattern Analysis and Machine Intelligence 24(2) (2002)


  6. Duchnowski, P., et al.: Toward movement-invariant automatic lip-reading and speech recognition. In: Proc. Int. Conf. Acoust. Speech Signal Process., Detroit, pp. 109–111 (1995)


  7. Navon, D.: Forest before the trees: the precedence of global features in visual perception. Cognitive Psychology 9, 353–383 (1977)


  8. Biederman, I.: On the semantics of a glance at a scene. In: Kubovy, M., Pomerantz, J. (eds.) Perceptual organization, pp. 213–253. Erlbaum (1981)




Editor information

Ana C. R. Paiva, Rui Prada, Rosalind W. Picard


Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Zhang, S., Yao, H., Wan, Y., Wang, D. (2007). Combining Global and Local Classifiers for Lipreading. In: Paiva, A.C.R., Prada, R., Picard, R.W. (eds) Affective Computing and Intelligent Interaction. ACII 2007. Lecture Notes in Computer Science, vol 4738. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74889-2_73


  • DOI: https://doi.org/10.1007/978-3-540-74889-2_73

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-74888-5

  • Online ISBN: 978-3-540-74889-2

  • eBook Packages: Computer Science (R0)
