Lipreading Using Recurrent Neural Prediction Model

Tsunekawa, Takuya; Hotta, Kazuhiro; Takahashi, Haruhisa

doi:10.1007/978-3-540-30126-4_50


Part of the book series: Lecture Notes in Computer Science (LNCS, volume 3212)

Included in the following conference series: International Conference Image Analysis and Recognition


Abstract

We present a lipreading method based on the Recurrent Neural Prediction Model (RNPM). Like speech recognition, lipreading deals with time-series data, so many traditional methods use a Hidden Markov Model (HMM) as the classifier. In recent years, however, a speech recognition method using the RNPM has been proposed, and good results have been reported. Because lipreading has properties similar to those of speech recognition, the RNPM is expected to give good results for lipreading as well. The effectiveness of the proposed method is confirmed on 8 words captured from 5 persons. A comparison with HMM is also performed, and it confirms that comparable performance is obtained.
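The abstract does not describe how the RNPM classifies a sequence. As a rough illustration only, prediction-model classifiers are commonly built by training one next-frame predictor per word and assigning a test sequence to the word whose predictor gives the lowest prediction error. The sketch below assumes that scheme and, for brevity, substitutes a least-squares linear predictor for the recurrent network. The feature trajectories, the word labels `word_a` and `word_b`, and all function names are hypothetical and are not taken from the paper.

```python
import numpy as np

def fit_predictor(sequences):
    """Fit W by least squares so that [x[t], 1] @ W approximates x[t+1].

    Assumption: a linear map stands in for the recurrent network here.
    """
    X = np.vstack([s[:-1] for s in sequences])   # current frames
    Y = np.vstack([s[1:] for s in sequences])    # next frames
    X1 = np.hstack([X, np.ones((len(X), 1))])    # append a bias column
    W, *_ = np.linalg.lstsq(X1, Y, rcond=None)
    return W

def prediction_error(W, seq):
    """Mean squared one-step prediction error of predictor W on seq."""
    X1 = np.hstack([seq[:-1], np.ones((len(seq) - 1, 1))])
    return float(np.mean((X1 @ W - seq[1:]) ** 2))

def classify(models, seq):
    """Return the word whose predictor has the lowest error on seq."""
    return min(models, key=lambda word: prediction_error(models[word], seq))

def make_sequence(rate, rng, length=30):
    """Hypothetical 2-D feature trajectory: a rotation at a class-specific
    rate plus small Gaussian noise (toy data, not lip features)."""
    t = np.arange(length) * rate
    clean = np.stack([np.cos(t), np.sin(t)], axis=1)
    return clean + 0.01 * rng.standard_normal((length, 2))

rng = np.random.default_rng(0)
train = {
    "word_a": [make_sequence(0.2, rng) for _ in range(5)],
    "word_b": [make_sequence(0.6, rng) for _ in range(5)],
}
# One predictor per word, each trained only on that word's sequences.
models = {word: fit_predictor(seqs) for word, seqs in train.items()}
predicted = classify(models, make_sequence(0.6, rng))
print(predicted)
```

In this toy setup each word's dynamics are linear, so the linear stand-in is sufficient. Real lip-feature sequences would presumably need the nonlinear, recurrent predictor that the paper's title refers to.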



Author information

Authors and Affiliations

The University of Electro-Communications, 1-5-1 Chofugaoka, Chofu-shi, Tokyo, 182-8585, Japan
Takuya Tsunekawa, Kazuhiro Hotta & Haruhisa Takahashi


Editor information

Editors and Affiliations

FEUP - Faculdade de Engenharia, Universidade do Porto, Rua Dr. Roberto Frias, 4200-465, Porto, Portugal
Aurélio Campilho
Electrical and Computer Engineering Department, University of Waterloo,
Mohamed Kamel


About this paper

Cite this paper

Tsunekawa, T., Hotta, K., Takahashi, H. (2004). Lipreading Using Recurrent Neural Prediction Model. In: Campilho, A., Kamel, M. (eds) Image Analysis and Recognition. ICIAR 2004. Lecture Notes in Computer Science, vol 3212. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30126-4_50


DOI: https://doi.org/10.1007/978-3-540-30126-4_50
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23240-7
Online ISBN: 978-3-540-30126-4
eBook Packages: Springer Book Archive
