Towards a Visual Speech Learning System for the Deaf by Matching Dynamic Lip Shapes

Chen, Shizhi; Quintian, D. Michael; Tian, YingLi

doi:10.1007/978-3-642-31522-0_1

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7382)

Included in the following conference series:

International Conference on Computers for Handicapped Persons

Abstract

In this paper we propose a visual-based speech learning framework that assists deaf persons by comparing the lip movements of a student with those of an E-tutor in an intelligent tutoring system. The framework uses lip-reading technologies to determine whether a student has learned the correct pronunciation. Unlike conventional speech recognition systems, which usually recognize a speaker’s utterance, our speech learning framework uses visual information to recognize whether a student’s pronunciation is correct with respect to an instructor’s utterance. We propose a method that extracts dynamic shape difference features (DSDF) from lip shapes to recognize the pronunciation difference. Preliminary experimental results demonstrate the robustness and effectiveness of our approach on a database we collected, which contains multiple persons speaking a small number of selected words.
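
The abstract does not define how DSDF is computed, so the following is only a rough sketch of the general idea of comparing a student's lip-shape sequence against a tutor's. It assumes each video frame has already been reduced to a list of (x, y) lip landmarks; the normalization, the fixed-length resampling, and the function names (`normalize_shape`, `resample`, `shape_difference_features`) are illustrative assumptions, not the authors' method.

```python
from math import hypot

# Assumed input: a sequence is a list of frames, and each frame is a list of
# (x, y) lip landmark coordinates. Both sequences use the same landmark order.


def normalize_shape(shape):
    """Center a lip shape at its centroid and scale it to unit RMS radius."""
    n = len(shape)
    cx = sum(x for x, _ in shape) / n
    cy = sum(y for _, y in shape) / n
    centered = [(x - cx, y - cy) for x, y in shape]
    scale = (sum(x * x + y * y for x, y in centered) / n) ** 0.5
    if scale == 0:
        return centered
    return [(x / scale, y / scale) for x, y in centered]


def resample(seq, length):
    """Linearly interpolate a shape sequence to a fixed number of frames."""
    if len(seq) == 1:
        return [seq[0]] * length
    out = []
    for i in range(length):
        pos = i * (len(seq) - 1) / (length - 1) if length > 1 else 0.0
        lo = int(pos)
        hi = min(lo + 1, len(seq) - 1)
        w = pos - lo
        out.append([((1 - w) * ax + w * bx, (1 - w) * ay + w * by)
                    for (ax, ay), (bx, by) in zip(seq[lo], seq[hi])])
    return out


def shape_difference_features(student, tutor, length=10):
    """Mean landmark distance per frame between two aligned sequences.

    Illustrative stand-in for a shape-difference feature, not the paper's DSDF.
    """
    s = resample([normalize_shape(f) for f in student], length)
    t = resample([normalize_shape(f) for f in tutor], length)
    return [sum(hypot(sx - tx, sy - ty)
                for (sx, sy), (tx, ty) in zip(fs, ft)) / len(fs)
            for fs, ft in zip(s, t)]
```

Under these assumptions, a student sequence that differs from the tutor's only in position, size, or speaking rate yields features near zero, while a different mouth movement yields larger values in the frames where the shapes diverge. A classifier or threshold on such a feature vector would then decide whether the pronunciation matches; that step is left out here because the abstract does not describe it.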

Author information

Authors and Affiliations

Department of Electrical Engineering, The City College, City University of New York, 160 Convent Ave., New York, NY, 10031, USA
Shizhi Chen, D. Michael Quintian & YingLi Tian

Editor information

Editors and Affiliations

Institut Integriert Studieren, University of Linz, Altenbergerstraße 69, 4040, Linz, Austria
Klaus Miesenberger
University of San Francisco, 2130 Fulton St., 94117, San Francisco, CA, USA
Arthur Karshmer
Support Centre for Students with Special Needs, Masaryk University, Botanická 68A, 602 00, Brno, Czech Republic
Petr Penaz
Institute "integriert studieren", Vienna University of Technology, Favoritenstr. 11/029, 1040, Vienna, Austria
Wolfgang Zagler

Cite this paper

Chen, S., Quintian, D.M., Tian, Y. (2012). Towards a Visual Speech Learning System for the Deaf by Matching Dynamic Lip Shapes. In: Miesenberger, K., Karshmer, A., Penaz, P., Zagler, W. (eds) Computers Helping People with Special Needs. ICCHP 2012. Lecture Notes in Computer Science, vol 7382. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-31522-0_1

DOI: https://doi.org/10.1007/978-3-642-31522-0_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-31521-3
Online ISBN: 978-3-642-31522-0
eBook Packages: Computer Science, Computer Science (R0)
