Abstract
Many deaf people rely on lip reading as a primary form of communication. A viseme is a representational unit used to classify speech sounds in the visual domain; it describes the particular facial and oral positions and movements that accompany the voicing of phonemes. We present a design tool for creating correct speech visemes. The tool consists of five modules: one for creating phonemes, one for creating 3D speech visemes, one for facial expressions, one for synchronizing phonemes with visemes, and one for generating speech triphones. We test the correctness of the generated visemes on Slovak speech domains. This paper describes the developed tool.
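The five-module pipeline described above can be illustrated with a minimal sketch. This is a hypothetical Python illustration, not the paper's implementation: the mapping table and function names (`PHONEME_TO_VISEME`, `phonemes_to_visemes`, `make_triphones`) are assumptions chosen for clarity, and only reflect the general idea that several phonemes share one viseme and that triphone windows capture coarticulation context.

```python
# Hypothetical sketch of a phoneme -> viseme -> triphone pipeline.
# The viseme classes below are illustrative, not the paper's Slovak inventory.
PHONEME_TO_VISEME = {
    "p": "bilabial_closed",   # p, b, m look identical on the lips,
    "b": "bilabial_closed",   # so they collapse into one viseme class
    "m": "bilabial_closed",
    "f": "labiodental",
    "v": "labiodental",
    "a": "open_vowel",
    "o": "rounded_vowel",
}

def phonemes_to_visemes(phonemes):
    """Map each phoneme to its visual class; unknown phonemes fall back to neutral."""
    return [PHONEME_TO_VISEME.get(p, "neutral") for p in phonemes]

def make_triphones(phonemes):
    """Slide a window of three phonemes to capture coarticulation context."""
    return [tuple(phonemes[i:i + 3]) for i in range(len(phonemes) - 2)]

phones = ["m", "a", "m", "a"]
print(phonemes_to_visemes(phones))
# ['bilabial_closed', 'open_vowel', 'bilabial_closed', 'open_vowel']
print(make_triphones(phones))
# [('m', 'a', 'm'), ('a', 'm', 'a')]
```

In a full system, each viseme class would drive a 3D facial pose, and the triphone windows would select context-dependent transitions between poses during synchronization.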
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
Pajorová, E., Hluchý, L. (2012). Correct Speech Visemes as a Root of Total Communication Method for Deaf People. In: Jezic, G., Kusek, M., Nguyen, NT., Howlett, R.J., Jain, L.C. (eds) Agent and Multi-Agent Systems. Technologies and Applications. KES-AMSTA 2012. Lecture Notes in Computer Science(), vol 7327. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-30947-2_43
Print ISBN: 978-3-642-30946-5
Online ISBN: 978-3-642-30947-2