A Method for Predicting Words by Interpreting Labial Movements

  • Conference paper
Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9787)

Abstract

The study of lip movements is relevant to a range of real-world applications, from enhancing means of communication to medical uses. In this paper we describe a method we implemented to help Amyotrophic Lateral Sclerosis (ALS) patients communicate once the progression of the disease requires the patient to be intubated and the voice is lost.

The method uses several subsystems to carry out this complex task, and the results are promising. However, the method needs to be improved to make the system easier to use and more reliable in predicting the pronounced words.

Notes

  1. RGB is a color model in which each color is expressed as a triple giving the amounts of red, green and blue, each ranging from 0 to 255.

  2. YIQ is the color space used by the NTSC color TV system, which is used mainly in North America, Central America and Japan.

  3. A viseme is a generic facial image that can be used to describe a particular sound. It is the visual equivalent of a phoneme, or unit of sound, in spoken language. Using visemes, the hearing-impaired can view sounds visually, effectively “lip-reading” the entire human face.

  4. XT9 is a text prediction and correction system for mobile devices with full keyboards. It is the successor to T9, a popular predictive text algorithm for mobile phones with numeric keypads only.
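
As an illustration of notes 1 and 2, the following minimal sketch converts an 8-bit RGB triple to YIQ using Python's standard colorsys module. The helper name rgb255_to_yiq and the sample pixel value are assumptions made for this example only; the abstract does not state how the method itself uses these color spaces.

```python
import colorsys


def rgb255_to_yiq(r, g, b):
    """Convert an RGB triple with components in 0..255 to YIQ.

    colorsys works on floats in the range 0..1, so each component
    is scaled down before the conversion.
    """
    for value in (r, g, b):
        if not 0 <= value <= 255:
            raise ValueError("RGB components must lie in 0..255")
    return colorsys.rgb_to_yiq(r / 255.0, g / 255.0, b / 255.0)


if __name__ == "__main__":
    # Hypothetical reddish pixel, chosen only for illustration.
    y, i, q = rgb255_to_yiq(200, 80, 90)
    print(f"Y={y:.3f} I={i:.3f} Q={q:.3f}")
```

Y carries the luminance, while I and Q carry the chrominance, so a neutral gray or white pixel yields I and Q values close to zero.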

Author information

Corresponding author

Correspondence to Osvaldo Gervasi.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Gervasi, O., Magni, R., Ferri, M. (2016). A Method for Predicting Words by Interpreting Labial Movements. In: Gervasi, O., et al. Computational Science and Its Applications – ICCSA 2016. ICCSA 2016. Lecture Notes in Computer Science, vol 9787. Springer, Cham. https://doi.org/10.1007/978-3-319-42108-7_34

  • DOI: https://doi.org/10.1007/978-3-319-42108-7_34

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-42107-0

  • Online ISBN: 978-3-319-42108-7

  • eBook Packages: Computer Science, Computer Science (R0)