Abstract
Using automatic lip-reading technology to promote the social interaction and integration of hearing-impaired and dysphonic people is a promising application of artificial intelligence in healthcare and rehabilitation. Because of inaccurate mouth shapes and unclear articulation, hearing-impaired and dysphonic people cannot communicate as hearing people do. In this paper, a speech training system for hearing-impaired and dysphonic people is built on state-of-the-art automatic lip-reading technology that combines a convolutional neural network (CNN) with a recurrent neural network (RNN). The system trains speech skills by comparing the mouth shapes of hearing-impaired users with those of hearing speakers, and consists of four parts. First, we create a speech training database that stores the mouth shapes of hearing speakers together with the corresponding sign language vocabulary. Second, the system performs automatic lip reading with a hybrid neural network that couples MobileNet with a long short-term memory (LSTM) network. Third, the system retrieves from the database the correct lip shape matched to a sign language word and compares it with the user's lip shape. Finally, the system produces comparison data and a similarity rate based on the user's lip size, the lip-opening angle, and the differences between lip shapes, and presents a standard lip-reading sequence for learning and training. Hearing-impaired and dysphonic people can then analyse and correct their vocal lip shapes from the comparison results and train independently to improve their mouth shapes. In addition, the system can help hearing-impaired users learn correct pronunciation in combination with medical devices such as cochlear implants.
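The hybrid network described above runs per-frame CNN features (MobileNet) through an LSTM to model the lip motion over time. As a minimal illustration of the temporal half of that pipeline, the sketch below implements a single standard LSTM step in numpy and feeds it a toy sequence of frame-feature vectors; the dimensions and weights are placeholders, not the paper's configuration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One standard LSTM step.
    W: (4H, D) input weights, U: (4H, H) recurrent weights, b: (4H,) bias,
    with the four gate blocks stacked as [input, forget, output, candidate]."""
    H = h_prev.shape[0]
    z = W @ x + U @ h_prev + b
    i = sigmoid(z[:H])          # input gate
    f = sigmoid(z[H:2 * H])     # forget gate
    o = sigmoid(z[2 * H:3 * H]) # output gate
    g = np.tanh(z[3 * H:])      # candidate cell state
    c = f * c_prev + i * g      # updated cell state
    h = o * np.tanh(c)          # updated hidden state
    return h, c

# Toy sequence: one feature vector per video frame (stand-ins for MobileNet output).
rng = np.random.default_rng(0)
D, H, T = 8, 4, 5  # feature dim, hidden size, number of frames
W = rng.normal(0.0, 0.1, (4 * H, D))
U = rng.normal(0.0, 0.1, (4 * H, H))
b = np.zeros(4 * H)
h, c = np.zeros(H), np.zeros(H)
for x in rng.normal(size=(T, D)):
    h, c = lstm_step(x, h, c, W, U, b)
print(h.shape)  # (4,) - final hidden state summarising the frame sequence
```

In the full system, the final (or per-frame) hidden states would feed a classifier over the sign language vocabulary.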
Experiments show that the speech training system based on automatic lip-reading recognition can effectively correct the lip shapes of hearing-impaired individuals as they speak and improve their speech ability without assistance from others.
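The paper does not give the exact formula behind the similarity rate, but the description (lip size, lip-opening angle, shape differences) suggests a comparison of simple geometric descriptors extracted from mouth landmarks. The sketch below is one plausible realisation under assumed landmark indices (0 = left corner, 1 = right corner, 2 = top centre, 3 = bottom centre), not the authors' implementation.

```python
import numpy as np

def lip_metrics(landmarks):
    """Geometric lip-shape descriptors from 2D mouth landmarks.
    landmarks: (N, 2) array; assumes index 0 = left corner, 1 = right corner,
    2 = top centre, 3 = bottom centre of the lips."""
    left, right, top, bottom = landmarks[:4]
    width = np.linalg.norm(right - left)    # lip size (horizontal)
    height = np.linalg.norm(top - bottom)   # lip opening (vertical)
    # Opening angle measured at the left mouth corner.
    v1, v2 = top - left, bottom - left
    cos_a = v1 @ v2 / (np.linalg.norm(v1) * np.linalg.norm(v2))
    angle = np.degrees(np.arccos(np.clip(cos_a, -1.0, 1.0)))
    return np.array([width, height, angle])

def similarity_rate(learner, reference):
    """Average per-metric relative agreement, in [0, 1]."""
    a, b = lip_metrics(learner), lip_metrics(reference)
    return float(np.mean(1.0 - np.abs(a - b) / np.maximum(a, b)))

# Usage: compare a learner's mouth frame against the stored reference frame.
learner = np.array([[0.0, 1.0], [4.0, 1.0], [2.0, 2.0], [2.0, 0.0]])
reference = learner * 1.1  # same shape, mouth 10% larger
score = similarity_rate(learner, reference)
print(round(score, 2))  # 0.94 - angle matches, size differs slightly
```

A score near 1 indicates the learner's lip shape matches the reference; the per-metric differences tell the user whether to adjust lip width, opening, or angle.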
Acknowledgements
The research was partially supported by the National Natural Science Foundation of China (nos. 61571013 and 61971007) and the Beijing Natural Science Foundation (no. 4143061).
Copyright information
© 2020 The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Lu, Y., Yang, S., Xu, Z., Wang, J. (2020). Speech Training System for Hearing Impaired Individuals Based on Automatic Lip-Reading Recognition. In: Nunes, I. (eds) Advances in Human Factors and Systems Interaction. AHFE 2020. Advances in Intelligent Systems and Computing, vol 1207. Springer, Cham. https://doi.org/10.1007/978-3-030-51369-6_34
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-51368-9
Online ISBN: 978-3-030-51369-6