Speech-Input Speech-Output Communication for Dysarthric Speakers Using HMM-Based Speech Recognition and Adaptive Synthesis System

Dhanalakshmi, M.; Mariya Celin, T. A.; Nagarajan, T.; Vijayalakshmi, P.

doi:10.1007/s00034-017-0567-9

Published: 04 May 2017

Volume 37, pages 674–703, (2018)

Circuits, Systems, and Signal Processing

Abstract

Dysarthria is a motor speech disorder that impairs the ability to control and coordinate one or more articulators. A dysarthric speaker therefore has difficulty uttering certain speech sound units, and the resulting speech is poorly articulated, slurred, and often unintelligible. A speech-supportive system is needed to help such speakers communicate in social settings. The current work aims to develop such a system, with three objectives: (i) identifying the articulatory errors of each dysarthric speaker, (ii) developing a speech recognition system that corrects the errors in dysarthric speech by incorporating the findings of the first objective through a speaker-specific dictionary, and (iii) developing an HMM-based speaker-adaptive speech synthesis system that synthesizes the error-corrected text for each dysarthric speaker while retaining the speaker's identity. The articulatory errors of 10 dysarthric speakers from the Nemours dysarthric speech corpus are analysed and identified using an isolated-style phoneme recognition system trained on the TIMIT speech corpus, followed by an analysis based on the product of likelihood Gaussians. The estimated articulatory errors are incorporated into a phoneme recognition system using a speaker-specific dictionary and a bigram language model. The error-corrected text is then synthesized as speech, and the intelligibility and naturalness of the synthesized speech are evaluated using the mean opinion score. To further improve intelligibility, the speech rate of the synthesized speech is modified using the time-domain pitch-synchronous overlap-add (TD-PSOLA) technique. The results are encouraging, and the system is expected to be developed, in the near future, into a large-vocabulary speech assistive device running on a hand-held device.
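The role of the speaker-specific dictionary in objective (ii) can be illustrated with a minimal sketch. It is not the authors' implementation: the lexicon, the phoneme symbols, the substitution errors, and the function names `build_speaker_dictionary` and `correct` are all hypothetical, chosen only to show how a speaker's known articulatory errors could be expanded into pronunciation variants that map back to the intended word. The bigram language model mentioned in the abstract, which would help choose between competing words, is left out.

```python
from itertools import product

# Hypothetical canonical lexicon (ARPAbet-like symbols); not taken from the paper.
CANONICAL = {
    "sip": ("s", "ih", "p"),
    "bad": ("b", "ae", "d"),
    "shoe": ("sh", "uw"),
}

# Hypothetical articulatory errors for one speaker:
# intended phoneme -> phonemes the speaker tends to produce instead.
SPEAKER_ERRORS = {
    "s": {"t"},
    "sh": {"s"},
    "d": {"t"},
}


def build_speaker_dictionary(canonical, errors):
    """Map every plausible produced phoneme sequence back to the intended word(s)."""
    dictionary = {}
    for word, phones in canonical.items():
        # Each position is realised either correctly or as one of the
        # speaker's known substitutions for that phoneme.
        options = [sorted({p} | errors.get(p, set())) for p in phones]
        for variant in product(*options):
            dictionary.setdefault(variant, set()).add(word)
    return dictionary


def correct(recognised, dictionary):
    """Return the intended words for a recognised phoneme sequence (empty set if unknown)."""
    return dictionary.get(tuple(recognised), set())


speaker_dict = build_speaker_dictionary(CANONICAL, SPEAKER_ERRORS)
print(correct(["t", "ih", "p"], speaker_dict))  # erroneous /t/ for /s/ maps back to "sip"
print(correct(["b", "ae", "t"], speaker_dict))  # erroneous /t/ for /d/ maps back to "bad"
```

Because the variants are generated per speaker, the same recognised sequence can map to different words for different speakers. When one sequence maps to several words, a language model over word sequences is one way to resolve the ambiguity.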

Author information

Authors and Affiliations

Speech Lab, SSN College of Engineering, Old Mahabalipuram Road, Chennai, India
M. Dhanalakshmi, T. A. Mariya Celin, T. Nagarajan & P. Vijayalakshmi

Corresponding author

Correspondence to M. Dhanalakshmi.

About this article

Cite this article

Dhanalakshmi, M., Mariya Celin, T.A., Nagarajan, T. et al. Speech-Input Speech-Output Communication for Dysarthric Speakers Using HMM-Based Speech Recognition and Adaptive Synthesis System. Circuits Syst Signal Process 37, 674–703 (2018). https://doi.org/10.1007/s00034-017-0567-9

Received: 15 September 2016
Revised: 20 April 2017
Accepted: 22 April 2017
Published: 04 May 2017
Issue Date: February 2018
DOI: https://doi.org/10.1007/s00034-017-0567-9
