Speech-Input Speech-Output Communication for Dysarthric Speakers Using HMM-Based Speech Recognition and Adaptive Synthesis System

Circuits, Systems, and Signal Processing

Abstract

Dysarthria is a motor speech disorder that impairs a speaker's ability to control and coordinate one or more articulators. As a result, a dysarthric speaker has difficulty uttering certain speech sound units, and the resulting speech is poorly articulated, slurred, and often unintelligible. A speech supportive system is therefore needed to help such speakers communicate in social settings. The current work aims to develop such a system with three objectives: (i) identifying the articulatory errors of each dysarthric speaker; (ii) developing a speech recognition system that corrects the errors in dysarthric speech by incorporating those findings through a speaker-specific dictionary; and (iii) developing an HMM-based speaker-adaptive speech synthesis system that synthesizes the error-corrected text for each dysarthric speaker while retaining the speaker's identity. The articulatory errors of 10 dysarthric speakers from the Nemours dysarthric speech corpus are analysed and identified using an isolated-style phoneme recognition system trained on the TIMIT speech corpus, followed by product of likelihood Gaussian-based analysis. The estimated articulatory errors are incorporated into a phoneme recognition system using a speaker-specific dictionary and a bigram language model. The error-corrected text is then synthesized as speech, and the intelligibility and naturalness of the synthesized speech are evaluated using the mean opinion score. To further improve intelligibility, the speech rate of the synthesized speech is modified using the time-domain pitch synchronous overlap add (TD-PSOLA) technique. The results are encouraging, and the system is expected to be developed in the near future into a large-vocabulary speech assistive application on a hand-held device.
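The idea of a speaker-specific dictionary can be illustrated with a minimal sketch. It is not the authors' implementation: the toy lexicon `CANONICAL`, the substitution table `SPEAKER_ERRORS`, and every function name below are hypothetical, and the phoneme symbols are only loosely ARPAbet-like. The sketch assumes that a speaker's articulatory errors can be summarised as phoneme substitutions, expands each canonical pronunciation into the variants that speaker might produce, and maps a recognized phoneme sequence back to the intended word or words.

```python
from itertools import product

# Toy canonical lexicon (hypothetical entries, ARPAbet-like symbols).
CANONICAL = {
    "shoe": ["sh", "uw"],
    "sue": ["s", "uw"],
    "bat": ["b", "ae", "t"],
}

# Hypothetical articulatory errors for one speaker:
# intended phoneme -> phonemes the speaker tends to produce instead.
SPEAKER_ERRORS = {"sh": ["s"], "t": ["d"]}


def pronunciation_variants(phones, errors):
    """Yield the canonical pronunciation and every substitution variant."""
    options = [[p] + errors.get(p, []) for p in phones]
    for combo in product(*options):
        yield tuple(combo)


def build_speaker_dictionary(lexicon, errors):
    """Map each pronunciation variant to the words that may have been intended."""
    table = {}
    for word, phones in lexicon.items():
        for variant in pronunciation_variants(phones, errors):
            table.setdefault(variant, set()).add(word)
    return table


def decode(phones, table):
    """Return the candidate intended words for a recognized phoneme sequence."""
    return sorted(table.get(tuple(phones), set()))


table = build_speaker_dictionary(CANONICAL, SPEAKER_ERRORS)
print(decode(["b", "ae", "d"], table))  # ['bat']
print(decode(["s", "uw"], table))       # ['shoe', 'sue']
```

The second lookup returns two candidates because the substitution makes "shoe" and "sue" indistinguishable at the phoneme level. Ambiguities of this kind are where a language model, such as the bigram model mentioned in the abstract, would be needed to choose among candidates.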

Author information

Correspondence to M. Dhanalakshmi.

About this article

Cite this article

Dhanalakshmi, M., Mariya Celin, T.A., Nagarajan, T. et al. Speech-Input Speech-Output Communication for Dysarthric Speakers Using HMM-Based Speech Recognition and Adaptive Synthesis System. Circuits Syst Signal Process 37, 674–703 (2018). https://doi.org/10.1007/s00034-017-0567-9

  • DOI: https://doi.org/10.1007/s00034-017-0567-9
