Abstract
Dysarthria is a motor speech disorder that impairs the ability to control and coordinate one or more articulators. A dysarthric speaker therefore has difficulty uttering certain speech sound units and produces poorly articulated, slurred, and unintelligible speech. A speech supportive system is thus needed to help such speakers with the resulting social difficulties. The current work aims at developing such a system, with three objectives: (i) identifying the articulatory errors of each dysarthric speaker, (ii) developing a speech recognition system that corrects the errors in dysarthric speech by incorporating the findings of the first objective through a speaker-specific dictionary, and (iii) developing an HMM-based speaker-adaptive speech synthesis system that synthesizes the error-corrected text for each dysarthric speaker while retaining the speaker's identity. The articulatory errors of 10 dysarthric speakers from the Nemours dysarthric speech corpus are analysed and identified using an isolated-style phoneme recognition system trained on the TIMIT speech corpus, followed by an analysis based on the product of likelihood-Gaussians. The estimated articulatory errors are incorporated into a phoneme recognition system using a speaker-specific dictionary and a bigram language model. The error-corrected text is then synthesized as speech. The intelligibility and naturalness of the synthesized speech are evaluated using the mean opinion score. To further improve intelligibility, the speech rate of the synthesized speech is modified using the time-domain pitch-synchronous overlap-add (TD-PSOLA) technique. The results are encouraging, and the system is expected to be developed in the near future into a large-vocabulary speech assistive device that runs on a hand-held device.
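To make the idea of a speaker-specific dictionary concrete, the following is a minimal sketch and not the authors' implementation. It assumes that a speaker's articulatory errors can be summarized as phoneme substitutions, and it expands each canonical pronunciation into the variants that speaker might produce. The toy lexicon, the substitution table, and the function names `build_speaker_dictionary` and `correct` are all hypothetical.

```python
from itertools import product

# Hypothetical toy lexicon with ARPAbet-like phoneme symbols.
CANONICAL = {
    "bad": ["b", "ae", "d"],
    "sip": ["s", "ih", "p"],
    "tip": ["t", "ih", "p"],
}

# Hypothetical errors for one speaker:
# intended phoneme -> phonemes the speaker may produce instead.
SPEAKER_ERRORS = {
    "s": ["t"],
    "d": ["t"],
}


def build_speaker_dictionary(canonical, errors):
    """Map each plausible produced phoneme sequence to the intended word(s)."""
    dictionary = {}
    for word, phones in canonical.items():
        # Each position may be the intended phoneme or one of its substitutes.
        options = [[p] + errors.get(p, []) for p in phones]
        for variant in product(*options):
            dictionary.setdefault(variant, set()).add(word)
    return dictionary


def correct(recognized_phones, dictionary):
    """Return the candidate intended words for a recognized phoneme sequence."""
    return sorted(dictionary.get(tuple(recognized_phones), set()))


speaker_dict = build_speaker_dictionary(CANONICAL, SPEAKER_ERRORS)
print(correct(["b", "ae", "t"], speaker_dict))  # ['bad']
print(correct(["t", "ih", "p"], speaker_dict))  # ['sip', 'tip']
```

In this sketch the second lookup is ambiguous, because the speaker's substitution of /t/ for /s/ makes "sip" and "tip" sound alike. This is the kind of ambiguity that a language model, such as the bigram model mentioned in the abstract, could help resolve from word context.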
Cite this article
Dhanalakshmi, M., Mariya Celin, T.A., Nagarajan, T. et al. Speech-Input Speech-Output Communication for Dysarthric Speakers Using HMM-Based Speech Recognition and Adaptive Synthesis System. Circuits Syst Signal Process 37, 674–703 (2018). https://doi.org/10.1007/s00034-017-0567-9
DOI: https://doi.org/10.1007/s00034-017-0567-9