Using speech rhythm knowledge to improve dysarthric speech recognition

Selouani, S.-A.; Dahmani, H.; Amami, R.; Hamam, H.

doi:10.1007/s10772-011-9104-6

Published: 31 August 2011

Volume 15, pages 57–64 (2012)

International Journal of Speech Technology

Abstract

We introduce a new framework that uses rhythm knowledge to improve dysarthric speech recognition. The approach builds speaker-dependent (SD) recognizers according to the dysarthria severity level of each speaker. This severity level is determined by a hybrid classifier that combines class posterior distributions with a hierarchical structure of multilayer perceptrons. Rhythm-based features serve as the classifier's input parameters, since preliminary evidence from perceptual experiments suggests that rhythm disturbances may be a characteristic common to various types of dysarthria. Speaker-dependent dysarthric speech recognition is then performed using Hidden Markov Models (HMMs). The Nemours database of American dysarthric speakers is used throughout the experiments. The results show the relevance of rhythm metrics and the effectiveness of the proposed framework in improving dysarthric speech recognition performance.
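The abstract does not say which rhythm-based features are used. As an illustration only, the sketch below computes a few rhythm metrics that are widely used in the phonetics literature (%V, ΔV, ΔC, and the normalized pairwise variability index, nPVI) from lists of vocalic and consonantal interval durations. The function names, the input format (durations in seconds), and the choice of the population standard deviation are assumptions made for this example, not details taken from the article.

```python
from statistics import pstdev


def percent_v(vocalic, consonantal):
    """%V: percentage of total utterance duration that is vocalic."""
    total = sum(vocalic) + sum(consonantal)
    return 100.0 * sum(vocalic) / total


def delta(intervals):
    """Delta-V or Delta-C: standard deviation of interval durations.

    Population standard deviation is an assumption made for this sketch.
    """
    return pstdev(intervals)


def npvi(intervals):
    """Normalized pairwise variability index over successive intervals."""
    pairs = list(zip(intervals, intervals[1:]))
    # Each term is the absolute difference of two successive durations,
    # normalized by their mean.
    terms = [abs(a - b) / ((a + b) / 2.0) for a, b in pairs]
    return 100.0 * sum(terms) / len(terms)


def rhythm_features(vocalic, consonantal):
    """Collect the metrics into one feature dictionary (hypothetical layout)."""
    return {
        "percent_v": percent_v(vocalic, consonantal),
        "delta_v": delta(vocalic),
        "delta_c": delta(consonantal),
        "npvi_v": npvi(vocalic),
    }


if __name__ == "__main__":
    # Made-up interval durations in seconds, for demonstration only.
    vowels = [0.08, 0.15, 0.11, 0.20]
    consonants = [0.06, 0.09, 0.12]
    print(rhythm_features(vowels, consonants))
```

A feature vector of this kind could be passed to a severity classifier such as the one the abstract describes, but how the authors actually build their input parameters is not stated in this preview.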

Author information

Authors and Affiliations

Université de Moncton, Campus de Shippagan, Moncton, NB, Canada
S.-A. Selouani
INRS-EMT, Université du Québec, Montréal, QC, Canada
H. Dahmani
École ESPRIT, Tunis, Tunisia
R. Amami
Université de Moncton, Moncton, NB, Canada
H. Hamam

Corresponding author

Correspondence to S.-A. Selouani.

About this article

Cite this article

Selouani, SA., Dahmani, H., Amami, R. et al. Using speech rhythm knowledge to improve dysarthric speech recognition. Int J Speech Technol 15, 57–64 (2012). https://doi.org/10.1007/s10772-011-9104-6

Received: 15 June 2011
Accepted: 03 August 2011
Published: 31 August 2011
Issue Date: March 2012
DOI: https://doi.org/10.1007/s10772-011-9104-6
