Toward Explainable Automatic Classification of Children’s Speech Disorders

Shulga, Dima; Silber-Varod, Vered; Benson-Karai, Diamanta; Levi, Ofer; Vashdi, Elad; Lerner, Anat

doi:10.1007/978-3-030-60276-5_49

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12335)

Included in the following conference series:

International Conference on Speech and Computer

Abstract

Early and adequate diagnosis of speech disorders can improve the quality of treatment and thus treatment success rates. Acoustic analysis of the speech of children with speech disorders may aid therapists in the diagnostic process by identifying the acoustic characteristics that are unique to a specific disorder and that distinguish it from normal speech development. The purpose of this work is to investigate the feasibility of automatically detecting speech disorders from children’s voices. In this preliminary study, using a dataset of utterance recordings from 24 children whose mother tongue is Hebrew, we propose an automatic system that may help therapists assess speech accurately by providing a preliminary diagnosis and explainable insights into the model’s predictions. We built a serial, two-step network that is both powerful and possibly interpretable. The first step can model the complex relations between acoustic features and the speech disorder, while the second can shed light on the utterances that contribute most to the final classification. Our preliminary results focus on the broad spectrum of speech disorders. In future work, we plan to design a system that can detect childhood apraxia of speech (CAS) specifically and shed light on the differences between the speech of individuals with CAS and that of individuals with other speech disorders.
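
The serial, two-step idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the models, so the sketch assumes a simple logistic scorer as a stand-in for the first step and a linear combination of per-utterance scores for the second. All function names, utterance labels, feature values, and weights below are hypothetical and chosen only for illustration.

```python
import math


def sigmoid(x):
    """Logistic function mapping a real number to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def utterance_score(features, weights, bias):
    """Step 1 stand-in (assumption): score one utterance from its acoustic features.

    A real first step could be any expressive model; a logistic scorer is used
    here only to keep the sketch short and dependency-free.
    """
    z = sum(w * f for w, f in zip(weights, features)) + bias
    return sigmoid(z)


def child_prediction(utterance_scores, utterance_weights, bias):
    """Step 2 stand-in (assumption): linear model over per-utterance scores.

    Returns the child-level probability and each utterance's additive
    contribution, which is what makes this step inspectable.
    """
    contributions = {
        name: utterance_weights[name] * score
        for name, score in utterance_scores.items()
    }
    probability = sigmoid(sum(contributions.values()) + bias)
    return probability, contributions


# Hypothetical acoustic feature vectors for three utterances of one child.
FEATURES = {
    "utt_a": [1.0, 0.2],
    "utt_b": [0.1, 0.9],
    "utt_c": [2.0, 0.0],
}
# Hypothetical parameters; in practice these would be learned from data.
STEP1_WEIGHTS, STEP1_BIAS = [0.8, -0.5], 0.0
STEP2_WEIGHTS, STEP2_BIAS = {"utt_a": 0.5, "utt_b": 0.1, "utt_c": 2.0}, -1.5

scores = {
    name: utterance_score(vec, STEP1_WEIGHTS, STEP1_BIAS)
    for name, vec in FEATURES.items()
}
probability, contributions = child_prediction(scores, STEP2_WEIGHTS, STEP2_BIAS)

# Rank utterances by how much they push the child-level decision.
ranked = sorted(contributions.items(), key=lambda item: abs(item[1]), reverse=True)
print(f"child-level probability: {probability:.3f}")
for name, value in ranked:
    print(f"{name}: contribution {value:+.3f}")
```

Because the second step in this sketch is additive, sorting the contributions gives a direct ranking of which utterances influenced the child-level decision most. That is one simple way a therapist-facing explanation could be produced; the actual system may differ.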

Acknowledgements

This research was supported by grant 506442 (37183) from the Research Authority of The Open University of Israel for a study on “Analysis of acoustic and physiological signals to identify childhood apraxia of speech”. We are grateful to Daphna Amit for the segmentation and annotation of the recordings.

Author information

Authors and Affiliations

Mathematics and Computer Science Department, The Open University of Israel, Ra’anana, Israel
Dima Shulga, Diamanta Benson-Karai, Ofer Levi & Anat Lerner
Open Media and Information Lab (OMILab), The Open University of Israel, Ra’anana, Israel
Vered Silber-Varod
Yael Center, Alonei Abba, Israel
Elad Vashdi

Corresponding author

Correspondence to Vered Silber-Varod.

Editor information

Editors and Affiliations

St. Petersburg Institute for Informatics and Automation, Russian Academy of Sciences, St. Petersburg, Russia
Alexey Karpov
Institute for Applied and Mathematical Linguistics, Moscow State Linguistic University, Moscow, Russia
Rodmonga Potapova

About this paper

Cite this paper

Shulga, D., Silber-Varod, V., Benson-Karai, D., Levi, O., Vashdi, E., Lerner, A. (2020). Toward Explainable Automatic Classification of Children’s Speech Disorders. In: Karpov, A., Potapova, R. (eds) Speech and Computer. SPECOM 2020. Lecture Notes in Computer Science, vol 12335. Springer, Cham. https://doi.org/10.1007/978-3-030-60276-5_49

DOI: https://doi.org/10.1007/978-3-030-60276-5_49
Published: 29 September 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-60275-8
Online ISBN: 978-3-030-60276-5
eBook Packages: Computer Science, Computer Science (R0)
