Abstract
The accurate and automatic recognition of speech sound errors in children is crucial to facilitate the early detection and correction of any faulty phonological process in their early life. This paper addresses the problem of speech sound error classification in native Arabic children when they wrongly pronounce Arabic words containing the letter r (pronounced as /ra/). We identify whether the speech sound error occurs when the letter appears at the beginning, middle, or end of the words. To classify the spoken words, we represent the speech signal with Mel Frequency Cepstral Coefficients (MFCC) features and then train a probabilistic classifier. We evaluate the performance of our proposed approach using a real-world database consisting of speech recordings from native Arabic speaking children. The proposed method achieves a classification accuracy of 71.75%, 77.20%, and 74.06% on average, for speech sound error with Arabic words containing the letter r at the beginning, middle, and end of the words, respectively. These results are superior to those obtained with Hidden Markov Model: another state-of-the-art method on the same dataset.
Similar content being viewed by others
Notes
CDC (2018). Language and Speech Disorders in Children [online]. https://www.cdc.gov/ncbddd/childdevelopment/language-disorders.html [accessed 15.05.2019].
Owaida, Husen (2015). Speech sound acquisition and phonological error patterns in child speakers of Syrian Arabic: a normative study [online]. http://openaccess.city.ac.uk/15182/ [accessed 20.05.2019].
Tajweed Me (2012). Points of articulation [online]. https://tajweed.me/tag/points-of-articulation/ [accessed 15.05.2019].
References
Al-Anzi, F. S., & Abuzeina, D. (2017). The impact of phonological rules on Arabic speech recognition. International Journal of Speech Technology, 20, 715–723.
Al-nasheri, A., Muhammad, G., Alsulaiman, M., & Ali, Z. (2017). Investigation of voice pathology detection and classification on different frequency regions using correlation functions. Journal of Voice, 31(1), 3–15.
Ali, Z., Alsulaiman, M., Elamvazuthi, I., Muhammad, G., Mesallam, T. A., Farahat, M., et al. (2016). Voice pathology detection based on the modified voice contour and SVM. Biologically Inspired Cognitive Architectures, 15, 10–18.
Bader, S. (2009). Speech and language impairments of Arabic-speaking Jordanian children within natural phonology and phonology as human behaviour. Poznan Studies in Contemporary Linguistics, 45, 191–210.
Baghai-Ravary, L., & Beet, S. W. (2013). Automatic speech signal analysis for clinical diagnosis and assessment of speech disorders. New York, NY: Springer.
El-Gayyar, M. M., Ibrahim, A. S., & Wahed, M. (2016). Translation from Arabic speech to Arabic Sign language based on cloud computing. Egyptian Informatics Journal, 17, 295–303.
Embrechts, P. (2003). Modelling Dependence with Copulas and Applications to Risk Management. In F. Lindskog, A. McNeil, & S. Rachev (Eds.), Handbook of Heavy Tailed Distribution in Finance (pp. 329–384). Amsterdam: Elsevier.
Gad-Allah, H., Abd-Elraouf, S., Abou-Elsaad, T., & Abd-Elwahed, M. (2012). Identification of communication disorders among Egyptian Arabic-speaking nursery school children. Egyptian Journal of Ear, Nose, Throat and Allied Sciences, 13, 83–90.
Ganchev, T., Fakotakis, N., & Kokkinakis, G. (2005). Comparative evaluation of various MFCC implementations on the speaker verification task. In: International conference speech and computer. pp. 191–194.
Gupta, M. R., & Chen, Y. (2011). Theory and use of the EM algorithm. Foundations and Trends I Signal Processing, 4, 223–296.
Hai, J., & Joo, E.M. (2003). Improved linear predictive coding method for speech recognition. In: International conference on information, communications and signal processing. pp. 1614–1618.
Hammami, N., Bedda, M., Farah, N., & Mansouri, S. (2015). /r/-Letter disorder diagnosis (/r/-LDD): Arabic speech database development for automatic diagnosis of childhood speech disorders. In: IEEE conference on intelligent systems and computer vision; pp. 1–7.
Hanani, A., Attari, M., Farakhna, A., Hussein, M., Jomaa, A., & Taylor, S. (2016). Automatic identification of articulation disorders for Arabic children speakers. In: Workshop on child computer interaction. pp. 35–39.
Honig, F., Stemmer, G., Hacker, C., & Brugnara, F. (2005). Revising Perceptual Linear Prediction (PLP). In: European conference on speech communication and technology. pp. 2997–3000.
Ijitona, TB., Soraghan, JJ., Lowit, A., Di-Caterina, G., & Yue, H. (2017). Automatic detection of speech disorder in dysarthria using extended speech feature extraction and neural networks classification. In: International conference on intelligent signal processing. pp. 1–6.
Kim, M., Kim, Y., Yoo, J., Wang, J., & Kim, H. (2017). Regularized speaker adaptation of KL-HMM for dysarthric speech recognition. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 25(9), 1581–1591.
Lawal, I. A. (2017). Spoken character classification using abductive network. International Journal of Speech Technology, 20, 881–890.
Lawal I. A. (2019). Incremental SVM learnin: Review. In Sayed-Mouchaweh Moamar (Ed.), Learning from data streams in evolving environments: Methods and applications (pp. 279–296). New York: Springer.
Logan, B. (2000) Mel frequency cepstral coefficients for music modeling. In: International symposium on music information retrieval.
Nelsen, R. B. (2006). An introduction to copulas (2nd ed.). New York, NY: Springer.
Paliwal, K. K., Lyons, J. G., & Wójcicki, K. K. (2010). Preference for 20–40 ms window duration in speech analysis. In: International conference on signal processing and communication systems; pp. 1–4.
Rasmussen, C. E. (2000). The infinite gaussian mixture model. In S. A. Solla, T. K. Leen, & K. Müller (Eds.), Advances in neural information processing systems (pp. 554–560). Cambridge: MIT Press.
Sithara, A., Thomas, A., & Mathew, D. (2018). Study of MFCC and IHC feature extraction methods with probabilistic acoustic models for speaker biometric applications. In: International conference on advances in computing and communications. pp. 267–276.
Terbeh, N., Trigui, A., Maraoui, M., & Zrigui, M. (2016). Arabic speech analysis to identify factors posing pronunciation disorders and to assist learners with vocal disabilities. In: International conference on engineering and MIS; pp. 1–8.
Verde, L., De Pietro, G., & Sannino, G. (2018). Voice disorder identification by using machine learning techniques. IEEE Access, 6, 16246–16255.
von der Linden, W., Dose, V., & Toussaint (2014). Bayesian probability theory: Applications in the physical sciences (1st ed.). Cambridge: Cambridge University Press.
Wehr, H. (1979). Arabic-English Dictionary: The Hans Wehr Dictionary of Modern Written Arabic. Cowan J M. editor. 4th ed. Spoken Language Services Inc.
Wu, H., Soraghan, J., Lowit, A., & Di Caterina, G. (2018). A deep learning method for pathological voice detection using convolutional deep belief networks. In: Interspeech. pp. 446–450.
You, C. H., Li, H., & Lee, K. A. (2010). A GMM supervector approach to language recognition with adaptive relevance factor. In: 18th European signal processing conference. pp. 1993–1997.
Zhang, S., Liu, C., Yao, K., & Gong, Y. (2015). Deep neural support vector machines for speech recognition. In: International conference on acoustics, speech and signal processing (pp. 4275–4279).
Author information
Authors and Affiliations
Corresponding author
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations
Rights and permissions
About this article
Cite this article
Hammami, N., Lawal, I.A., Bedda, M. et al. Recognition of Arabic speech sound error in children. Int J Speech Technol 23, 705–711 (2020). https://doi.org/10.1007/s10772-020-09746-3
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s10772-020-09746-3