Recognition of Arabic speech sound error in children

International Journal of Speech Technology

Abstract

Accurate, automatic recognition of speech sound errors in children is crucial for detecting and correcting faulty phonological processes early in life. This paper addresses the classification of speech sound errors made by native Arabic-speaking children when they mispronounce Arabic words containing the letter r (pronounced as /ra/). We identify whether the error occurs when the letter appears at the beginning, middle, or end of a word. To classify the spoken words, we represent the speech signal with Mel-frequency cepstral coefficient (MFCC) features and then train a probabilistic classifier. We evaluate the proposed approach on a real-world database of speech recordings from native Arabic-speaking children. The method achieves average classification accuracies of 71.75%, 77.20%, and 74.06% for speech sound errors in words containing the letter r at the beginning, middle, and end, respectively. These results are superior to those obtained on the same dataset with a Hidden Markov Model, another state-of-the-art method.
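
The classification step described in the abstract can be illustrated with a minimal sketch. The abstract does not name the probabilistic classifier, so the code below substitutes a simple class-conditional Gaussian model with diagonal covariance as a stand-in; it is not the authors' method. The function names, the class labels "correct" and "error", and the two-dimensional toy vectors (standing in for MFCC-derived feature vectors) are all hypothetical.

```python
import math
from collections import defaultdict


def fit_gaussian_classes(features, labels, var_floor=1e-6):
    """Estimate a mean, diagonal variance, and log prior for each class."""
    grouped = defaultdict(list)
    for x, y in zip(features, labels):
        grouped[y].append(x)
    total = len(features)
    model = {}
    for label, rows in grouped.items():
        n, dim = len(rows), len(rows[0])
        mean = [sum(r[d] for r in rows) / n for d in range(dim)]
        # Floor the variance so a constant feature cannot cause division by zero.
        var = [max(sum((r[d] - mean[d]) ** 2 for r in rows) / n, var_floor)
               for d in range(dim)]
        model[label] = (mean, var, math.log(n / total))
    return model


def log_joint(x, mean, var, log_prior):
    """Log of p(x | class) * p(class) under a diagonal Gaussian."""
    log_lik = sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
                  for xi, m, v in zip(x, mean, var))
    return log_lik + log_prior


def classify(model, x):
    """Return the class with the highest log joint probability."""
    return max(model, key=lambda label: log_joint(x, *model[label]))


def accuracy(model, features, labels):
    """Fraction of samples whose predicted class matches the label."""
    hits = sum(classify(model, x) == y for x, y in zip(features, labels))
    return hits / len(labels)


# Made-up 2-D vectors standing in for MFCC-derived features (assumption).
train_x = [[1.0, 2.0], [1.2, 1.8], [0.8, 2.2],
           [3.0, 0.5], [3.2, 0.4], [2.8, 0.6]]
train_y = ["correct", "correct", "correct", "error", "error", "error"]
model = fit_gaussian_classes(train_x, train_y)
```

In a setup like the one the abstract describes, one such model could be trained separately for each position of the letter (beginning, middle, end), and `accuracy` would be computed on held-out recordings rather than on the training data.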

Corresponding author

Correspondence to Isah A. Lawal.

Cite this article

Hammami, N., Lawal, I.A., Bedda, M. et al. Recognition of Arabic speech sound error in children. Int J Speech Technol 23, 705–711 (2020). https://doi.org/10.1007/s10772-020-09746-3
