Arabic discourse analysis based on acoustic, prosodic and phonetic modeling: elocution evaluation, speech classification and pathological speech correction

International Journal of Speech Technology

Abstract

This work presents a complete study of pathological speech processing. It focuses principally on speech correction and on assisting learners of Arabic vocabulary. To this end, we follow five main phases. The first evaluates the produced speech by assigning a pronunciation level to each speaker according to their forced-alignment score. The second classifies the produced Arabic speech as healthy or pathological using two different models: a prosodic model based on elocution speed, and a phonetic model based on comparing a reference Probabilistic-Phonetic Model with a speaker model. Third, for each speech sequence classified as pathological, we localize the problematic phonemes that degrade pronunciation. We also distinguish two factors that can distort the produced acoustic signals: degraded speech can result from pathological problems, or it can be produced by non-native speakers of Arabic. To separate the two, we rely on forced-alignment scores. Fourth, we develop a new algorithm to correct pathological pronunciation, for which we opt for two different solutions: lexical and phonetic. The last phase is the design of an application that helps learners of Arabic vocabulary improve their pronunciation. The results achieved are encouraging: the evaluation and classification of the produced acoustic signals are satisfactory, and learners of Arabic vocabulary showed good improvement when using the developed application. Many applications involving voice signal processing systems and e-learning platforms can benefit from our proposal.
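The phonetic classification step compares a reference model with a speaker model. The preview does not specify how the Probabilistic-Phonetic Model is built, so the following is only a minimal sketch under stated assumptions: each model is taken to be a distribution over adjacent phoneme pairs, the distance is a simple sum of absolute differences, and the threshold value, the function names, and the transliterated phoneme symbols are all hypothetical.

```python
from collections import Counter


def bigram_model(sequences):
    """Relative frequency of each adjacent phoneme pair across all sequences.

    Assumption: a phoneme-bigram distribution stands in for the article's
    Probabilistic-Phonetic Model, whose exact definition is not given here.
    """
    counts = Counter()
    for seq in sequences:
        counts.update(zip(seq, seq[1:]))
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()} if total else {}


def model_distance(reference, speaker):
    """Sum of absolute probability differences over all observed pairs (0 to 2)."""
    pairs = set(reference) | set(speaker)
    return sum(abs(reference.get(p, 0.0) - speaker.get(p, 0.0)) for p in pairs)


def classify(reference, speaker, threshold=0.3):
    """Label a speaker model; the threshold is an arbitrary illustrative value."""
    if model_distance(reference, speaker) > threshold:
        return "pathological"
    return "healthy"


# Hypothetical phoneme sequences: the speaker replaces "k" with "t".
reference = bigram_model([["k", "a", "t", "a", "b", "a"]])
speaker = bigram_model([["t", "a", "t", "a", "b", "a"]])
label = classify(reference, speaker)
```

In practice the threshold would have to be tuned on labeled healthy and pathological recordings, and the pairs contributing most to the distance could serve as a starting point for localizing problematic phonemes.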



Acknowledgements

The editors and reviewers of the International Journal of Speech Technology are thanked for their criticism, remarks, and comments, which helped improve the quality of this paper. My supervisors, Mr. Mounir Zrigui and Mr. Mohsen Maraoui, are also thanked for their valuable support in completing this work.

Author information

Corresponding author

Correspondence to Mohsen Maraoui.

About this article

Cite this article

Maraoui, M., Terbeh, N. & Zrigui, M. Arabic discourse analysis based on acoustic, prosodic and phonetic modeling: elocution evaluation, speech classification and pathological speech correction. Int J Speech Technol 21, 1071–1090 (2018). https://doi.org/10.1007/s10772-018-09566-6

  • DOI: https://doi.org/10.1007/s10772-018-09566-6
