Abstract
This work presents a complete study of pathological speech processing. It focuses mainly on speech correction and on assisting learners of Arabic vocabulary. The study proceeds in five main phases. The first evaluates the produced speech by assigning each speaker a pronunciation level according to their forced alignment score. The second classifies the produced Arabic speech as healthy or pathological using two different models: a prosodic model based on elocution speed, and a phonetic model that compares a reference Probabilistic-Phonetic Model with a speaker model. Third, for each speech sequence classified as pathological, we localize the problematic phonemes that degrade pronunciation. We also distinguish two factors that can distort the produced acoustic signal: degraded speech may result from a pathology, or it may be produced by non-native speakers of Arabic. For this distinction, we rely on forced alignment scores. Fourth, we develop a new algorithm to correct pathological pronunciation, for which we adopt two different solutions: lexical and phonetic. The last phase is the design of an application that helps learners of Arabic vocabulary improve their pronunciation. The results are encouraging: the evaluation and classification of the produced acoustic signals are satisfactory, and learners of Arabic vocabulary showed good improvement when using the developed application. Many applications, such as voice signal processing systems and e-learning platforms, can benefit from our proposal.
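As an illustration of the phonetic modeling step, the comparison between a reference model and a speaker model can be sketched as follows. The abstract does not specify the form of the Probabilistic-Phonetic Model, so this sketch makes several assumptions that are not taken from the paper: each model is a table of phoneme-bigram relative frequencies, the two models are compared with a total variation distance, and the threshold values and function names are hypothetical.

```python
from collections import Counter, defaultdict


def phoneme_bigram_model(sequences):
    """Estimate relative frequencies of phoneme bigrams (assumed model form)."""
    counts = Counter()
    for seq in sequences:
        counts.update(zip(seq, seq[1:]))
    total = sum(counts.values())
    return {bigram: c / total for bigram, c in counts.items()} if total else {}


def model_distance(reference, speaker):
    """Total variation distance: 0 for identical models, 1 for disjoint ones."""
    keys = set(reference) | set(speaker)
    return 0.5 * sum(abs(reference.get(k, 0.0) - speaker.get(k, 0.0)) for k in keys)


def classify(reference, speaker, threshold=0.2):
    """Label a speaker model; the threshold is a hypothetical value."""
    return "pathological" if model_distance(reference, speaker) > threshold else "healthy"


def phoneme_mass(model):
    """Probability mass of the bigrams in which each phoneme takes part."""
    mass = defaultdict(float)
    for (first, second), p in model.items():
        for phoneme in {first, second}:
            mass[phoneme] += p
    return mass


def problematic_phonemes(reference, speaker, min_gap=0.05):
    """Phonemes the speaker under-produces relative to the reference model."""
    ref_mass, spk_mass = phoneme_mass(reference), phoneme_mass(speaker)
    return sorted(ph for ph, m in ref_mass.items() if m - spk_mass.get(ph, 0.0) >= min_gap)


# Toy example: the speaker replaces the phoneme "s" with "th".
reference = phoneme_bigram_model([["s", "a", "l", "a", "m"]])
speaker = phoneme_bigram_model([["th", "a", "l", "a", "m"]])
print(classify(reference, speaker))
print(problematic_phonemes(reference, speaker))
```

In this toy case the only bigram that differs is the one containing the substituted phoneme, so the per-phoneme mass comparison singles it out. A real system would estimate both models from aligned recordings rather than from hand-written phoneme lists.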











Acknowledgements
The editors and reviewers of the International Journal of Speech Technology are thanked for their criticism, remarks and comments, which helped improve the quality of this paper. My supervisors, Mr. Mounir Zrigui and Mr. Mohsen Maraoui, are also thanked for their valuable support in completing this work.
Cite this article
Maraoui, M., Terbeh, N. & Zrigui, M. Arabic discourse analysis based on acoustic, prosodic and phonetic modeling: elocution evaluation, speech classification and pathological speech correction. Int J Speech Technol 21, 1071–1090 (2018). https://doi.org/10.1007/s10772-018-09566-6
DOI: https://doi.org/10.1007/s10772-018-09566-6