Arabic discourse analysis based on acoustic, prosodic and phonetic modeling: elocution evaluation, speech classification and pathological speech correction

Maraoui, Mohsen; Terbeh, Naim; Zrigui, Mounir

doi:10.1007/s10772-018-09566-6

Published: 31 October 2018

International Journal of Speech Technology, Volume 21, pages 1071–1090 (2018)

Mohsen Maraoui²,
Naim Terbeh¹ &
Mounir Zrigui¹

Abstract

This work presents a complete study of pathological speech processing. It focuses mainly on speech correction and on assisting learners of Arabic vocabulary. The study proceeds in five main phases. The first evaluates the produced speech by assigning each speaker a pronunciation level according to their forced alignment score. The second classifies the produced Arabic speech as healthy or pathological using two different models: a prosodic model based on elocution speed, and a phonetic model that compares a reference Probabilistic-Phonetic Model with a speaker model. Third, for each speech sequence classified as pathological, we localize the problematic phonemes that degrade pronunciation. We also distinguish two factors that can distort the produced acoustic signal: degraded speech may result from a pathological problem, or it may be produced by non-native speakers of Arabic. To separate the two, we rely on forced alignment scores. Fourth, we develop a new algorithm to correct pathological pronunciation, for which we adopt two different solutions: lexical and phonetic. The last phase is the design of an application that helps learners of Arabic vocabulary improve their pronunciation. The results are encouraging: the evaluation and classification of the produced acoustic signals are satisfactory, and learners of Arabic vocabulary showed good improvement when using the developed application. Many applications, including voice signal processing systems and e-learning platforms, can benefit from our proposal.
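The healthy/pathological classification step summarized above can be illustrated with a minimal sketch. The abstract does not specify how the Probabilistic-Phonetic Model is built or compared, so everything below is an assumption made for illustration: the model is taken to be a phoneme-bigram frequency table, the comparison is a total variation distance, and the names `bigram_model`, `model_distance`, `elocution_speed` and `classify`, as well as the thresholds `max_distance` and `speed_range`, are hypothetical and not taken from the paper.

```python
from collections import Counter
from typing import Dict, List, Tuple

Bigram = Tuple[str, str]


def bigram_model(phonemes: List[str]) -> Dict[Bigram, float]:
    """Relative frequency of each adjacent phoneme pair (assumed model form)."""
    pairs = list(zip(phonemes, phonemes[1:]))
    if not pairs:
        return {}
    counts = Counter(pairs)
    return {pair: n / len(pairs) for pair, n in counts.items()}


def model_distance(reference: Dict[Bigram, float],
                   speaker: Dict[Bigram, float]) -> float:
    """Total variation distance: 0.0 for identical models, 1.0 for disjoint ones."""
    keys = set(reference) | set(speaker)
    return 0.5 * sum(abs(reference.get(k, 0.0) - speaker.get(k, 0.0)) for k in keys)


def elocution_speed(phoneme_count: int, duration_s: float) -> float:
    """Phonemes produced per second of speech."""
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    return phoneme_count / duration_s


def classify(reference_phonemes: List[str],
             speaker_phonemes: List[str],
             duration_s: float,
             max_distance: float = 0.3,          # hypothetical threshold
             speed_range: Tuple[float, float] = (8.0, 16.0)  # hypothetical range
             ) -> str:
    """Label speech 'pathological' if either the phonetic or prosodic check fails."""
    distance = model_distance(bigram_model(reference_phonemes),
                              bigram_model(speaker_phonemes))
    speed = elocution_speed(len(speaker_phonemes), duration_s)
    low, high = speed_range
    if distance > max_distance or not (low <= speed <= high):
        return "pathological"
    return "healthy"


if __name__ == "__main__":
    reference = ["k", "a", "t", "a", "b", "a"]
    speaker = ["k", "a", "s", "a", "b", "a"]  # /t/ realized as /s/
    print(classify(reference, reference, duration_s=0.5))
    print(classify(reference, speaker, duration_s=0.5))
```

In this sketch the two checks are combined with a simple OR; the paper may weight or combine the prosodic and phonetic models differently, and real thresholds would have to be estimated from a corpus of healthy speech.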

Acknowledgements

The editors and reviewers of the International Journal of Speech Technology are acknowledged for their criticism, remarks and comments, which helped improve the quality of this paper. My supervisors, Mr. Mounir Zrigui and Mr. Mohsen Maraoui, are also thanked for their valuable support in achieving this work.

Author information

Authors and Affiliations

LaTICE Laboratory, Monastir, Tunisia
Naim Terbeh & Mounir Zrigui
Algebra, Number Theory and Nonlinear Analysis Laboratory, Monastir, Tunisia
Mohsen Maraoui

Corresponding author

Correspondence to Mohsen Maraoui.

About this article

Cite this article

Maraoui, M., Terbeh, N. & Zrigui, M. Arabic discourse analysis based on acoustic, prosodic and phonetic modeling: elocution evaluation, speech classification and pathological speech correction. Int J Speech Technol 21, 1071–1090 (2018). https://doi.org/10.1007/s10772-018-09566-6

Received: 05 November 2017
Accepted: 10 September 2018
Published: 31 October 2018
Issue Date: 15 December 2018
DOI: https://doi.org/10.1007/s10772-018-09566-6
