An experiment of Moroccan dialect speech recognition in noisy environments using PocketSphinx

Published in: International Journal of Speech Technology

Abstract

In this study, we present an experimental framework for Moroccan dialect speech recognition under various additive noise conditions using the open-source toolkit PocketSphinx. We curated a corpus of the ten most commonly used greetings in the Moroccan dialect, extracted from telephone conversations, and recorded it with 60 speakers (30 male and 30 female). Each speaker uttered each expression three times, in both clean and noisy conditions. Feature extraction used Mel-Frequency Cepstral Coefficients (MFCC), and acoustic modeling was based on monophone Hidden Markov Models (HMM). While the system performs well in noise-free conditions, its recognition accuracy degrades noticeably in the presence of noise.
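The "additive noise conditions" described above are conventionally produced by scaling a noise signal so that the speech-to-noise power ratio hits a target SNR before mixing. The paper does not specify its mixing procedure, so the sketch below is a minimal, generic illustration of that standard technique; the `mix_at_snr` helper and the synthetic signals are our own illustrative assumptions, not code from the study.

```python
import numpy as np

def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so the speech-to-noise power ratio equals `snr_db`,
    then add it to `speech` (both 1-D float arrays of equal length)."""
    p_speech = np.mean(speech ** 2)
    p_noise = np.mean(noise ** 2)
    # Target noise power satisfies P_s / P_n' = 10^(snr_db / 10)
    scale = np.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    return speech + scale * noise

rng = np.random.default_rng(0)
# 1 s of a 440 Hz tone at 16 kHz stands in for a speech recording
speech = np.sin(2 * np.pi * 440 * np.arange(16000) / 16000)
noise = rng.standard_normal(16000)
noisy = mix_at_snr(speech, noise, snr_db=10)
```

In a full pipeline, the resulting noisy waveform would then be passed through MFCC extraction and decoded with the PocketSphinx acoustic models, repeating the mix at each SNR level under test.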



Data availability

The data supporting the findings of this study are available upon reasonable request. Researchers interested in accessing the data may contact the corresponding author, Abdelkbir Ouisaadane, for further information.


Funding

This study received no external funding; the authors did not receive financial support from any agency, organization, or sponsor.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Abdelkbir Ouisaadane.

Ethics declarations

Conflict of interest

On behalf of all authors, I state that there is no conflict of interest as defined by Springer.

Ethical approval

Ethical approval for this study is not applicable, as it does not involve human subjects, animal subjects, or any sensitive personal data.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Ouisaadane, A., Safi, S. & Frikel, M. An experiment of Moroccan dialect speech recognition in noisy environments using PocketSphinx. Int J Speech Technol 27, 329–339 (2024). https://doi.org/10.1007/s10772-024-10103-x


  • DOI: https://doi.org/10.1007/s10772-024-10103-x
