An experiment of Moroccan dialect speech recognition in noisy environments using PocketSphinx

Published in: International Journal of Speech Technology

Abstract

In this study, we present an experimental framework for Moroccan dialect speech recognition under various additive noise conditions using the open-source toolkit PocketSphinx. We curated a corpus of the ten most commonly used greetings in the Moroccan dialect, extracted from telephone conversations, and recorded it with 60 speakers (30 male and 30 female). Each speaker uttered each expression three times, in both clean and noisy conditions. Feature extraction used Mel-Frequency Cepstral Coefficients (MFCC), and acoustic modeling was based on monophone Hidden Markov Models (HMM). While the system performs well in noise-free conditions, its recognition accuracy degrades noticeably in the presence of noise.
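The "additive noise conditions" described above are conventionally produced by scaling a noise signal so that the speech-to-noise power ratio hits a target SNR before mixing. The paper does not specify its mixing procedure, so the sketch below is a minimal, generic illustration of that standard technique; the `mix_at_snr` helper and the synthetic signals are our own illustrative assumptions, not code from the study.

```python
import numpy as np

def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so the speech-to-noise power ratio equals `snr_db`,
    then add it to `speech` (both 1-D float arrays of equal length)."""
    p_speech = np.mean(speech ** 2)
    p_noise = np.mean(noise ** 2)
    # Target noise power satisfies P_s / P_n' = 10^(snr_db / 10)
    scale = np.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    return speech + scale * noise

rng = np.random.default_rng(0)
# 1 s of a 440 Hz tone at 16 kHz stands in for a speech recording
speech = np.sin(2 * np.pi * 440 * np.arange(16000) / 16000)
noise = rng.standard_normal(16000)
noisy = mix_at_snr(speech, noise, snr_db=10)
```

In a full pipeline, the resulting noisy waveform would then be passed through MFCC extraction and decoded with the PocketSphinx acoustic models, repeating the mix at each SNR level under test.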



Data availability

The data supporting the findings of this study are available upon reasonable request. Researchers interested in accessing the data may contact the corresponding author, Abdelkbir Ouisaadane, for further information.


Funding

This study received no external funding; the authors did not receive financial support from any agency, organization, or sponsor.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Abdelkbir Ouisaadane.

Ethics declarations

Conflict of interest

On behalf of all authors, I state that there is no conflict of interest as defined by Springer.

Ethical approval

Ethical approval for this study is not applicable, as it does not involve human subjects, animal subjects, or any sensitive personal data.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Ouisaadane, A., Safi, S. & Frikel, M. An experiment of Moroccan dialect speech recognition in noisy environments using PocketSphinx. Int J Speech Technol 27, 329–339 (2024). https://doi.org/10.1007/s10772-024-10103-x


  • DOI: https://doi.org/10.1007/s10772-024-10103-x
