Enhancement of spoken digits recognition for under-resourced languages: case of Algerian and Moroccan dialects

Lounnas, Khaled; Abbas, Mourad; Lichouri, Mohamed; Hamidi, Mohamed; Satori, Hassan; Teffahi, Hocine

doi:10.1007/s10772-022-09971-y

Published: 15 April 2022

International Journal of Speech Technology
Volume 25, pages 443–455 (2022)

Khaled Lounnas¹ (ORCID: orcid.org/0000-0003-2649-4419),
Mourad Abbas²,³,
Mohamed Lichouri³,
Mohamed Hamidi⁴,⁵,
Hassan Satori⁵ &
Hocine Teffahi¹

365 Accesses
5 Citations
Abstract

In this paper, we present a set of experiments aimed at improving the recognition of spoken digits for under-resourced dialects of the Maghreb region using a hybrid system. Previous work has shown that integrating a Dialect Identification module into an Automatic Speech Recognition (ASR) system is effective. To enable the ASR system to recognize digits spoken in different dialects, we trained our hybrid system on the Moroccan Berber Dialect (MBD), the Moroccan Arabic Dialect (MAD), and the Algerian Arabic Dialect (AAD), in addition to Modern Standard Arabic (MSA). We investigated five machine-learning classifiers and two deep learning approaches: the first is based on a Convolutional Neural Network (CNN), and the second uses two pre-trained residual networks (ResNet50 and ResNet101). The findings show that the CNN model outperforms the other proposed methods and consequently improves the performance of the spoken digit recognition system by 20% for both Algerian and Moroccan dialects.
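
The hybrid idea described in the abstract, a dialect identification step placed in front of the digit recognizer, can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the class name HybridDigitRecognizer, the two toy functions, and the two-number feature vector are all hypothetical placeholders standing in for trained models and real acoustic features.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Sequence, Tuple

# Placeholder type: a real system would use acoustic features of an utterance.
Features = Sequence[float]


@dataclass
class HybridDigitRecognizer:
    """Route an utterance to a dialect-specific digit recognizer (hypothetical)."""

    identify_dialect: Callable[[Features], str]
    recognizers: Dict[str, Callable[[Features], int]]

    def recognize(self, features: Features) -> Tuple[str, int]:
        # Step 1: predict the dialect label (e.g. "AAD", "MAD", "MBD", "MSA").
        dialect = self.identify_dialect(features)
        if dialect not in self.recognizers:
            raise KeyError(f"no recognizer registered for dialect {dialect!r}")
        # Step 2: decode the digit with the recognizer chosen for that dialect.
        return dialect, self.recognizers[dialect](features)


def toy_dialect_id(features: Features) -> str:
    # Toy rule standing in for a trained dialect classifier (assumption).
    return "AAD" if features[0] < 0 else "MAD"


def toy_digit_recognizer(features: Features) -> int:
    # Toy rule standing in for a trained digit recognizer (assumption).
    return int(round(features[1])) % 10


system = HybridDigitRecognizer(
    identify_dialect=toy_dialect_id,
    recognizers={"AAD": toy_digit_recognizer, "MAD": toy_digit_recognizer},
)
print(system.recognize([-0.5, 3.0]))
```

In this sketch the dialect label only selects which recognizer runs. Swapping the toy functions for trained models would not change the routing logic.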

Notes

Kabyle is an Algerian Berber dialect.
http://www.fon.hum.uva.nl/praat/.
https://www.audacityteam.org.
https://github.com/tyiannak/pyAudioAnalysis.
https://librosa.org/doc/latest/index.html.
https://github.com/mtobeiyf/audio-classification.
mix-sys-1, mix-sys-2, and mix-sys-3: acoustic and language models were built using mixtures of the (MAD, AAD), (MAD, AAD, MBD), and (MAD, AAD, MBD, MSA) corpora, respectively.

Author information

Authors and Affiliations

Laboratory of Spoken Communication and Signal Processing, USTHB, Algiers, Algeria
Khaled Lounnas & Hocine Teffahi
High Council for the Arabic Language, HCLA, Algiers, Algeria
Mourad Abbas
Computational Linguistics Department, CRSTDLA, Algiers, Algeria
Mourad Abbas & Mohamed Lichouri
Multimedia and Arts department, FLLA, UIT, Kenitra, Morocco
Mohamed Hamidi
LISAC, Department of Mathematics and Computer Science, FSDM, USMBA, Fes, Morocco
Mohamed Hamidi & Hassan Satori

Corresponding author

Correspondence to Khaled Lounnas.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Lounnas, K., Abbas, M., Lichouri, M. et al. Enhancement of spoken digits recognition for under-resourced languages: case of Algerian and Moroccan dialects. Int J Speech Technol 25, 443–455 (2022). https://doi.org/10.1007/s10772-022-09971-y

Received: 10 March 2021
Accepted: 27 March 2022
Published: 15 April 2022
Issue Date: June 2022
DOI: https://doi.org/10.1007/s10772-022-09971-y
