Skip to main content
Log in

An automatic speech recognition system for isolated Amazigh word using 1D & 2D CNN-LSTM architecture

  • Published:
International Journal of Speech Technology Aims and scope Submit manuscript

Abstract

The availability of automatic speech recognition systems is crucial in various domains such as communication, healthcare, security, education, etc. However, currently, the existing systems often favor dominant languages such as English, French, Arabic, or Asian languages, leaving under-resourced languages without the consideration they deserve. In this specific context, our work is focused on the Amazigh language, which is widely spoken in North Africa. Our primary objective is to develop an automatic speech recognition system specifically for isolated words, with a particular focus on the Tarifit dialect spoken in the Rif region of Northern Morocco. For the dataset construction, we considered 30 isolated words recorded from 80 diverse speakers, resulting in 2400 audio files. The collected corpus is characterized by its quantity, quality, and variety. Moreover, the dataset serves as a valuable resource for further research and development in the field, supporting the advancement of speech recognition technology for underrepresented languages. For the recognition system, we have chosen the most recent approach in the field of speech recognition, which is a combination of convolutional neural networks and LSTM (CNN-LSTM). For the test, we have evaluated two different architectural models: the 1D CNN LSTM and the 2D CNN LSTM. The experimental results demonstrate a remarkable accuracy rate of over 96% in recognizing spoken words utilizing the 2D CNN LSTM architecture.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6
Fig. 7
Fig. 8

Similar content being viewed by others

Data availability

The data used in this study are not publicly available due to restrictions imposed by the LSA Laboratory. However, the data can be made available by the corresponding author upon reasonable request.

References

Download references

Funding

No funding was received for conducting this study. The authors declare they have no financial interests.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Mohamed Daouad.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Daouad, M., Allah, F.A. & Dadi, E.W. An automatic speech recognition system for isolated Amazigh word using 1D & 2D CNN-LSTM architecture. Int J Speech Technol 26, 775–787 (2023). https://doi.org/10.1007/s10772-023-10054-9

Download citation

  • Received:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s10772-023-10054-9

Keywords

Navigation