Noise effect on Amazigh digits in speech recognition system

Zealouk, Ouissam; Satori, Hassan; Laaidi, Naouar; Hamidi, Mohamed; Satori, Khalid

doi:10.1007/s10772-020-09764-1

Published: 05 November 2020

Volume 23, pages 885–892 (2020)

International Journal of Speech Technology

Ouissam Zealouk¹,
Hassan Satori ORCID: orcid.org/0000-0002-7393-5726¹,
Naouar Laaidi¹,
Mohamed Hamidi¹ &
Khalid Satori¹

Abstract

Automatic Speech Recognition (ASR) for Amazigh speech, particularly Moroccan Tarifit-accented speech, is an under-researched area. This paper analyzes and evaluates recognition of the first ten Amazigh digits under noisy conditions from an ASR perspective, based on the Signal-to-Noise Ratio (SNR). Our testing experiments were performed under two types of noise and repeated with added environmental noise at various SNRs for each kind, ranging from 5 to 45 dB. A speaker-independent Amazigh speech recognition system is developed using formalisms such as Hidden Markov Models (HMMs) and Gaussian Mixture Models (GMMs). The experimental results show that performance degraded for all digits, to different degrees, and that recognition rates decreased less under car noise than under grinder noise, with differences of 2.84% and 8.42% at SNRs of 5 dB and 25 dB, respectively. We also observed that the most affected digits are those that contain the letter "S".
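To make the SNR-based test setup concrete, the sketch below shows one common way to add noise to a clean recording at a chosen SNR. It is an illustration only and is not taken from the paper: the function names (`power`, `mix_at_snr`, `measured_snr_db`), the synthetic sine "speech", the Gaussian noise, and the 10 dB step between 5 and 45 dB are all assumptions. The abstract states only the 5 to 45 dB range and the two noise types (car and grinder).

```python
import math
import random


def power(signal):
    """Mean squared amplitude of a sequence of samples."""
    return sum(x * x for x in signal) / len(signal)


def mix_at_snr(speech, noise, snr_db):
    """Return speech plus noise, with the noise scaled to the target SNR in dB."""
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[:len(speech)]
    p_speech = power(speech)
    p_noise = power(noise)
    if p_speech == 0 or p_noise == 0:
        raise ValueError("speech and noise must both have non-zero power")
    # SNR_dB = 10 * log10(P_speech / P_scaled_noise); solve for the noise gain.
    gain = math.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(speech, noise)]


def measured_snr_db(speech, noisy):
    """Recompute the SNR of a mixture, given the clean reference signal."""
    residual = [y - s for s, y in zip(speech, noisy)]
    return 10 * math.log10(power(speech) / power(residual))


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical stand-ins: a 440 Hz tone for speech, white noise for the
    # environment. Real experiments would load recorded digits and noise.
    speech = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
    noise = [rng.gauss(0.0, 1.0) for _ in range(16000)]
    for snr_db in range(5, 50, 10):  # assumed step; the source gives only the range
        noisy = mix_at_snr(speech, noise, snr_db)
        print(snr_db, round(measured_snr_db(speech, noisy), 2))
```

Scaling the noise rather than the speech keeps the speech level constant across conditions, so differences in recognition rate can be attributed to the noise level alone.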



Author information

Authors and Affiliations

Laboratory Computer Science, Image Processing and Numerical Analysis, Faculty of Sciences Dhar Mahraz, Sidi Mohammed Ben Abbdallah University, B.P. 1796, Fez, Morocco
Ouissam Zealouk, Hassan Satori, Naouar Laaidi, Mohamed Hamidi & Khalid Satori

Corresponding author

Correspondence to Hassan Satori.

About this article

Cite this article

Zealouk, O., Satori, H., Laaidi, N. et al. Noise effect on Amazigh digits in speech recognition system. Int J Speech Technol 23, 885–892 (2020). https://doi.org/10.1007/s10772-020-09764-1

Received: 28 January 2020
Accepted: 21 October 2020
Published: 05 November 2020
Issue Date: December 2020
DOI: https://doi.org/10.1007/s10772-020-09764-1
