Noise effect on Amazigh digits in speech recognition system

Published in: International Journal of Speech Technology

Abstract

Automatic speech recognition (ASR) for Amazigh speech, particularly Moroccan Tarifit-accented speech, is a little-researched area. This paper analyses and evaluates the first ten Amazigh digits under noisy conditions from an ASR perspective, based on the signal-to-noise ratio (SNR). The test experiments were performed with two types of added environmental noise, at SNR levels ranging from 5 to 45 dB for each noise type. Different formalisms, such as Hidden Markov Models (HMMs) and Gaussian Mixture Models (GMMs), are used to develop a speaker-independent Amazigh speech recognition system. The experimental results under noisy conditions show that performance degrades for all digits to varying degrees, and that recognition rates decrease less under the car-noise environment than under the grinder-noise conditions, with differences of 2.84% and 8.42% at SNRs of 5 dB and 25 dB, respectively. We also observed that the digits most affected are those containing the letter "S".
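The evaluation corrupts clean digit utterances with two kinds of environmental noise (car and grinder) at SNR levels from 5 to 45 dB. The sketch below illustrates how a noise recording can be scaled and mixed into clean speech at a target SNR; it is a minimal illustration only, not the authors' actual tooling, and the signals in the usage example are random placeholders standing in for real waveforms.

```python
# Minimal sketch: mix a noise recording into clean speech at a target SNR (dB).
# Assumes 1-D float waveforms at the same sampling rate.
import numpy as np

def mix_at_snr(speech: np.ndarray, noise: np.ndarray, snr_db: float) -> np.ndarray:
    """Scale `noise` so the speech-to-noise power ratio equals `snr_db`, then add it."""
    # Repeat or truncate the noise to cover the whole utterance.
    reps = int(np.ceil(len(speech) / len(noise)))
    noise = np.tile(noise, reps)[: len(speech)]

    # Average power of each signal.
    p_speech = np.mean(speech ** 2)
    p_noise = np.mean(noise ** 2)

    # Choose a gain so that 10 * log10(p_speech / p_noise_scaled) == snr_db.
    target_noise_power = p_speech / (10 ** (snr_db / 10))
    noise_scaled = noise * np.sqrt(target_noise_power / p_noise)

    return speech + noise_scaled

if __name__ == "__main__":
    # Placeholder signals; in practice these would be a clean digit utterance
    # and a car- or grinder-noise recording.
    rng = np.random.default_rng(0)
    clean = rng.standard_normal(16000)
    car_noise = rng.standard_normal(8000)
    for snr in range(5, 50, 10):  # 5, 15, 25, 35, 45 dB
        noisy = mix_at_snr(clean, car_noise, snr)
```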

Author information

Corresponding author

Correspondence to Hassan Satori.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Zealouk, O., Satori, H., Laaidi, N. et al. Noise effect on Amazigh digits in speech recognition system. Int J Speech Technol 23, 885–892 (2020). https://doi.org/10.1007/s10772-020-09764-1
