Methods for applying VAD in Kazakh speech recognition systems

International Journal of Speech Technology

Abstract

This article examines the voice activity detection (VAD) algorithm and its use in a Kazakh speech recognition system. The paper presents a mathematical model of VAD and methods for detecting voice data and the pauses between sentences, words, and individual sounds. The VAD algorithm is adapted to Kazakh speech recognition by taking into account the basic properties of the Kazakh language. Voice activity detection in Kazakh speech is investigated here for the first time. The results of the spectral analysis are shown graphically.
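To make the idea of detecting pauses concrete, here is a minimal sketch of one common VAD approach, short-time energy thresholding. It is not the mathematical model from the paper, which is not reproduced on this page. The function names, the frame length of 160 samples, the threshold ratio, and the assumption that the first few frames contain only background noise are all illustrative choices.

```python
import math


def frame_energies(samples, frame_len):
    """Mean squared amplitude of each non-overlapping frame."""
    return [
        sum(x * x for x in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def detect_speech(samples, frame_len=160, ratio=4.0, noise_frames=5):
    """Flag frames whose energy exceeds `ratio` times a noise floor.

    Assumption: the first `noise_frames` frames contain no speech,
    so their average energy serves as the noise-floor estimate.
    """
    energies = frame_energies(samples, frame_len)
    if not energies:
        return []
    lead = energies[:noise_frames]
    noise_floor = sum(lead) / len(lead)
    threshold = ratio * max(noise_floor, 1e-12)  # guard against an all-zero lead-in
    return [e > threshold for e in energies]


def find_pauses(flags, min_frames=3):
    """Return (start, end) frame ranges of non-speech, end exclusive."""
    runs, start = [], None
    for i, is_speech in enumerate(flags + [True]):  # sentinel closes a trailing pause
        if not is_speech and start is None:
            start = i
        elif is_speech and start is not None:
            if i - start >= min_frames:
                runs.append((start, i))
            start = None
    return runs


# Synthetic signal: quiet, loud, quiet (5 frames of 160 samples each).
quiet = [0.001 * math.sin(0.3 * n) for n in range(800)]
loud = [0.5 * math.sin(0.3 * n) for n in range(800)]
signal = quiet + loud + quiet

flags = detect_speech(signal)
pauses = find_pauses(flags)
```

Raising `min_frames` would restrict the output to longer pauses, such as those between sentences, while a small value also reports the short gaps between words or sounds. A practical detector would normally add overlapping frames and further features, for example zero-crossing rate or spectral measures.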


Author information

Corresponding author

Correspondence to Orken J. Mamyrbayev.

About this article

Cite this article

Kalimoldayev, M.N., Alimhan, K. & Mamyrbayev, O.J. Methods for applying VAD in Kazakh speech recognition systems. Int J Speech Technol 17, 199–204 (2014). https://doi.org/10.1007/s10772-013-9220-6

  • DOI: https://doi.org/10.1007/s10772-013-9220-6
