Abstract
Speech processing technology plays an increasingly important role in daily life and work, which places higher demands on its performance, especially in noisy environments. Because real acoustic environments are complex, denoising is of great practical significance. To improve speech denoising and increase the accuracy of speech recognition systems, this study used wavelet denoising to analyze the denoising requirements of a speech recognition system together with the hard and soft threshold functions, and proposed an improved wavelet threshold denoising algorithm. First, the signal was decomposed by wavelet transform using a selected basis function; next, the wavelet coefficients were denoised with the improved threshold function; finally, the denoised signal was reconstructed by the inverse transform. The denoising effect of the algorithm was then verified, and the results showed that it was effective on conventional speech signals. The algorithm was also applied in the speech recognition system to denoise noisy speech collected in a real environment, and high system self-assessment parameters were obtained. It is concluded that wavelet denoising is effective for speech denoising in speech recognition systems and can be put into practice.
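The three steps named in the abstract (decomposition, thresholding, reconstruction) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' algorithm: the abstract does not specify the wavelet basis, the threshold value, or the improved threshold function, so the sketch assumes a Haar wavelet, a universal threshold with a median-based noise estimate, and only the classical hard and soft threshold functions. All function names are hypothetical.

```python
import math


def haar_decompose(signal, levels):
    """Multi-level Haar DWT; returns (approximation, details), finest detail first."""
    if len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be divisible by 2**levels")
    approx = [float(v) for v in signal]
    details = []
    s = math.sqrt(2.0)
    for _ in range(levels):
        even, odd = approx[0::2], approx[1::2]
        details.append([(a - b) / s for a, b in zip(even, odd)])
        approx = [(a + b) / s for a, b in zip(even, odd)]
    return approx, details


def haar_reconstruct(approx, details):
    """Inverse of haar_decompose: rebuild the signal from coarse to fine."""
    s = math.sqrt(2.0)
    out = list(approx)
    for detail in reversed(details):
        rebuilt = []
        for a, d in zip(out, detail):
            rebuilt.append((a + d) / s)
            rebuilt.append((a - d) / s)
        out = rebuilt
    return out


def hard_threshold(w, t):
    """Keep a coefficient unchanged if it exceeds t in magnitude, else zero it."""
    return w if abs(w) > t else 0.0


def soft_threshold(w, t):
    """Zero small coefficients and shrink the rest toward zero by t."""
    return math.copysign(max(abs(w) - t, 0.0), w)


def denoise(signal, levels=3, threshold_fn=soft_threshold):
    """Decompose, threshold the detail coefficients, and reconstruct."""
    approx, details = haar_decompose(signal, levels)
    # Assumption: noise level estimated from the finest detail coefficients
    # (approximate median of absolute values divided by 0.6745).
    finest = sorted(abs(d) for d in details[0])
    sigma = finest[len(finest) // 2] / 0.6745
    # Assumption: universal threshold sigma * sqrt(2 ln N).
    t = sigma * math.sqrt(2.0 * math.log(len(signal)))
    shrunk = [[threshold_fn(d, t) for d in level] for level in details]
    return haar_reconstruct(approx, shrunk)
```

Calling `denoise(samples, levels=3, threshold_fn=hard_threshold)` switches to hard thresholding. An improved threshold function such as the one the study proposes would be passed in through the same `threshold_fn` parameter.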
About this article
Cite this article
Zhong, X., Dai, Y., Dai, Y. et al. Study on processing of wavelet speech denoising in speech recognition system. Int J Speech Technol 21, 563–569 (2018). https://doi.org/10.1007/s10772-018-9516-7
DOI: https://doi.org/10.1007/s10772-018-9516-7