Abstract
This paper studies endpoint detection for noisy speech. Because speech and noise look visibly different in a spectrogram, the paper adopts a spectrogram-based endpoint detection method. The technical difficulty of such methods is describing this intuitive visual difference with a mathematical quantity. Since autocorrelation coefficients are effective descriptors of texture features, the paper uses the autocorrelation function to describe the difference and proposes a column autocorrelation spectrogram detection method. The distribution of the spectrogram autocorrelation function is used to set the endpoint detection threshold for noisy speech, which locates the cut-off points between speech and noise. Because the paper uses a wideband spectrogram, which has poor frequency resolution, residual noise remains in the speech columns after autocorrelation spectrogram detection. To de-noise further in different frequency bands, the paper exploits the multi-resolution property of empirical mode decomposition (EMD): the noisy speech is decomposed into components at different frequency scales, and each component is further analyzed with the column autocorrelation spectrogram method. Experiments showed that the resulting noise reduction for noisy speech was satisfactory.
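The column autocorrelation idea described above can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not give the window length, the lags, or the thresholding rule, so the 4 ms Hamming window, the lags 1 to 3, the moving-average smoothing, and the midpoint threshold below are all assumptions, and the function names are hypothetical. The EMD stage is not covered.

```python
import numpy as np

def wideband_spectrogram(x, fs, win_ms=4.0):
    """Magnitude spectrogram with a short window (wideband: fine time, coarse frequency).

    Assumption: 4 ms Hamming window with 50% overlap; the paper's settings are unknown.
    Returns (spec, frame_times) with spec shaped (freq_bins, frames).
    """
    win_len = int(fs * win_ms / 1000)
    hop = win_len // 2
    window = np.hamming(win_len)
    n_frames = 1 + (len(x) - win_len) // hop
    frames = np.stack([x[i * hop:i * hop + win_len] * window for i in range(n_frames)])
    spec = np.abs(np.fft.rfft(frames, axis=1)).T
    frame_times = (np.arange(n_frames) * hop + win_len / 2) / fs
    return spec, frame_times

def column_autocorr_feature(spec, max_lag=3):
    """Mean normalized autocorrelation along frequency, one value per column (frame).

    Assumption: averaging lags 1..max_lag is an illustrative choice, not the paper's.
    """
    cols = spec - spec.mean(axis=0, keepdims=True)   # remove each column's mean
    energy = np.sum(cols ** 2, axis=0) + 1e-12       # lag-0 term, avoids division by zero
    lags = [np.sum(cols[k:] * cols[:-k], axis=0) / energy
            for k in range(1, max_lag + 1)]
    return np.mean(lags, axis=0)

def detect_speech_frames(feature, smooth=15):
    """Smooth the feature over time and threshold it at the midpoint of its range.

    Assumption: the midpoint rule stands in for the distribution-based threshold
    mentioned in the abstract.
    """
    kernel = np.ones(smooth) / smooth
    smoothed = np.convolve(feature, kernel, mode="same")
    threshold = 0.5 * (smoothed.min() + smoothed.max())
    return smoothed > threshold, threshold

# Synthetic demo: white noise throughout, a three-tone "speech-like" burst from 0.5 s to 1.0 s.
fs = 16000
rng = np.random.default_rng(0)
t = np.arange(int(1.5 * fs)) / fs
burst = (t >= 0.5) & (t < 1.0)
tones = (np.sin(2 * np.pi * 400 * t)
         + 0.7 * np.sin(2 * np.pi * 1200 * t)
         + 0.4 * np.sin(2 * np.pi * 2500 * t))
x = 0.1 * rng.standard_normal(t.size) + burst * tones

spec, frame_times = wideband_spectrogram(x, fs)
feature = column_autocorr_feature(spec)
mask, threshold = detect_speech_frames(feature)
```

In this sketch, columns that contain the structured burst have a smoother frequency profile than noise-only columns, so their autocorrelation along frequency is higher; the first and last `True` entries of `mask` would serve as the estimated endpoints. Real speech and other noise types may need different lags and a different threshold rule.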
This work is supported by Guangdong Provincial Science and Technology Projects (No. 2013B040401015).
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Zhang, J. (2016). Endpoint Detection and De-noising Method Based on Multi-resolution Spectrogram. In: Huang, DS., Jo, KH. (eds) Intelligent Computing Theories and Application. ICIC 2016. Lecture Notes in Computer Science, vol 9772. Springer, Cham. https://doi.org/10.1007/978-3-319-42294-7_34
DOI: https://doi.org/10.1007/978-3-319-42294-7_34
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-42293-0
Online ISBN: 978-3-319-42294-7
eBook Packages: Computer Science, Computer Science (R0)