Abstract
Endpoint detection plays a crucial role in speech recognition systems. An effective endpoint detection algorithm not only reduces processing time but also suppresses noise interference from silent segments. Traditional endpoint detection algorithms are mostly designed for noise-free environments and therefore suffer from weak noise immunity. To address low-SNR conditions, this paper proposes an improved endpoint detection algorithm based on improved spectral subtraction with multi-taper spectrum estimation and the energy-zero ratio. The algorithm first applies the improved spectral subtraction method with multi-taper spectrum estimation to reduce noise in the speech signal, and then performs endpoint detection using the energy-zero ratio. Experiments show that the proposed algorithm is more robust under different SNR conditions.
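The abstract does not give the formulas, so the following is only a minimal sketch of the second stage (energy-zero ratio endpoint detection), under stated assumptions. It uses one commonly seen definition, log energy divided by zero-crossing count, which may differ from the paper's. The function names, the constants `a` and `b`, the frame sizes, and the single fixed threshold are all hypothetical, and the multi-taper spectral subtraction stage is omitted entirely.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping frames (no padding)."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def energy_zero_ratio(x, frame_len=256, hop=128, a=2.0, b=1.0):
    """Per-frame energy-zero ratio (assumed definition, not the paper's)."""
    frames = frame_signal(x, frame_len, hop)
    # Short-time energy, compressed with a log; `a` is a hypothetical scale.
    energy = np.sum(frames ** 2, axis=1)
    log_energy = np.log10(1.0 + energy / a)
    # Zero-crossing count: number of sign changes between adjacent samples.
    signs = np.signbit(frames)
    zcr = np.sum(signs[:, 1:] != signs[:, :-1], axis=1)
    # `b` is a hypothetical offset that avoids division by zero.
    return log_energy / (zcr + b)

def detect_endpoints(ezr, threshold):
    """Return (first, last) frame index above threshold, or None."""
    idx = np.flatnonzero(ezr > threshold)
    if idx.size == 0:
        return None
    return int(idx[0]), int(idx[-1])
```

Speech frames tend to have high energy and relatively few zero crossings, while noise-only frames tend to have low energy and many crossings, so the ratio separates the two more sharply than either feature alone. A practical detector would likely estimate the threshold from leading noise frames and add minimum-duration rules rather than use a single fixed value.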
This paper was supported by the National Undergraduate Innovation Project (Grant No. 201710488004) and the Fund of Hubei Province Key Laboratory of Intelligent Information Processing and Real-time Industrial System (Grant Nos. znxx2018MS03 and znxx2018QN07).
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Bao, T., Li, Y., Xu, K., Wang, Y., Hu, W. (2018). An Improved Endpoint Detection Algorithm Based on Improved Spectral Subtraction with Multi-taper Spectrum and Energy-Zero Ratio. In: Huang, DS., Bevilacqua, V., Premaratne, P., Gupta, P. (eds) Intelligent Computing Theories and Application. ICIC 2018. Lecture Notes in Computer Science, vol 10954. Springer, Cham. https://doi.org/10.1007/978-3-319-95930-6_25
DOI: https://doi.org/10.1007/978-3-319-95930-6_25
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-95929-0
Online ISBN: 978-3-319-95930-6
eBook Packages: Computer Science, Computer Science (R0)