Robust Speech Endpoint Detection Based on Improved Adaptive Band-Partitioning Spectral Entropy

Li, Xin; Liu, Huaping; Zheng, Yu; Xu, Bolin

doi:10.1007/978-3-540-74769-7_5

Xin Li^1,2,3, Huaping Liu^3, Yu Zheng^4 & Bolin Xu^1

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4688)

Included in the following conference series:

International Conference on Life System Modeling and Simulation

Abstract

The performance of speech recognition systems often degrades in adverse environments, so accurate speech endpoint detection is very important for robust speech recognition. This paper proposes an improved adaptive band-partitioning spectral entropy algorithm for speech endpoint detection, which uses weighted power spectral subtraction to boost the signal-to-noise ratio (SNR) while preserving robustness. The idea of adaptive band-partitioning spectral entropy is to divide each frame into sub-bands, whose number is selected adaptively, and to calculate the spectral entropy of those sub-bands. Although this approach is robust, its accuracy degrades rapidly when the SNR is low. Weighted power spectral subtraction is therefore introduced to reduce the spectral effects of acoustically added noise in speech. Speech recognition experiments indicate that recognition accuracy improves in adverse environments.
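The two ingredients named in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not give the weighting rule or the way the number of sub-bands is adapted, so the over-subtraction weight `alpha`, the spectral `floor`, the fixed `num_bands` argument, and all function names are assumptions made for this example.

```python
import cmath
import math


def power_spectrum(frame):
    """Naive DFT power spectrum of one frame, bins 1..N/2 (DC bin skipped)."""
    n = len(frame)
    spectrum = []
    for k in range(1, n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def subtract_noise(power, noise_power, alpha=1.0, floor=0.01):
    """Power spectral subtraction with a weight on the noise estimate.

    `alpha` scales the noise estimate and `floor` keeps each bin positive.
    Both are illustrative assumptions, not values taken from the paper.
    """
    return [max(p - alpha * n, floor * p)
            for p, n in zip(power, noise_power)]


def band_spectral_entropy(power, num_bands):
    """Entropy (in nats) of the normalized energies of equal-width sub-bands.

    The paper selects the number of sub-bands adaptively; here it is simply
    passed in. Bins left over after integer division are ignored.
    """
    width = len(power) // num_bands
    energies = [sum(power[b * width:(b + 1) * width])
                for b in range(num_bands)]
    total = sum(energies)
    if total <= 0.0:
        return 0.0
    probs = [e / total for e in energies if e > 0.0]
    return -sum(p * math.log(p) for p in probs)
```

A frame whose energy is spread evenly over the sub-bands (noise-like) yields an entropy close to `math.log(num_bands)`, while a frame dominated by a few sub-bands (speech-like) yields a much lower value. An endpoint detector would compare the entropy of each noise-subtracted frame against a threshold, which is left out of this sketch.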

Author information

Authors and Affiliations

Department of Electronic and Information Engineering, Nanjing University, 210093 Nanjing, China
Xin Li & Bolin Xu
State Key Lab of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, 100080 Beijing, China
Xin Li
School of Electromechanical Engineering and Automation, Shanghai University, 200072 Shanghai, China
Xin Li & Huaping Liu
School of Computer Engineering and Science, Shanghai University, Shanghai 200072, China
Yu Zheng

Editor information

Kang Li, Minrui Fei, George William Irwin, Shiwei Ma

Cite this paper

Li, X., Liu, H., Zheng, Y., Xu, B. (2007). Robust Speech Endpoint Detection Based on Improved Adaptive Band-Partitioning Spectral Entropy. In: Li, K., Fei, M., Irwin, G.W., Ma, S. (eds) Bio-Inspired Computational Intelligence and Applications. LSMS 2007. Lecture Notes in Computer Science, vol 4688. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74769-7_5

DOI: https://doi.org/10.1007/978-3-540-74769-7_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-74768-0
Online ISBN: 978-3-540-74769-7
eBook Packages: Computer Science, Computer Science (R0)
