Abstract
The performance of speech recognition systems often degrades in adverse environments, so accurate speech endpoint detection is essential for robust speech recognition. This paper proposes an improved adaptive band-partitioning spectral entropy algorithm for speech endpoint detection that uses weighted power spectral subtraction to raise the signal-to-noise ratio (SNR) while preserving robustness. Adaptive band-partitioning spectral entropy divides each frame into sub-bands, the number of which is selected adaptively, and computes the spectral entropy over those sub-bands. Although this approach is robust, its accuracy degrades rapidly when the SNR is low. Weighted power spectral subtraction is therefore introduced to reduce the spectral effects of acoustically added noise in speech. Speech recognition experiments indicate that recognition accuracy improves in adverse environments.
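The two ingredients named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the over-subtraction weight `alpha`, the spectral floor `beta`, and the fixed `num_bands` argument are all assumptions made for illustration, and the adaptive selection of the number of sub-bands is not modeled because the abstract does not specify the rule. The functions operate on a precomputed power spectrum of one frame.

```python
import math


def spectral_subtract(power, noise_power, alpha=2.0, beta=0.01):
    """Weighted power spectral subtraction with a spectral floor.

    alpha (subtraction weight) and beta (floor factor) are illustrative
    values, not parameters taken from the paper.
    """
    return [max(p - alpha * n, beta * p) for p, n in zip(power, noise_power)]


def band_energies(power, num_bands):
    """Sum the power spectrum over num_bands equal-width sub-bands.

    Bins left over when len(power) is not a multiple of num_bands are dropped.
    """
    size = len(power) // num_bands
    return [sum(power[i * size:(i + 1) * size]) for i in range(num_bands)]


def band_spectral_entropy(power, num_bands):
    """Entropy of the normalized sub-band energies of one frame."""
    energies = band_energies(power, num_bands)
    total = sum(energies)
    if total <= 0.0:
        return 0.0  # silent frame: no distribution to measure
    probs = [e / total for e in energies]
    return -sum(p * math.log(p) for p in probs if p > 0.0)


# A flat (noise-like) spectrum spreads energy evenly across sub-bands,
# while a peaked (speech-like) spectrum concentrates it in a few.
flat = [1.0] * 8
peaked = [0.01] * 8
peaked[2] = 10.0

flat_entropy = band_spectral_entropy(flat, 4)
peaked_entropy = band_spectral_entropy(peaked, 4)
cleaned = spectral_subtract([4.0, 1.0], [1.0, 1.0])
```

In this sketch the flat spectrum reaches the maximum entropy for four sub-bands and the peaked spectrum scores lower, which is the contrast an entropy-based endpoint detector thresholds on. Subtracting a weighted noise estimate before computing the entropy is one way to keep that contrast visible at low SNR.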
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Li, X., Liu, H., Zheng, Y., Xu, B. (2007). Robust Speech Endpoint Detection Based on Improved Adaptive Band-Partitioning Spectral Entropy. In: Li, K., Fei, M., Irwin, G.W., Ma, S. (eds) Bio-Inspired Computational Intelligence and Applications. LSMS 2007. Lecture Notes in Computer Science, vol 4688. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74769-7_5
DOI: https://doi.org/10.1007/978-3-540-74769-7_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-74768-0
Online ISBN: 978-3-540-74769-7
eBook Packages: Computer Science, Computer Science (R0)