Robust Speech Endpoint Detection Based on Improved Adaptive Band-Partitioning Spectral Entropy

  • Conference paper
Bio-Inspired Computational Intelligence and Applications (LSMS 2007)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4688)

Abstract

The performance of speech recognition systems often degrades in adverse environments, so accurate speech endpoint detection is essential for robust recognition. This paper proposes an improved adaptive band-partitioning spectral entropy algorithm for speech endpoint detection, which uses weighted power spectral subtraction to raise the signal-to-noise ratio (SNR) while preserving robustness. Adaptive band-partitioning spectral entropy divides each frame into sub-bands, whose number is selected adaptively, and computes the spectral entropy over them. Although this approach is robust, its accuracy degrades rapidly when the SNR is low. Weighted power spectral subtraction is therefore introduced to reduce the spectral effects of acoustically added noise in speech. Speech recognition experiments indicate that recognition accuracy improves in adverse environments.
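The two ingredients named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no formulas, so the function names, the Hamming window, the over-subtraction weight `alpha`, the spectral `floor`, and the fixed `n_bands` are all assumptions made for illustration. In particular, the adaptive rule for choosing the number of sub-bands is not reproduced here; `n_bands` is simply a parameter.

```python
import numpy as np

def power_spectrum(frame, n_fft=256):
    """Power spectrum of one Hamming-windowed frame (non-negative frequencies)."""
    windowed = frame * np.hamming(len(frame))
    return np.abs(np.fft.rfft(windowed, n_fft)) ** 2

def weighted_spectral_subtraction(power, noise_power, alpha=2.0, floor=0.01):
    """Subtract a weighted noise estimate from a power spectrum.

    alpha (subtraction weight) and floor (fraction of the original power kept
    as a lower bound) are assumed values, not taken from the paper.
    """
    cleaned = power - alpha * noise_power
    return np.maximum(cleaned, floor * power)

def band_spectral_entropy(power, n_bands=8):
    """Entropy of the energy distribution over n_bands roughly equal sub-bands.

    n_bands is fixed here; the paper selects it adaptively by a rule that the
    abstract does not describe.
    """
    energy = np.array([band.sum() for band in np.array_split(power, n_bands)])
    prob = energy / (energy.sum() + 1e-12)   # normalise to a probability mass
    prob = prob[prob > 0]                    # skip empty bands (0 * log 0 := 0)
    return float(-(prob * np.log(prob)).sum())

# Usage: a pure tone concentrates energy in one sub-band (low entropy),
# while white noise spreads it across all sub-bands (high entropy).
rng = np.random.default_rng(0)
noise_frame = rng.standard_normal(256)
tone_frame = np.sin(2 * np.pi * 0.1 * np.arange(256))

h_noise = band_spectral_entropy(power_spectrum(noise_frame))
h_tone = band_spectral_entropy(power_spectrum(tone_frame))
```

In an endpoint detector built along these lines, the noise power would typically be estimated from frames assumed to be non-speech, subtracted from each frame's spectrum, and the resulting per-frame entropy compared against a threshold; those steps are left out because the abstract does not specify them.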



Editor information

Kang Li, Minrui Fei, George William Irwin, Shiwei Ma

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Li, X., Liu, H., Zheng, Y., Xu, B. (2007). Robust Speech Endpoint Detection Based on Improved Adaptive Band-Partitioning Spectral Entropy. In: Li, K., Fei, M., Irwin, G.W., Ma, S. (eds) Bio-Inspired Computational Intelligence and Applications. LSMS 2007. Lecture Notes in Computer Science, vol 4688. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74769-7_5

  • DOI: https://doi.org/10.1007/978-3-540-74769-7_5

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-74768-0

  • Online ISBN: 978-3-540-74769-7

  • eBook Packages: Computer Science, Computer Science (R0)
