Noisy Speech Endpoint Detection using Robust Feature

  • Conference paper
Biometric Authentication (BIOMET 2014)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 8897)

Abstract.

This paper proposes a new robust feature for speech endpoint detection. It combines the properties of the Modified Group Delay Spectrum (MGDS) and the Mean Delta (MD) approach to make endpoint detection more robust, and is named the Group Delay Mean Delta (GDMD) feature. The effectiveness of the proposed feature and three other features for trajectory-based endpoint detection is evaluated experimentally in a fixed-text, Dynamic Time Warping (DTW)-based speaker verification task with short phrases of telephone speech. The three other features are the Modified Teager Energy (MTE), the Energy-Entropy (EE) feature, and the MD feature. The experiments show that the GDMD feature gives the best endpoint detection performance of the four, measured by verification rate.
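To illustrate why endpoint accuracy matters in a DTW-based verifier, the following is a minimal sketch of textbook DTW matching, not the author's implementation. The abstract does not give the paper's feature vectors, local path constraints, normalisation, or decision threshold, so the function names `dtw_distance` and `verify`, the Euclidean local distance, the symmetric step pattern, and the `threshold` parameter are all assumptions made for illustration.

```python
import math


def dtw_distance(a, b):
    """Length-normalised DTW distance between two sequences of feature vectors.

    Assumptions (not from the paper): Euclidean local distance, the basic
    symmetric step pattern, and normalisation by len(a) + len(b).
    """
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("both sequences must be non-empty")
    # cost[i][j] = minimal accumulated distance aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # local frame-to-frame distance
            cost[i][j] = d + min(
                cost[i - 1][j],      # advance in a only
                cost[i][j - 1],      # advance in b only
                cost[i - 1][j - 1],  # advance in both
            )
    return cost[n][m] / (n + m)


def verify(test, reference, threshold):
    """Accept the claimed identity if the DTW distance is within threshold.

    `threshold` is a hypothetical tuning parameter for this sketch.
    """
    return dtw_distance(test, reference) <= threshold
```

Because this form of DTW forces the first and last frames of the two sequences to align, leading or trailing non-speech frames left in by a poor endpoint detector add directly to the accumulated distance. That is one reason a verification rate can serve as an indirect measure of endpoint detection quality.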

Author information

Corresponding author

Correspondence to Atanas Ouzounov.

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Ouzounov, A. (2014). Noisy Speech Endpoint Detection using Robust Feature. In: Cantoni, V., Dimov, D., Tistarelli, M. (eds) Biometric Authentication. BIOMET 2014. Lecture Notes in Computer Science, vol 8897. Springer, Cham. https://doi.org/10.1007/978-3-319-13386-7_9

  • DOI: https://doi.org/10.1007/978-3-319-13386-7_9

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-13385-0

  • Online ISBN: 978-3-319-13386-7

  • eBook Packages: Computer Science (R0)
