Noisy Speech Endpoint Detection using Robust Feature

Ouzounov, Atanas

doi:10.1007/978-3-319-13386-7_9

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 8897)

Included in the following conference series:

International Workshop on Biometric Authentication

Abstract.

This paper proposes a new robust feature for speech endpoint detection. It combines the properties of the Modified Group Delay Spectrum (MGDS) and the Mean Delta (MD) approach to make endpoint detection more robust. The feature is named the Group Delay Mean Delta (GDMD) feature. The effectiveness of the proposed feature and of three other features for trajectory-based endpoint detection is evaluated experimentally in a fixed-text, Dynamic Time Warping (DTW)-based speaker verification task with short phrases of telephone speech. The three other features are the Modified Teager Energy (MTE), the Energy-Entropy (EE) feature, and the MD feature. The experiments show that the GDMD feature gives the best endpoint detection performance in terms of verification rate.
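
To make the evaluation setting concrete, the following is a minimal sketch of a DTW-based accept/reject decision of the kind the abstract refers to. It is not the paper's implementation: the symmetric local path constraint, the Euclidean frame distance, the length normalization, and the names `dtw_distance`, `verify`, and `threshold` are all assumptions chosen for illustration.

```python
import math


def dtw_distance(test, template):
    """Accumulated DTW cost between two sequences of feature vectors.

    Assumes a basic symmetric step pattern (insertion, deletion, match)
    and Euclidean distance between frames; both are illustrative choices.
    """
    n, m = len(test), len(template)
    # cost[i][j] = best accumulated cost aligning test[:i] with template[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(test[i - 1], template[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # advance in test only
                cost[i][j - 1],      # advance in template only
                cost[i - 1][j - 1],  # advance in both
            )
    return cost[n][m]


def verify(test, template, threshold):
    """Accept the claimed speaker if the length-normalized DTW cost is small.

    The normalization by (len(test) + len(template)) and the threshold
    value are hypothetical; a real system would tune them on held-out data.
    """
    score = dtw_distance(test, template) / (len(test) + len(template))
    return score <= threshold
```

Because DTW aligns the whole detected utterance against the template, frames of leading or trailing noise left in by a poor endpoint detector add directly to the accumulated cost, which is why the verification rate can serve as an indirect measure of endpoint detection quality.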

Author information

Authors and Affiliations

Institute of Information and Communication Technologies, Sofia, Bulgaria
Atanas Ouzounov

Corresponding author

Correspondence to Atanas Ouzounov.

Editor information

Editors and Affiliations

University of Pavia, Pavia, Italy
Virginio Cantoni
Bulgarian Academy of Sciences, Sofia, Bulgaria
Dimo Dimov
University of Sassari, Alghero, Sassari, Italy
Massimo Tistarelli

About this paper

Cite this paper

Ouzounov, A. (2014). Noisy Speech Endpoint Detection using Robust Feature. In: Cantoni, V., Dimov, D., Tistarelli, M. (eds) Biometric Authentication. BIOMET 2014. Lecture Notes in Computer Science, vol 8897. Springer, Cham. https://doi.org/10.1007/978-3-319-13386-7_9

DOI: https://doi.org/10.1007/978-3-319-13386-7_9
Published: 30 November 2014
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-13385-0
Online ISBN: 978-3-319-13386-7
eBook Packages: Computer Science, Computer Science (R0)
