Abstract.
This paper proposes a new robust feature for speech endpoint detection. The feature, named Group Delay Mean Delta (GDMD), combines the properties of the Modified Group Delay Spectrum (MGDS) and the Mean Delta (MD) approach to obtain more robust endpoint detection. The effectiveness of the proposed feature and three other features for trajectory-based endpoint detection is evaluated experimentally in a fixed-text, Dynamic Time Warping (DTW)-based speaker verification task with short phrases of telephone speech. The three other features are Modified Teager Energy (MTE), Energy-Entropy (EE), and MD. The experimental results show that the GDMD feature achieves the best endpoint detection performance in terms of verification rate.
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Ouzounov, A. (2014). Noisy Speech Endpoint Detection using Robust Feature. In: Cantoni, V., Dimov, D., Tistarelli, M. (eds) Biometric Authentication. BIOMET 2014. Lecture Notes in Computer Science, vol 8897. Springer, Cham. https://doi.org/10.1007/978-3-319-13386-7_9
DOI: https://doi.org/10.1007/978-3-319-13386-7_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-13385-0
Online ISBN: 978-3-319-13386-7
eBook Packages: Computer Science, Computer Science (R0)