Voice Activity Detection Based on Discriminative Weight Training Incorporating a Spectral Flatness Measure

Kang, Sang-Ick; Chang, Joon-Hyuk

doi:10.1007/s00034-009-9141-4


Published: 29 December 2009

Circuits, Systems and Signal Processing, Volume 29, pages 183–194 (2010)


Abstract

In this paper, we present an approach to incorporate discriminative weight training into a statistical model-based voice activity detection (VAD) method. In our approach, the VAD decision rule is derived from the optimally weighted likelihood ratios (LRs) using a minimum classification error (MCE) method. An adaptive on-line means of selecting two kinds of weights based on a power spectral flatness measure (PSFM) is devised for performance improvement. The proposed approach is compared to conventional schemes under various noise conditions, and shows better performance.
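The decision rule outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several pieces are assumptions made for illustration only: the per-bin log likelihood ratio uses the Gaussian form common in statistical model-based VAD, the flatness measure is taken as the ratio of geometric to arithmetic mean of a power spectrum, and the names `weights_flat`, `weights_peaky`, `psfm_threshold`, and `eta` are hypothetical. The two weight vectors would come from MCE training, which is not shown here.

```python
import numpy as np


def spectral_flatness(power_spectrum, eps=1e-12):
    """Geometric mean over arithmetic mean of a power spectrum.

    Values near 1 indicate a flat (white-like) spectrum; values near 0
    indicate a peaky one. `eps` guards against log(0).
    """
    p = np.asarray(power_spectrum, dtype=float) + eps
    return float(np.exp(np.mean(np.log(p))) / np.mean(p))


def log_likelihood_ratios(gamma, xi):
    """Per-bin log LR under a Gaussian speech/noise model (assumed form).

    gamma: a posteriori SNR per frequency bin
    xi:    a priori SNR per frequency bin
    """
    gamma = np.asarray(gamma, dtype=float)
    xi = np.asarray(xi, dtype=float)
    return gamma * xi / (1.0 + xi) - np.log1p(xi)


def vad_decision(gamma, xi, power_spectrum, weights_flat, weights_peaky,
                 psfm_threshold=0.5, eta=0.0):
    """Weighted-LR speech/non-speech decision with flatness-based weight choice.

    The threshold values and the rule for picking a weight set are
    placeholders; trained weights are assumed to be supplied by the caller.
    """
    psfm = spectral_flatness(power_spectrum)
    # Pick one of the two weight sets according to the flatness of the frame.
    weights = weights_flat if psfm >= psfm_threshold else weights_peaky
    score = float(np.dot(weights, log_likelihood_ratios(gamma, xi)))
    return score > eta, score, psfm
```

With uniform weights the rule reduces to the usual average of the per-bin log likelihood ratios, so the trained weights can be read as a reweighting of frequency bins according to how well each one separates speech from noise.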



Author information

Authors and Affiliations

School of Electronic Engineering, Inha University, Incheon, 402-751, Korea
Sang-Ick Kang & Joon-Hyuk Chang


Corresponding author

Correspondence to Joon-Hyuk Chang.

Additional information

This work was supported by the MKE (Ministry of Knowledge Economy), Korea, under the ITRC (Information Technology Research Center) support program supervised by the IITA (Institute for Information Technology Advancement) (IITA-2008-C1090-0902-0010), and by the MKE and KOTEF through the Human Resource Training Project for Strategic Technology.


About this article

Cite this article

Kang, SI., Chang, JH. Voice Activity Detection Based on Discriminative Weight Training Incorporating a Spectral Flatness Measure. Circuits Syst Signal Process 29, 183–194 (2010). https://doi.org/10.1007/s00034-009-9141-4


Received: 13 March 2008
Revised: 10 February 2009
Published: 29 December 2009
Issue Date: April 2010
DOI: https://doi.org/10.1007/s00034-009-9141-4
