Voice Activity Detection Based on Discriminative Weight Training Incorporating a Spectral Flatness Measure

Circuits, Systems and Signal Processing

Abstract

In this paper, we present an approach to incorporate discriminative weight training into a statistical model-based voice activity detection (VAD) method. In our approach, the VAD decision rule is derived from the optimally weighted likelihood ratios (LRs) using a minimum classification error (MCE) method. An adaptive on-line means of selecting two kinds of weights based on a power spectral flatness measure (PSFM) is devised for performance improvement. The proposed approach is compared to conventional schemes under various noise conditions, and shows better performance.
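The abstract gives no formulas, so the following is only a minimal sketch of how a weighted-LR decision with flatness-based weight selection might look, not the authors' implementation. It assumes the per-bin log likelihood ratio of the Gaussian statistical model commonly used in this line of work, defines spectral flatness as the ratio of the geometric to the arithmetic mean of the power spectrum, and switches between two weight vectors with a fixed threshold. All function names, the threshold `flat_thr`, the decision threshold `eta`, and the weight values are hypothetical; in the paper the weights are trained with MCE, which is not reproduced here.

```python
import math


def spectral_flatness(power):
    """Geometric mean over arithmetic mean of a power spectrum (close to 1 = flat)."""
    eps = 1e-12  # guards against log(0) and division by zero
    log_mean = sum(math.log(p + eps) for p in power) / len(power)
    arith_mean = sum(power) / len(power)
    return math.exp(log_mean) / (arith_mean + eps)


def log_likelihood_ratios(gamma, xi):
    """Per-bin log LR under a Gaussian model (assumed form):
    gamma * xi / (1 + xi) - log(1 + xi),
    with gamma the a posteriori SNR and xi the a priori SNR."""
    return [g * x / (1.0 + x) - math.log1p(x) for g, x in zip(gamma, xi)]


def weighted_vad(gamma, xi, power, weights_flat, weights_peaky,
                 flat_thr=0.5, eta=0.0):
    """Pick one of two weight vectors from the flatness measure (hypothetical
    rule), then compare the weighted sum of log LRs with the threshold eta."""
    flat = spectral_flatness(power) >= flat_thr
    weights = weights_flat if flat else weights_peaky
    llrs = log_likelihood_ratios(gamma, xi)
    score = sum(w * l for w, l in zip(weights, llrs))
    return score > eta, score


# Illustrative values only: four frequency bins, placeholder weights.
gamma = [5.0, 5.0, 5.0, 5.0]
xi = [4.0, 4.0, 4.0, 4.0]
power = [1.0, 1.0, 1.0, 1.0]
uniform = [0.25, 0.25, 0.25, 0.25]
low_heavy = [0.4, 0.3, 0.2, 0.1]
is_speech, score = weighted_vad(gamma, xi, power, uniform, low_heavy)
```

With uniform weights the score reduces to the plain average of the per-bin log LRs, which is the unweighted baseline that discriminative weight training is meant to improve on.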

Author information

Corresponding author

Correspondence to Joon-Hyuk Chang.

Additional information

This work was supported by the MKE (Ministry of Knowledge Economy), Korea, under the ITRC (Information Technology Research Center) support program supervised by the IITA (Institute for Information Technology Advancement) (IITA-2008-C1090-0902-0010), and was financially supported by the MKE and KOTEF through the Human Resource Training Project for Strategic Technology.

About this article

Cite this article

Kang, SI., Chang, JH. Voice Activity Detection Based on Discriminative Weight Training Incorporating a Spectral Flatness Measure. Circuits Syst Signal Process 29, 183–194 (2010). https://doi.org/10.1007/s00034-009-9141-4

  • DOI: https://doi.org/10.1007/s00034-009-9141-4
