An Efficient VAD Based on a Hang-Over Scheme and a Likelihood Ratio Test

  • Conference paper
Computational and Ambient Intelligence (IWANN 2007)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4507)

Abstract

Emerging applications of wireless speech communication demand increasing levels of performance in noise-adverse environments, together with speech processing systems that have a high response rate. Such conditions are a serious obstacle to meeting these demands, so these systems often need a noise reduction algorithm working in combination with a precise voice activity detector (VAD). This paper presents a new VAD that improves the robustness of speech detection in noisy environments and the performance of speech recognition systems. The algorithm defines an optimum likelihood ratio test (LRT) involving multiple and correlated observations (MO) and assuming a jointly Gaussian probability density function (jGpdf). An analysis of the methodology for N = {2,3} shows the robustness of the proposed approach: the classification error is clearly reduced as the number of observations increases. The algorithm is also compared with other VAD methods, including the G.729, AMR and AFE standards and recently reported algorithms, and shows a sustained advantage in speech/non-speech detection accuracy and speech recognition performance.
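The general idea of a likelihood ratio test over correlated observations, followed by a hang-over stage, can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes N = 2 zero-mean observations per frame, hand-picked covariance matrices for the noise and speech hypotheses, and a simple fixed-length hang-over counter. The names `log_gauss2`, `log_lrt`, `vad_with_hangover`, and the parameters `threshold` and `hangover` are all hypothetical.

```python
import math


def log_gauss2(x, cov):
    """Log-density of a zero-mean bivariate Gaussian at x.

    cov is a symmetric 2x2 matrix given as ((a, b), (b, d)).
    """
    (a, b), (_, d) = cov
    det = a * d - b * b
    # Quadratic form x^T cov^{-1} x, using the closed-form 2x2 inverse.
    quad = (d * x[0] ** 2 - 2 * b * x[0] * x[1] + a * x[1] ** 2) / det
    return -0.5 * (quad + math.log(det)) - math.log(2 * math.pi)


def log_lrt(x, cov_noise, cov_speech):
    """Log-likelihood ratio: speech hypothesis over noise hypothesis."""
    return log_gauss2(x, cov_speech) - log_gauss2(x, cov_noise)


def vad_with_hangover(frames, cov_noise, cov_speech, threshold=0.0, hangover=2):
    """Per-frame speech/non-speech decisions with a fixed-length hang-over.

    Assumption: after the last frame whose log-LRT exceeds the threshold,
    the next `hangover` frames are still labelled as speech.
    """
    decisions = []
    remaining = 0
    for x in frames:
        if log_lrt(x, cov_noise, cov_speech) > threshold:
            remaining = hangover  # re-arm the hang-over counter
            decisions.append(True)
        elif remaining > 0:
            remaining -= 1  # extend the speech decision over weak frames
            decisions.append(True)
        else:
            decisions.append(False)
    return decisions


# Illustrative covariances (assumed): same correlation, higher power for speech.
COV_NOISE = ((1.0, 0.5), (0.5, 1.0))
COV_SPEECH = ((9.0, 4.5), (4.5, 9.0))

if __name__ == "__main__":
    frames = [(0, 0), (5, 5), (0, 0), (0, 0), (0, 0), (0, 0)]
    print(vad_with_hangover(frames, COV_NOISE, COV_SPEECH))
```

In this sketch the hang-over delays the speech-to-non-speech transition, so low-energy frames that immediately follow a detected speech frame are not clipped. The covariances, threshold, and hang-over length here are placeholders that would need to be estimated or tuned for real signals.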

Editor information

Francisco Sandoval, Alberto Prieto, Joan Cabestany, Manuel Graña

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Pernía, O., Górriz, J.M., Ramírez, J., Puntonet, C.G., Turias, I. (2007). An Efficient VAD Based on a Hang-Over Scheme and a Likelihood Ratio Test. In: Sandoval, F., Prieto, A., Cabestany, J., Graña, M. (eds) Computational and Ambient Intelligence. IWANN 2007. Lecture Notes in Computer Science, vol 4507. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73007-1_5

  • DOI: https://doi.org/10.1007/978-3-540-73007-1_5

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-73006-4

  • Online ISBN: 978-3-540-73007-1

  • eBook Packages: Computer Science, Computer Science (R0)
