VAD Based on Kernel Smoothed Function of EGARCH Models

Wireless Personal Communications

Abstract

An algorithm for a voice activity detector (VAD) is proposed. It is based on the exponential generalized autoregressive conditional heteroscedasticity (EGARCH) filter for generalized hyperbolic (GH) and Gaussian random variables, adaptive threshold values and autocorrelation coefficients. EGARCH models are a new variation of GARCH models used especially in economic time series. A speech signal is assumed to follow a GH distribution because the GH distribution has heavier tails than the Gaussian distribution (GD) and covers other heavy-tailed distributions such as the hyperbolic, skewed \(t\), variance gamma (VG), normal inverse Gaussian (NIG), Cauchy, Student’s \(t\) and Laplace distributions. The noise signal is assumed to be uncorrelated (white noise), although in general this is not necessary. In the proposed method, heteroscedasticity is modeled by EGARCH. A kernel smoothed function of conditional variances and autocorrelations generates the soft detection vector. Finally, hard detection is obtained by comparing the soft detection vector with an adaptive threshold value. The simulation results show that the proposed VAD is able to operate down to \(-5\) dB.
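The pipeline summarized in the abstract (conditional variance from EGARCH, kernel smoothing into a soft detection vector, comparison with an adaptive threshold) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the EGARCH(1,1) parameters are fixed by hand rather than estimated, the innovations are treated as Gaussian, the smoother is a Gaussian-kernel Nadaraya-Watson estimate over the time index, the autocorrelation term is omitted, and the threshold rule (`hard_decisions`, with its `margin` and `rate` arguments) is a hypothetical noise-floor tracker.

```python
import numpy as np

def egarch_log_variance(x, omega, alpha, theta, beta):
    """EGARCH(1,1) log-variance recursion (Nelson, 1991), Gaussian innovations."""
    mean_abs_z = np.sqrt(2.0 / np.pi)            # E|z| for z ~ N(0, 1)
    log_var = np.empty(len(x))
    log_var[0] = omega / (1.0 - beta)            # stationary mean of the log-variance
    for t in range(1, len(x)):
        z = x[t - 1] / np.exp(0.5 * log_var[t - 1])   # standardized residual
        log_var[t] = (omega + beta * log_var[t - 1]
                      + alpha * (abs(z) - mean_abs_z) + theta * z)
    return log_var

def kernel_smooth(values, bandwidth):
    """Nadaraya-Watson smoother over the time index with a Gaussian kernel.

    Assumes len(values) is at least the (truncated) kernel length.
    """
    half = int(np.ceil(4 * bandwidth))
    lags = np.arange(-half, half + 1)
    kernel = np.exp(-0.5 * (lags / bandwidth) ** 2)
    num = np.convolve(values, kernel, mode="same")
    den = np.convolve(np.ones(len(values)), kernel, mode="same")  # edge normalization
    return num / den

def hard_decisions(soft, init_len, margin, rate):
    """Hypothetical adaptive threshold: tracked noise floor plus a fixed margin.

    Assumes the first init_len samples contain noise only.
    """
    noise_level = soft[:init_len].mean()
    hard = np.zeros(len(soft), dtype=bool)
    for t, s in enumerate(soft):
        hard[t] = s > noise_level + margin
        if not hard[t]:                          # update the floor only on non-speech
            noise_level = (1.0 - rate) * noise_level + rate * s
    return hard

def demo(seed=0):
    """Synthetic test signal: quiet, loud, quiet white-noise segments."""
    rng = np.random.default_rng(seed)
    scale = np.concatenate([np.ones(500), 10.0 * np.ones(500), np.ones(500)])
    x = scale * rng.standard_normal(1500)
    log_var = egarch_log_variance(x, omega=0.0, alpha=0.3, theta=0.0, beta=0.9)
    soft = kernel_smooth(log_var, bandwidth=20.0)        # soft detection vector
    return hard_decisions(soft, init_len=100, margin=1.5, rate=0.01)
```

Calling `demo()` returns a boolean vector that should be mostly `True` inside the loud middle segment and mostly `False` elsewhere. The synthetic signal only exercises the mechanics; it says nothing about performance on real speech or at low signal-to-noise ratios.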

Acknowledgments

The authors would like to thank the Editor and the referee for careful reading and for their comments which greatly improved the paper.

Author information

Corresponding author

Correspondence to Saralees Nadarajah.

Appendix

1.1 Proof of (6)

Note that we can write

$$\begin{aligned} f_X \left(x\right)&= \int \limits ^{\infty }_0 f_{X \mid W = w} \left( x \mid w\right) f_W \left(w\right) dw \\&= \int \limits ^{\infty }_0 \frac{ 1}{ \sqrt{2\pi w}\eta } \exp \left\{ -\frac{ 1}{ 2} \frac{ \left(x-\mu -w\gamma \right)^2}{ w\eta ^2} \right\} f_W \left(w\right)dw \\&= \int \limits ^{\infty }_0 \frac{ \exp \left\{ \frac{ \gamma \left(x-\mu \right)}{ \eta ^2} \right\} }{ \sqrt{2\pi w}\eta } \exp \left\{ -\frac{ \left(x-\mu \right)^2}{ 2w \eta ^2}\right\} \exp \left\{ -\frac{ \gamma ^2 \eta ^{-2}}{ 2 / w} \right\} f_W \left(w\right)dw, \end{aligned}$$

where \(f_W (w)\) denotes the pdf of \(W\). Substituting the form for \(f_W (\cdot )\) from (5), we obtain

$$\begin{aligned} f_X \left(x\right)&= \frac{ 1}{ 2} \frac{ \psi ^{\lambda } \left(\sqrt{\chi \psi }\right)^{-\lambda } \exp \left\{ \frac{ \gamma \left(x-\mu \right)}{ \eta ^2}\right\} }{ \sqrt{2\pi }\eta K_{\lambda } \left(\sqrt{\chi \psi }\right)} \\&\quad \times \int \limits ^{\infty }_0 w^{\lambda - \frac{ 1}{ 2} - 1} \exp \left\{ -\frac{ \eta ^{-2} \left(x-\mu \right)^2+\chi }{ 2w} - \frac{ \gamma ^2 {\eta }^{-2}+\psi }{ 2/w} \right\} dw. \end{aligned}$$
(15)

Transforming \(w\) to

$$\begin{aligned} y=w \frac{ \sqrt{\gamma ^2 \eta ^{-2}+\psi }}{ \sqrt{\eta ^{-2} \left(x-\mu \right)^2+\chi }} \end{aligned}$$

reduces (15) to

$$\begin{aligned} f_X \left(x\right)&= \frac{ \left(\sqrt{\chi \psi }\right)^{-\lambda } {\psi }^{\lambda } \left[ \psi + \left(\frac{ \gamma }{ \eta }\right)^2 \right]^{\frac{1}{2}-\lambda }}{ \sqrt{2\pi {\eta }^2} K_{\lambda } \left(\sqrt{\chi \psi }\right)} \frac{ \exp \left\{ \frac{ \gamma \left(x-\mu \right)}{ \eta ^2} \right\} }{ \left[ \sqrt{\left( \eta ^{-2} \left(x-\mu \right)^2 + \chi \right) \left( \gamma ^2 \eta ^{-2} + \psi \right)} \right]^{\frac{ 1}{ 2}-\lambda }} \\&\quad \times \int \limits ^{\infty }_0 \frac{1}{2} y^{\lambda - \frac{ 1}{ 2} - 1} \exp \left\{ -\frac{ 1}{ 2} \sqrt{\left( \eta ^{-2} \left(x-\mu \right)^2 + \chi \right) \left( \gamma ^2 \eta ^{-2}+\psi \right)} \left[ \frac{ 1}{ y}+y\right]\right\} dy. \end{aligned}$$

The result follows by the definition of the modified Bessel function of the third kind. \(\square \)
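The algebra in this proof can be checked numerically. The sketch below is only an illustration, not part of the paper: it integrates the normal mean-variance mixture directly over \(w\) and compares the value with the closed form displayed just before the Bessel-function step. The function names, the parameter values in the usage note, and the choice to evaluate \(K_{\nu }\) by trapezoidal quadrature of its integral representation (after the substitution \(y = e^u\)) are all assumptions made for this example.

```python
import numpy as np

U = np.linspace(-15.0, 15.0, 60001)   # grid for u = log(w) or u = log(y)

def trapezoid(f, u):
    """Composite trapezoidal rule on the grid u."""
    return float(np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(u)))

def bessel_k(nu, z):
    """K_nu(z) = (1/2) int_0^inf y^(nu-1) exp(-(z/2)(y + 1/y)) dy, with y = e^u."""
    return 0.5 * trapezoid(np.exp(nu * U - z * np.cosh(U)), U)

def gig_pdf(w, lam, chi, psi):
    """GIG density in the (lambda, chi, psi) parametrization used in the appendix."""
    s = np.sqrt(chi * psi)
    norm = chi ** (-lam) * s ** lam / (2.0 * bessel_k(lam, s))
    return norm * w ** (lam - 1.0) * np.exp(-0.5 * (chi / w + psi * w))

def gh_pdf_mixture(x, lam, chi, psi, mu, eta, gamma):
    """f_X(x) as the integral of N(mu + w*gamma, w*eta^2) against the GIG density."""
    w = np.exp(U)
    normal = (np.exp(-0.5 * (x - mu - w * gamma) ** 2 / (w * eta ** 2))
              / (np.sqrt(2.0 * np.pi * w) * eta))
    return trapezoid(normal * gig_pdf(w, lam, chi, psi) * w, U)   # dw = w du

def gh_pdf_closed(x, lam, chi, psi, mu, eta, gamma):
    """Closed form obtained after the change of variable from w to y."""
    a = (x - mu) ** 2 / eta ** 2 + chi
    b = gamma ** 2 / eta ** 2 + psi
    root = np.sqrt(a * b)
    const = (np.sqrt(chi * psi) ** (-lam) * psi ** lam * b ** (0.5 - lam)
             / (np.sqrt(2.0 * np.pi) * eta * bessel_k(lam, np.sqrt(chi * psi))))
    return (const * np.exp(gamma * (x - mu) / eta ** 2)
            * bessel_k(lam - 0.5, root) / root ** (0.5 - lam))
```

For example, with the arbitrary values `lam=-0.5, chi=1.0, psi=2.0, mu=0.1, eta=1.3, gamma=0.4`, the two functions should agree to within quadrature error at any `x`.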

1.2 Mean and variance of the GH distribution

If \(W\) is a GIG random variable then

$$\begin{aligned} \mathbb E \left( W^{\alpha }\right)&= \frac{ \chi ^{-\lambda } \left(\sqrt{\chi \psi }\right)^{\lambda }}{ K_{\lambda } \left(\sqrt{\chi \psi }\right)} \int \limits ^\infty _0 \frac{ 1}{ 2} w^{\lambda +\alpha - 1} \exp \left\{ -\frac{ 1}{ 2}\left(\chi w^{-1}+\psi w\right) \right\} dw \\&= \left( \frac{ \chi }{ \psi }\right)^{\alpha / 2} \frac{ K_{\lambda +\alpha } \left(\sqrt{\chi \psi } \right)}{ K_{\lambda } \left(\sqrt{\chi \psi }\right)} \\&\quad \times \int \limits ^{\infty }_0 \underbrace{\frac{ \chi ^{-\left(\lambda +\alpha \right)} \left(\sqrt{\chi \psi }\right)^{\lambda +\alpha }}{ 2K_{\lambda +\alpha } \left(\sqrt{\chi \psi }\right)} w^{\lambda +\alpha -1} \exp \left\{ -\frac{ 1}{ 2} \left(\chi w^{-1}+\psi w\right) \right\} }_{f_W \left(w \right)} dw \\&= \left( \frac{ \chi }{ \psi }\right)^{\alpha / 2} \frac{ K_{\lambda +\alpha } \left(\sqrt{\chi \psi } \right)}{ K_{\lambda } \left(\sqrt{\chi \psi }\right)}. \end{aligned}$$

In addition, \(\mathbb E (X) = \mu + \mathbb E (W) \gamma \) and \(\mathrm{Var} (X) = \eta ^2 \mathbb E (W) + \gamma ^2 \mathrm{Var} (W)\). \(\square \)
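As a rough numerical illustration of these formulas, the sketch below compares the closed-form GIG moment with a direct quadrature of \(w^{\alpha } f_W (w)\) and then assembles the GH mean and variance from the first two moments. The function names and the quadrature-based evaluation of \(K_{\nu }\) are assumptions made for this example and are not taken from the paper.

```python
import numpy as np

U = np.linspace(-15.0, 15.0, 60001)   # grid for u = log(w)

def trapezoid(f, u):
    """Composite trapezoidal rule on the grid u."""
    return float(np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(u)))

def bessel_k(nu, z):
    """K_nu(z) from its integral representation, after substituting y = e^u."""
    return 0.5 * trapezoid(np.exp(nu * U - z * np.cosh(U)), U)

def gig_moment_closed(alpha, lam, chi, psi):
    """E(W^alpha) = (chi/psi)^(alpha/2) K_{lam+alpha}(s) / K_lam(s), s = sqrt(chi*psi)."""
    s = np.sqrt(chi * psi)
    return (chi / psi) ** (alpha / 2.0) * bessel_k(lam + alpha, s) / bessel_k(lam, s)

def gig_moment_numeric(alpha, lam, chi, psi):
    """E(W^alpha) by direct quadrature of w^alpha times the GIG density."""
    s = np.sqrt(chi * psi)
    w = np.exp(U)
    pdf = (chi ** (-lam) * s ** lam / (2.0 * bessel_k(lam, s))
           * w ** (lam - 1.0) * np.exp(-0.5 * (chi / w + psi * w)))
    return trapezoid(w ** alpha * pdf * w, U)    # dw = w du

def gh_mean_var(lam, chi, psi, mu, eta, gamma):
    """Mean and variance of X from E(X) = mu + E(W) gamma and the variance formula."""
    m1 = gig_moment_closed(1.0, lam, chi, psi)
    m2 = gig_moment_closed(2.0, lam, chi, psi)
    var_w = m2 - m1 ** 2
    return mu + m1 * gamma, eta ** 2 * m1 + gamma ** 2 * var_w
```

A convenient sanity check is the case \(\lambda = -1/2\), where \(K_{1/2} = K_{-1/2}\) and the first moment reduces to \(\sqrt{\chi / \psi }\).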

1.3 Proof of Proposition 1

Let \(X\sim \ GH (\lambda , \chi , \psi , \mu , \eta ^2, \gamma )\). By (2),

$$\begin{aligned} \phi _X\left(s\right)&= \mathbb E \left( \mathbb E \left( \exp (\mathrm{i} sX) \left|W\right.\right)\right) = \mathbb E \left( \exp \left(\mathrm{i} s\left(\mu +W\gamma \right) - W s^2{\eta }^2 / 2 \right)\right)\\&= \exp (\mathrm{i} s\mu ) H\left( s^2 \eta ^2 / 2 - \mathrm{i} s\gamma \right), \end{aligned}$$

where \(H (\theta ) = \mathbb E [\exp (-\theta W)]\) is the Laplace transform of a GIG random variable.

Let \(Y=c+\sum ^n_{j=1} b_jX_j \) and let \(X_j \sim GH (\lambda , \chi , \psi , \mu _j, \eta ^2_j, \gamma _j)\) for each \(j\). Then,

$$\begin{aligned} \phi _Y\left(s\right)&= \mathbb E \left[ \exp \left( \mathrm{i} s\left(c+\sum ^n_{j=1} b_j X_j \right) \right) \right] \\&= \exp (\mathrm{i} sc) \phi _X \left(s\sum ^n_{j=1}{b_j}\right) \\&= \exp \left[ \mathrm{i} s \left(\mu \sum ^n_{j=1}{b_j}+c\right) \right] H\left( s^2 \eta ^2 \sum ^n_{j=1} b^2_j / 2 - \mathrm{i} s \gamma \sum ^n_{j=1}{b_j}\right). \end{aligned}$$

The result follows. \(\square \)
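One way to trace the passage to the final expression is to condition on the mixing variable, as in the single-variable computation of \(\phi _X\). The following is a sketch under an assumption that is not stated in this appendix: the \(X_j\) share one GIG mixing variable \(W\) and, given \(W\), are independent Gaussian variables with \(X_j \mid W \sim N (\mu _j + W \gamma _j, W \eta ^2_j)\). Then

$$\begin{aligned} \phi _Y\left(s\right)&= \exp (\mathrm{i} sc) \, \mathbb E \left[ \prod ^n_{j=1} \mathbb E \left( \exp \left(\mathrm{i} s b_j X_j\right) \left|W\right.\right) \right] \\&= \exp (\mathrm{i} sc) \, \mathbb E \left[ \exp \left( \mathrm{i} s \sum ^n_{j=1} b_j \left(\mu _j + W \gamma _j\right) - W s^2 \sum ^n_{j=1} b^2_j \eta ^2_j / 2 \right) \right] \\&= \exp \left[ \mathrm{i} s \left(c + \sum ^n_{j=1} b_j \mu _j\right) \right] H\left( s^2 \sum ^n_{j=1} b^2_j \eta ^2_j / 2 - \mathrm{i} s \sum ^n_{j=1} b_j \gamma _j \right). \end{aligned}$$

Under this assumption, comparison with \(\phi _X\) suggests \(Y \sim GH (\lambda , \chi , \psi , c + \sum ^n_{j=1} b_j \mu _j, \sum ^n_{j=1} b^2_j \eta ^2_j, \sum ^n_{j=1} b_j \gamma _j)\). Setting \(\mu _j = \mu \), \(\eta _j = \eta \) and \(\gamma _j = \gamma \) for all \(j\) recovers the final expression displayed above.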

Cite this article

Salemi, U.H., Rezaei, S. & Nadarajah, S. VAD Based on Kernel Smoothed Function of EGARCH Models. Wireless Pers Commun 72, 299–313 (2013). https://doi.org/10.1007/s11277-013-1015-1

  • DOI: https://doi.org/10.1007/s11277-013-1015-1
