Speech Enhancement for Automatic Speech Recognition Using Complex Gaussian Mixture Priors for Noise and Speech

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5933)

Abstract

Statistical speech enhancement methods often rely on assumptions, such as Gaussianity of the speech and noise processes or perfect knowledge of their parameters, that are not fully met in practice. Recent work has shown that speech enhancement can be improved by employing super-Gaussian speech models conditioned on the estimated signal-to-noise ratio. In this paper we derive a super-Gaussian model for speech enhancement in which both the speech and noise priors are assumed to be complex Gaussian mixture models. We also introduce a method for computing the noise prior based on the noise variance estimator used. Finally, we compare the resulting estimators with the conventional Ephraim-Malah filters in the context of robust automatic speech recognition.
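To make the modelling idea concrete, the following is a minimal sketch of a minimum mean-square error estimate of one complex spectral coefficient when both priors are Gaussian mixtures. It is not the estimator derived in the paper, whose details are not given in the abstract. It assumes an additive model Y = X + N, zero-mean circular complex Gaussian components, and independent speech and noise; the function names, weights, and variances are hypothetical. Under those assumptions the posterior mean is a responsibility-weighted average of per-component Wiener gains.

```python
import math


def complex_gaussian_pdf(y, variance):
    """Density of a zero-mean circular complex Gaussian with the given variance."""
    return math.exp(-abs(y) ** 2 / variance) / (math.pi * variance)


def gmm_mmse_estimate(y, speech_weights, speech_vars, noise_weights, noise_vars):
    """Posterior mean E[X | Y = y] for Y = X + N under zero-mean complex GMM priors.

    Hypothetical helper for illustration: each (speech, noise) component pair
    behaves like a Gaussian model with Wiener gain lx / (lx + ln), and the pairs
    are weighted by their posterior probability given the observation y.
    """
    numerator = 0.0
    evidence = 0.0
    for w, lx in zip(speech_weights, speech_vars):
        for v, ln in zip(noise_weights, noise_vars):
            # Prior weight of the component pair times its likelihood of y.
            resp = w * v * complex_gaussian_pdf(y, lx + ln)
            evidence += resp
            # Accumulate this pair's Wiener gain, weighted by its responsibility.
            numerator += resp * lx / (lx + ln)
    # Note: for very large |y| relative to the variances, evidence can underflow.
    gain = numerator / evidence
    return gain * y, gain


# Illustrative (made-up) two-component speech prior and two-component noise prior.
estimate, gain = gmm_mmse_estimate(
    0.5 + 0.1j,
    speech_weights=[0.7, 0.3], speech_vars=[0.1, 4.0],
    noise_weights=[0.5, 0.5], noise_vars=[0.5, 1.0],
)
print(estimate, gain)
```

With a single component in each prior, the sketch reduces to the ordinary Wiener filter, which is the Gaussian special case that the mixture model generalises.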

References

  1. McAulay, R.J., Malpass, L.M.: Speech Enhancement Using a Soft-Decision Noise Suppression Filter. IEEE Trans. ASSP 28(2), 137–145 (1980)

  2. Ephraim, Y., Malah, D.: Speech enhancement using a minimum mean-square error short-time amplitude estimator. IEEE Trans. ASSP 32(6), 1109–1121 (1984)

  3. Ephraim, Y., Malah, D.: Speech enhancement using a minimum mean-square error log-spectral amplitude estimator. IEEE Trans. ASSP 33(2), 443–445 (1985)

  4. Martin, R.: Speech enhancement using MMSE short time spectral estimation with gamma distributed speech priors. IEEE Int. Conf. Acoustics, Speech, Signal Processing 1, 253–256 (2002)

  5. Cohen, I.: Noise Spectrum Estimation in Adverse Environments: Improved Minima Controlled Recursive Averaging. IEEE Trans. ASSP 11(5), 466–475 (2003)

  6. Lotter, T., Vary, P.: Speech Enhancement by MAP Spectral Amplitude Estimation Using a Super-Gaussian Speech Model. EURASIP Journal on Applied Signal Processing 7, 1110–1126 (2005)

  7. Ephraim, Y., Cohen, I.: Recent advancements in Speech Enhancement. In: The Electrical Engineering Handbook. CRC Press, Boca Raton (2005)

  8. Erkelens, J., Hendriks, R., Heusdens, R., Jensen, J.: Minimum Mean-Square Error Estimation of Discrete Fourier Coefficients With Generalized Gamma Priors. IEEE Trans. ASSP 15, 1741–1752 (2007)

  9. Erkelens, J.S.: Speech Enhancement based on Rayleigh mixture modeling of Speech Spectral Amplitude Distributions. In: Proc. EUSIPCO 2007, pp. 65–69 (2007)

  10. Hirsch, G.: Aurora-5 Experimental Framework for the Performance Evaluation of Speech Recognition in Case of a Hands-free Speech Input in Noisy Environments (2007)

  11. Astudillo, R.F., Kolossa, D., Orglmeister, R.: Accounting for the Uncertainty of Speech Estimates in the Complex Domain for Minimum Mean Square Error Speech Enhancement. In: Proc. Interspeech, pp. 2491–2494 (2009)

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Astudillo, R.F., Hoffmann, E., Mandelartz, P., Orglmeister, R. (2010). Speech Enhancement for Automatic Speech Recognition Using Complex Gaussian Mixture Priors for Noise and Speech. In: Solé-Casals, J., Zaiats, V. (eds) Advances in Nonlinear Speech Processing. NOLISP 2009. Lecture Notes in Computer Science, vol 5933. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-11509-7_8

  • DOI: https://doi.org/10.1007/978-3-642-11509-7_8

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-11508-0

  • Online ISBN: 978-3-642-11509-7

  • eBook Packages: Computer Science, Computer Science (R0)
