Speech Enhancement for Automatic Speech Recognition Using Complex Gaussian Mixture Priors for Noise and Speech

Astudillo, Ramón F.; Hoffmann, Eugen; Mandelartz, Philipp; Orglmeister, Reinhold

doi:10.1007/978-3-642-11509-7_8

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5933)

Included in the following conference series:

International Conference on Nonlinear Speech Processing

Abstract

Statistical speech enhancement methods often rely on assumptions, such as Gaussianity of the speech and noise processes or perfect knowledge of their parameters, that are not fully met in practice. Recent work has shown that speech enhancement can be improved by employing super-Gaussian speech models conditioned on the estimated signal-to-noise ratio. In this paper we derive a super-Gaussian model for speech enhancement in which both the speech and the noise priors are assumed to be complex Gaussian mixture models. We also introduce a method for computing the noise prior based on the noise variance estimator used. Finally, we compare the resulting estimators with the conventional Ephraim-Malah filters in the context of robust automatic speech recognition.
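The abstract does not give the estimator itself, so the following is only a minimal sketch of the textbook result that this modeling choice leads to, not the estimators derived in the paper. It assumes a single time-frequency bin with observation Y = X + N, where the speech X and the noise N are independent, zero-mean, circular complex Gaussian mixtures. Under that assumption the posterior mean of X is a likelihood-weighted average of per-component Wiener gains applied to Y. The function names and the weight/variance parameterization are hypothetical and chosen for illustration.

```python
import math


def complex_gaussian_pdf(y, variance):
    """Density of a zero-mean circular complex Gaussian with the given variance."""
    return math.exp(-abs(y) ** 2 / variance) / (math.pi * variance)


def gmm_mmse_estimate(y, speech_weights, speech_vars, noise_weights, noise_vars):
    """Posterior mean E[X | Y = y] for Y = X + N (illustrative assumption:
    X and N are independent zero-mean complex Gaussian mixtures)."""
    weighted_gain = 0.0
    evidence = 0.0
    for w, lam_x in zip(speech_weights, speech_vars):
        for v, lam_n in zip(noise_weights, noise_vars):
            # Likelihood of y under the component pair: the sum of two
            # independent complex Gaussians has variance lam_x + lam_n.
            lik = w * v * complex_gaussian_pdf(y, lam_x + lam_n)
            # Wiener gain for this component pair.
            gain = lam_x / (lam_x + lam_n)
            weighted_gain += lik * gain
            evidence += lik
    # Plain (non-log) arithmetic: evidence can underflow for very large |y|,
    # so a practical version would work in the log domain.
    return (weighted_gain / evidence) * y


# With one component per mixture this reduces to the ordinary Wiener filter.
single = gmm_mmse_estimate(2 + 2j, [1.0], [3.0], [1.0], [1.0])

# Two speech components: the effective gain lies between the two Wiener gains.
mixed = gmm_mmse_estimate(0.1 + 0j, [0.5, 0.5], [0.1, 10.0], [1.0], [1.0])
```

With a single component in each mixture the gain is 3 / (3 + 1) = 0.75, which is the conventional Gaussian case; the mixture case differs only in that the gain adapts to the observation through the component posteriors.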

References

McAulay, R.J., Malpass, L.M.: Speech Enhancement Using a Soft-Decision Noise Suppression Filter. IEEE Trans. ASSP 28(2), 137–145 (1980)
Ephraim, Y., Malah, D.: Speech Enhancement Using a Minimum Mean-Square Error Short-Time Amplitude Estimator. IEEE Trans. ASSP 32(6), 1109–1121 (1984)
Ephraim, Y., Malah, D.: Speech Enhancement Using a Minimum Mean-Square Error Log-Spectral Amplitude Estimator. IEEE Trans. ASSP 33(2), 443–445 (1985)
Martin, R.: Speech Enhancement Using MMSE Short Time Spectral Estimation with Gamma Distributed Speech Priors. IEEE Int. Conf. Acoustics, Speech, Signal Processing 1, 253–256 (2002)
Cohen, I.: Noise Spectrum Estimation in Adverse Environments: Improved Minima Controlled Recursive Averaging. IEEE Trans. ASSP 11(5), 466–475 (2003)
Lotter, T., Vary, P.: Speech Enhancement by MAP Spectral Amplitude Estimation Using a Super-Gaussian Speech Model. EURASIP Journal on Applied Signal Processing 7, 1110–1126 (2005)
Ephraim, Y., Cohen, I.: Recent Advancements in Speech Enhancement. In: The Electrical Engineering Handbook. CRC Press, Boca Raton (2005)
Erkelens, J., Hendriks, R., Heusdens, R., Jensen, J.: Minimum Mean-Square Error Estimation of Discrete Fourier Coefficients with Generalized Gamma Priors. IEEE Trans. ASSP 15, 1741–1752 (2007)
Erkelens, J.S.: Speech Enhancement Based on Rayleigh Mixture Modeling of Speech Spectral Amplitude Distributions. In: Proc. EUSIPCO 2007, pp. 65–69 (2007)
Hirsch, G.: Aurora-5 Experimental Framework for the Performance Evaluation of Speech Recognition in Case of a Hands-free Speech Input in Noisy Environments (2007)
Astudillo, R.F., Kolossa, D., Orglmeister, R.: Accounting for the Uncertainty of Speech Estimates in the Complex Domain for Minimum Mean Square Error Speech Enhancement. In: Proc. Interspeech, pp. 2491–2494 (2009)

Author information

Authors and Affiliations

Department of Energy and Automation Technology, TU-Berlin, Einsteinufer 17, 10587, Berlin, Germany
Ramón F. Astudillo, Eugen Hoffmann, Philipp Mandelartz & Reinhold Orglmeister

Editor information

Editors and Affiliations

Department of Computer Science, Escola Politecnica Superior, Universitat de Vic, c/. Sagrada Familia, 7, 08500, Vic (Barcelona), Spain
Jordi Solé-Casals
Department of Computer Science, Escola Politecnica Superior, Universitat de Vic, c/. Sagrada Familia, 7, 08500, Vic (Barcelona), Spain
Vladimir Zaiats

About this paper

Cite this paper

Astudillo, R.F., Hoffmann, E., Mandelartz, P., Orglmeister, R. (2010). Speech Enhancement for Automatic Speech Recognition Using Complex Gaussian Mixture Priors for Noise and Speech. In: Solé-Casals, J., Zaiats, V. (eds) Advances in Nonlinear Speech Processing. NOLISP 2009. Lecture Notes in Computer Science, vol 5933. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-11509-7_8

DOI: https://doi.org/10.1007/978-3-642-11509-7_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-11508-0
Online ISBN: 978-3-642-11509-7
eBook Packages: Computer Science, Computer Science (R0)
