Model-Driven Speech Enhancement for Multisource Reverberant Environment (Signal Separation Evaluation Campaign (SiSEC) 2011)

Mowlaee, Pejman; Saeidi, Rahim; Martin, Rainer

doi:10.1007/978-3-642-28551-6_56

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 7191)

Included in the following conference series:

International Conference on Latent Variable Analysis and Signal Separation

Abstract

We present a low-complexity speech enhancement technique for real-life multi-source environments. Assuming that the speaker identity is known a priori, we incorporate a speaker model to enhance a target signal corrupted by non-stationary noise in a reverberant scenario. In our experiments, this improves on the limited performance of noise-tracking-based speech enhancement methods under unpredictable, non-stationary noise. A pre-trained speaker model captures a constrained subspace for the target speech and can therefore provide an enhanced speech estimate by rejecting the non-stationary noise sources. Experimental results from the Signal Separation Evaluation Campaign (SiSEC) show that the proposed approach succeeds in canceling the interference in the noisy input and in providing an enhanced output signal.
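The abstract does not spell out how the speaker model constrains the estimate, so the following is only an illustrative sketch of the general idea and not the authors' algorithm. It assumes a hypothetical pre-trained codebook of magnitude spectra for the known speaker, picks the codeword closest to each noisy frame, and uses it to build a Wiener-like gain. The names `enhance_frame` and `toy_codebook`, the nearest-codeword rule, and the gain formula are all assumptions made for illustration.

```python
import numpy as np


def enhance_frame(noisy_mag, codebook, eps=1e-12):
    """Enhance one magnitude-spectrum frame with a speaker codebook.

    noisy_mag: shape (n_bins,), magnitude spectrum of the noisy frame.
    codebook:  shape (n_codewords, n_bins), hypothetical pre-trained
               magnitude spectra for the known target speaker.
    Returns (enhanced_mag, gain, codeword_index).
    """
    # Pick the speaker codeword closest to the observation (assumed rule).
    dists = np.sum((codebook - noisy_mag[None, :]) ** 2, axis=1)
    k = int(np.argmin(dists))

    # Treat the codeword as the speech power estimate; whatever observed
    # power exceeds it is attributed to noise.
    speech_psd = codebook[k] ** 2
    noise_psd = np.maximum(noisy_mag ** 2 - speech_psd, 0.0)

    # Wiener-like gain, always within [0, 1].
    gain = speech_psd / (speech_psd + noise_psd + eps)
    return gain * noisy_mag, gain, k


# Toy codebook with three codewords over four frequency bins (made-up values).
toy_codebook = np.array([
    [1.0, 0.1, 0.1, 0.1],
    [0.1, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.1],
])

clean = toy_codebook[1]
noisy = clean + np.array([0.0, 0.0, 0.0, 0.8])  # interference in the last bin
enhanced, gain, k = enhance_frame(noisy, toy_codebook)
```

In this toy case the interference sits in a bin where the speaker codeword carries little energy, so the gain attenuates that bin while leaving the speech-dominated bins almost untouched. A practical system would work on short-time spectra of real recordings and would need a trained model rather than hand-picked codewords.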

The work of Pejman Mowlaee was funded by the European Commission within the Marie Curie ITN AUDIS, grant PITN-GA-2008-214699. The work of Rahim Saeidi was funded by the European Community’s Seventh Framework Programme (FP7 2007-2013) under grant agreement no. 238803.

Author information

Authors and Affiliations

Institute of Communication Acoustics (IKA), Ruhr-Universität Bochum (RUB), Germany
Pejman Mowlaee & Rainer Martin
Centre for Language and Speech Technology, Radboud University Nijmegen, The Netherlands
Rahim Saeidi

Editor information

Fabian Theis, Andrzej Cichocki, Arie Yeredor, Michael Zibulevsky

Cite this paper

Mowlaee, P., Saeidi, R., Martin, R. (2012). Model-Driven Speech Enhancement for Multisource Reverberant Environment (Signal Separation Evaluation Campaign (SiSEC) 2011). In: Theis, F., Cichocki, A., Yeredor, A., Zibulevsky, M. (eds) Latent Variable Analysis and Signal Separation. LVA/ICA 2012. Lecture Notes in Computer Science, vol 7191. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28551-6_56

DOI: https://doi.org/10.1007/978-3-642-28551-6_56
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-28550-9
Online ISBN: 978-3-642-28551-6
eBook Packages: Computer Science, Computer Science (R0)
