Abstract
This paper exploits a non-negative matrix factorization (NMF)-based method for speech enhancement within a speaker identification framework. The proposed algorithm models speech atoms deterministically, as sums of harmonically related sinusoids in the spectral domain. This approach makes it possible to estimate the specific structure of vowel signals in the presence of noise, so that efficient noise reduction can be performed using only noise exemplars. The speaker identification experiments are conducted on the Computational Hearing in Multisource Environments (CHiME) dataset. The results demonstrate the effectiveness of the enhancement as a preprocessing step, which outperforms a general NMF-based speech enhancer. Further studies show that the channel compensation effect of the proposed method leads to performance comparable to common mismatch reduction methods such as feature warping.
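The idea of deterministic harmonic speech atoms combined with noise exemplars can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' linearly constrained algorithm: the function names `harmonic_atoms` and `enhance`, the Gaussian peak shape, the 1/h harmonic weighting, the pitch grid, and the Wiener-like mask are all hypothetical choices made for this example. The activations are estimated with the standard KL-divergence multiplicative update of Lee and Seung while the dictionary is held fixed.

```python
import numpy as np

def harmonic_atoms(f0_grid, n_bins, sample_rate, n_harmonics=20, width_hz=40.0):
    """Build one magnitude-spectrum atom per candidate pitch.

    Each atom is a sum of Gaussian peaks at integer multiples of f0, an
    assumed stand-in for the spectrum of windowed harmonic sinusoids.
    """
    freqs = np.linspace(0.0, sample_rate / 2.0, n_bins)
    atoms = np.zeros((n_bins, len(f0_grid)))
    for k, f0 in enumerate(f0_grid):
        for h in range(1, n_harmonics + 1):
            fh = h * f0
            if fh >= sample_rate / 2.0:
                break
            # Assumed 1/h decay of harmonic amplitudes (illustrative only).
            atoms[:, k] += np.exp(-0.5 * ((freqs - fh) / width_hz) ** 2) / h
    # Normalise every atom to unit sum.
    return atoms / (atoms.sum(axis=0, keepdims=True) + 1e-12)

def enhance(V, W_speech, W_noise, n_iter=100, eps=1e-12):
    """Estimate activations for the fixed dictionary [W_speech, W_noise]
    and return a masked speech magnitude estimate plus the activations."""
    W = np.hstack([W_speech, W_noise])
    rng = np.random.default_rng(0)
    H = rng.random((W.shape[1], V.shape[1])) + eps
    col_sums = W.sum(axis=0)[:, None]
    for _ in range(n_iter):
        WH = W @ H + eps
        # KL-divergence multiplicative update for H with W held fixed.
        H *= (W.T @ (V / WH)) / (col_sums + eps)
    n_s = W_speech.shape[1]
    S = W_speech @ H[:n_s]   # speech part of the model
    N = W_noise @ H[n_s:]    # noise part of the model
    mask = S / (S + N + eps)  # Wiener-like soft mask in [0, 1)
    return mask * V, H
```

Here `V` is assumed to be a non-negative magnitude spectrogram of shape (frequency bins, frames), and `W_noise` is a matrix whose columns are noise exemplar spectra taken from noise-only recordings. Only the noise dictionary comes from data; the speech dictionary is generated from the pitch grid.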
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Lyubimov, N., Nastasenko, M., Kotov, M., Doroshin, D. (2014). Exploiting Non-negative Matrix Factorization with Linear Constraints in Noise-Robust Speaker Identification. In: Ronzhin, A., Potapova, R., Delic, V. (eds) Speech and Computer. SPECOM 2014. Lecture Notes in Computer Science, vol 8773. Springer, Cham. https://doi.org/10.1007/978-3-319-11581-8_25
DOI: https://doi.org/10.1007/978-3-319-11581-8_25
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-11580-1
Online ISBN: 978-3-319-11581-8
eBook Packages: Computer Science (R0)