Sparse NMF based speech enhancement with bases update

Sunnydayal, V.; Siva Prasad, N.; Ravishankar, S.; Surendran, S.; Ragesh, N. K.

doi:10.1007/s10772-017-9418-0

Published: 09 May 2017

Volume 20, pages 443–454 (2017)

International Journal of Speech Technology

V. Sunnydayal¹, N. Siva Prasad², S. Ravishankar³, S. Surendran⁴ & N. K. Ragesh²

Abstract

This paper proposes a speech enhancement method that combines statistical modelling with Non-negative Matrix Factorization (NMF), using speech and noise bases that are updated on-line. Template-based approaches are known to be more robust to non-stationary noise than methods based on statistical modelling. However, template-based approaches depend on a-priori information. Combining the two approaches avoids the drawbacks of each. In the NMF approach, the speech bases and noise bases are adapted simultaneously to further improve performance. The proposed method outperforms other benchmark algorithms in terms of perceptual evaluation of speech quality (PESQ) and source-to-distortion ratio (SDR) in both stationary and non-stationary noise conditions, with matched and mismatched noise bases.
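To make the template-based half of this idea concrete, the following is a minimal sketch of generic supervised sparse NMF denoising on magnitude spectrograms. It is not the authors' algorithm: the abstract does not specify the cost function, the sparsity penalty, the statistical model, or the on-line bases update rule, so this sketch assumes a KL-divergence cost with an L1 penalty on the activations, keeps the pre-trained bases fixed at enhancement time, and applies a Wiener-style mask. The names `sparse_nmf_kl`, `enhance`, `sparsity`, and `n_iter` are hypothetical.

```python
import numpy as np

EPS = 1e-12  # guards against division by zero


def sparse_nmf_kl(V, rank=None, W=None, sparsity=0.1, n_iter=200, seed=0):
    """Approximate V (freq x frames, non-negative) as W @ H.

    Assumed cost: KL divergence plus an L1 penalty (weight `sparsity`) on H,
    minimised with multiplicative updates. If W is supplied it is held
    fixed and only the activations H are estimated.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    update_W = W is None
    if update_W:
        W = rng.random((n_freq, rank)) + EPS
    H = rng.random((W.shape[1], n_frames)) + EPS
    ones = np.ones_like(V)
    for _ in range(n_iter):
        WH = W @ H + EPS
        # Activation update; the sparsity weight enters the denominator.
        H *= (W.T @ (V / WH)) / (W.T @ ones + sparsity)
        if update_W:
            WH = W @ H + EPS
            W *= ((V / WH) @ H.T) / (ones @ H.T + EPS)
            # Normalise each basis to unit sum and rescale H so that the
            # product W @ H is unchanged; this stops the L1 penalty from
            # being dodged by shrinking H and inflating W.
            scale = W.sum(axis=0, keepdims=True) + EPS
            W /= scale
            H *= scale.T
    return W, H


def enhance(V_noisy, W_speech, W_noise, sparsity=0.1, n_iter=200):
    """Estimate the speech magnitude spectrogram from a noisy one.

    The speech and noise bases are concatenated and kept fixed (no bases
    update in this sketch); a Wiener-style soft mask is then applied.
    """
    W = np.hstack([W_speech, W_noise])
    _, H = sparse_nmf_kl(V_noisy, W=W, sparsity=sparsity, n_iter=n_iter)
    k = W_speech.shape[1]
    S = W_speech @ H[:k]   # speech part of the model
    N = W_noise @ H[k:]    # noise part of the model
    mask = S / (S + N + EPS)
    return mask * V_noisy
```

In use, one would train `W_speech` and `W_noise` by calling `sparse_nmf_kl` on clean-speech and noise-only magnitude spectrograms, then pass both to `enhance` together with the noisy spectrogram. Holding the bases fixed is exactly the dependence on a-priori information that the abstract identifies as the weakness of template-based methods; the paper's bases update is aimed at relaxing it.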

Author information

Authors and Affiliations

Computer Science Department, University of Crete, Heraklion, Greece
V. Sunnydayal
DSP Multimedia, TBU, TATA ELXSI LIMITED, Bangalore, India
N. Siva Prasad & N. K. Ragesh
Department of ECE, Vignan University, Guntur, India
S. Ravishankar
Electronics and Communication Department, National Institute of Technology Warangal, Warangal, India
S. Surendran

Corresponding author

Correspondence to V. Sunnydayal.

About this article

Cite this article

Sunnydayal, V., Siva Prasad, N., Ravishankar, S. et al. Sparse NMF based speech enhancement with bases update. Int J Speech Technol 20, 443–454 (2017). https://doi.org/10.1007/s10772-017-9418-0

Received: 29 October 2016
Accepted: 29 April 2017
Published: 09 May 2017
Issue Date: September 2017
DOI: https://doi.org/10.1007/s10772-017-9418-0
