research-article
DOI:10.1145/2818869.2818913

Single Channel Source Separation Using Sparse NMF and Graph Regularization

Published: 07 October 2015

Abstract

The aim of single channel source separation is to accurately recover the individual source signals from a mixture. In the supervised case, non-negative matrix factorization (NMF) is a popular method for separating mixed signals using learned dictionaries. These dictionaries can be produced efficiently by sparse NMF so that they approximate the input signal as closely as possible. However, previous methods neither consider the structure of the data, in terms of the similarity between vertices of the input signal, nor use state-of-the-art variants of NMF that are more efficient than conventional ones. This paper presents a method that incorporates a graph regularization constraint into group-sparsity NMF to improve the performance of source separation. Experimental results demonstrate that our method is highly effective for speech separation in two representative scenarios.
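
As a rough illustration of the idea in the abstract, the sketch below combines a sparsity penalty and a graph regularization term in one NMF objective and minimizes it with multiplicative updates. It is not the authors' algorithm: it assumes a squared Euclidean cost, a plain L1 penalty in place of group sparsity, and a binary k-nearest-neighbour graph built over the columns of the input. The function names and the parameters lam, mu and k are hypothetical.

```python
import numpy as np


def knn_affinity(V, k=5):
    """Binary, symmetric k-nearest-neighbour affinity between columns of V."""
    X = V.T                                   # one row per column (frame) of V
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)  # squared distances
    np.fill_diagonal(d2, np.inf)              # a frame is not its own neighbour
    idx = np.argsort(d2, axis=1)[:, :k]       # k closest frames per row
    n = X.shape[0]
    A = np.zeros((n, n))
    A[np.repeat(np.arange(n), k), idx.ravel()] = 1.0
    return np.maximum(A, A.T)                 # symmetrize


def objective(V, W, H, A, lam, mu):
    """0.5*||V - WH||_F^2 + 0.5*lam*tr(H L H^T) + mu*sum(H), with L = D - A."""
    L = np.diag(A.sum(axis=1)) - A
    fit = 0.5 * np.sum((V - W @ H) ** 2)
    graph = 0.5 * lam * np.trace(H @ L @ H.T)
    return fit + graph + mu * H.sum()


def graph_sparse_nmf(V, rank, lam=0.1, mu=0.1, k=5, n_iter=200, seed=0,
                     eps=1e-9):
    """Assumed illustrative variant: Euclidean NMF + L1 + graph penalty."""
    rng = np.random.default_rng(seed)
    n_feat, n_frames = V.shape
    W = rng.random((n_feat, rank)) + eps      # dictionary (non-negative)
    H = rng.random((rank, n_frames)) + eps    # activations (non-negative)
    A = knn_affinity(V, k)
    d = A.sum(axis=1)                         # node degrees (diagonal of D)
    history = [objective(V, W, H, A, lam, mu)]
    for _ in range(n_iter):
        # Standard multiplicative update for the dictionary.
        W *= (V @ H.T) / (W @ (H @ H.T) + eps)
        # Activation update: the graph term adds H A to the numerator and
        # H D to the denominator; the L1 penalty adds mu to the denominator.
        num = W.T @ V + lam * (H @ A)
        den = (W.T @ W) @ H + lam * H * d[None, :] + mu + eps
        H *= num / den
        history.append(objective(V, W, H, A, lam, mu))
    return W, H, history


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    V = rng.random((20, 4)) @ rng.random((4, 40))   # synthetic rank-4 data
    W, H, history = graph_sparse_nmf(V, rank=4, n_iter=100)
    print(history[0], history[-1])
```

In a supervised setting such as the one the abstract describes, a dictionary W would typically be learned per source on training data and then held fixed while H is estimated for the mixture; that step is not shown here.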

Cited By

  • (2024) NES 2 Net: A Number Estimation and Signal Separation Network for Single-Channel Blind Signal Separation. 2024 7th International Conference on Advanced Algorithms and Control Engineering (ICAACE), 469-474. DOI: 10.1109/ICAACE61206.2024.10549012. Online publication date: 1-Mar-2024
  • (2021) An Improved Unsupervised Single-Channel Speech Separation Algorithm for Processing Speech Sensor Signals. Wireless Communications & Mobile Computing, 2021. DOI: 10.1155/2021/6655125. Online publication date: 1-Jan-2021
  • (2015) A review on speech separation using NMF and its extensions. 2015 International Conference on Orange Technologies (ICOT), 26-29. DOI: 10.1109/ICOT.2015.7498486. Online publication date: Dec-2015

Published In

ASE BD&SI '15: Proceedings of the ASE BigData & SocialInformatics 2015
October 2015
381 pages
ISBN:9781450337359
DOI:10.1145/2818869
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Graph regularization
  2. non-negative matrix factorization
  3. source separation
  4. sparse coding

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

ASE BD&SI '15
ASE BD&SI '15: ASE BigData & SocialInformatics 2015
October 7 - 9, 2015
Kaohsiung, Taiwan
