Parallel multichannel blind source separation using a spatial covariance model and nonnegative matrix factorization

Muñoz-Montoro, A. J.; Carabias-Orti, J. J.; Cortina, R.; García-Galán, S.; Ranilla, J.

doi:10.1007/s11227-021-03771-y

Published: 06 April 2021

Volume 77, pages 12143–12156 (2021)

The Journal of Supercomputing

A. J. Muñoz-Montoro (ORCID: orcid.org/0000-0001-9518-8955)²,³,
J. J. Carabias-Orti¹,
R. Cortina³,
S. García-Galán¹ &
J. Ranilla³

Abstract

In this paper, we present a multichannel nonnegative matrix factorization (MNMF) system for source separation. We propose a novel signal model based on spatial covariance matrices (SCM), in which the mixing filter encodes the spatial information and the source variances are modeled with an NMF structure. Moreover, the model is initialized with the estimated direction of arrival (DoA) of each source in order to mitigate its strong sensitivity to parameter initialization. The system has been evaluated on music source separation using a multichannel classical chamber music dataset. The results show that real-time operation can be reached in the tested scenarios by combining multi-core architectures with parallel and high-performance techniques.
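
The abstract states that the source variances follow an NMF structure. As a rough illustration of what such a structure looks like, and not the authors' implementation, the sketch below factorizes a nonnegative matrix V (standing in for a source power spectrogram) as the product W H using the standard multiplicative updates for the generalized Kullback-Leibler divergence. The function names, the choice of divergence, the rank, and the synthetic data are all assumptions made for this example. The spatial covariance part of the model and the DoA-based initialization are not covered.

```python
import numpy as np


def nmf_kl(V, n_components, n_iter=200, eps=1e-12, seed=0):
    """Factorize a nonnegative matrix V (F x T) as W @ H.

    Uses the classic multiplicative updates for the generalized
    Kullback-Leibler divergence. Hypothetical helper for illustration;
    the divergence and update order are assumptions, not the paper's.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    # Random nonnegative initialization (the paper uses DoA information
    # for its spatial parameters; that step is not modeled here).
    W = rng.random((n_freq, n_components)) + eps
    H = rng.random((n_components, n_frames)) + eps
    for _ in range(n_iter):
        WH = W @ H + eps
        # Update activations H; the factor is nonnegative, so H stays >= 0.
        H *= (W.T @ (V / WH)) / (W.sum(axis=0)[:, None] + eps)
        WH = W @ H + eps
        # Update spectral bases W in the same way.
        W *= ((V / WH) @ H.T) / (H.sum(axis=1)[None, :] + eps)
    return W, H


def kl_divergence(V, V_hat, eps=1e-12):
    """Generalized Kullback-Leibler divergence between V and V_hat."""
    return float(np.sum(V * np.log((V + eps) / (V_hat + eps)) - V + V_hat))


# Synthetic rank-3 "spectrogram" with 20 frequency bins and 30 frames.
demo_rng = np.random.default_rng(1)
V_demo = demo_rng.random((20, 3)) @ demo_rng.random((3, 30))
W_demo, H_demo = nmf_kl(V_demo, n_components=3)
print("KL divergence after fitting:", kl_divergence(V_demo, W_demo @ H_demo))
```

In this sketch, each column of W plays the role of a spectral pattern and each row of H its time-varying gain. Because the updates are multiplicative, factors that start nonnegative remain nonnegative without any explicit projection.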

Notes

RT\(_{60}\) is the time required for reflections of a direct sound to decay by 60 dB below the level of the direct sound.
https://www.openblas.net
http://www.fftw.org
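
The RT\(_{60}\) definition in the first note can be illustrated with a small numerical sketch. The code below estimates the reverberation time from an impulse response by backward-integrating its energy (Schroeder integration), fitting a line to the decay between -5 dB and -25 dB, and extrapolating to a 60 dB drop. The function name, the fitting range, and the synthetic exponential decay are assumptions chosen for the example; nothing here is taken from the paper's experimental setup.

```python
import numpy as np


def rt60_from_impulse_response(h, fs, db_start=-5.0, db_end=-25.0):
    """Estimate RT60 in seconds from an impulse response h sampled at fs Hz.

    Hypothetical helper: fits the energy decay curve between db_start
    and db_end, then extrapolates the slope to a 60 dB decay.
    """
    energy = np.asarray(h, dtype=float) ** 2
    # Schroeder backward integration: energy remaining after each sample.
    edc = np.cumsum(energy[::-1])[::-1]
    edc = np.maximum(edc, np.finfo(float).tiny)  # avoid log10(0)
    edc_db = 10.0 * np.log10(edc / edc[0])
    t = np.arange(len(energy)) / fs
    # Keep only the part of the curve inside the fitting range.
    mask = (edc_db <= db_start) & (edc_db >= db_end)
    slope, _ = np.polyfit(t[mask], edc_db[mask], 1)  # slope in dB per second
    return -60.0 / slope


# Synthetic decay whose energy drops by 60 dB in 0.5 s (assumed value).
fs_demo = 16000
t_demo = np.arange(2 * fs_demo) / fs_demo
h_demo = np.exp(-3.0 * np.log(10.0) / 0.5 * t_demo)
print("Estimated RT60 (s):", rt60_from_impulse_response(h_demo, fs_demo))
```

Fitting over a limited range and extrapolating is a common workaround when the measured decay does not extend a full 60 dB above the noise floor. Real impulse responses are noisier than this synthetic envelope, so the estimate will be less exact in practice.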

Acknowledgements

This work was supported by the Regional Ministry of the Principality of Asturias under grant FC-GRUPIN-IDI/2018/000226, by the Ministry of Economy, Knowledge and University of the Government of the “Junta de Andalucía” under project P18-RT-1994, by the “Programa Operativo FEDER Andalucía 2014-2020” under the project with reference 1257914, and by the Pre-doctoral Fellowship Program of the “Ministerio de Ciencia, Innovación y Universidades” of Spain under reference BES-2016-078512.

Author information

Authors and Affiliations

Department of Telecommunication Engineering, Universidad de Jaén, Jaén, Spain
J. J. Carabias-Orti & S. García-Galán
Escuela de Ciencias Técnicas e Ingeniería, Universidad a Distancia de Madrid (UDIMA), Madrid, Spain
A. J. Muñoz-Montoro
Department of Computer Science, University of Oviedo, Oviedo, Spain
A. J. Muñoz-Montoro, R. Cortina & J. Ranilla

Corresponding author

Correspondence to A. J. Muñoz-Montoro.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Muñoz-Montoro, A.J., Carabias-Orti, J.J., Cortina, R. et al. Parallel multichannel blind source separation using a spatial covariance model and nonnegative matrix factorization. J Supercomput 77, 12143–12156 (2021). https://doi.org/10.1007/s11227-021-03771-y

Accepted: 22 March 2021
Published: 06 April 2021
Issue Date: October 2021
DOI: https://doi.org/10.1007/s11227-021-03771-y
