Underdetermined Reverberant Audio-Source Separation Through Improved Expectation–Maximization Algorithm

Xie, Yuan; Xie, Kan; Yang, Junjie; Wu, Zongze; Xie, Shengli

doi:10.1007/s00034-018-1011-5

Short Paper
Published: 02 January 2019

Volume 38, pages 2877–2889, (2019)
Circuits, Systems, and Signal Processing

Abstract

Underdetermined reverberant audio-source separation is an important problem in speech and audio processing. Many separation algorithms have been proposed to solve it, but they estimate the model parameters in the time–frequency domain, which leads to permutation ambiguity and poor separation performance. In addition, a crucial drawback of existing expectation–maximization (EM) algorithms is that updating the model parameters at each iteration is time-consuming. In this paper, we present an improved EM algorithm that combines nonnegative matrix factorization (NMF) with time difference of arrival (TDOA) estimation and reduces the time consumption by properly selecting the initial values of the EM algorithm. In the proposed algorithm, an NMF source model is used to avoid the permutation ambiguity problem, and acoustic localization is achieved by transforming the TDOA. The model parameters are then updated to obtain better separation results. Finally, the source signals are separated using Wiener filters. Experimental results show that the proposed algorithm achieves better source separation performance than existing blind separation methods.
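
The abstract gives no equations, so the following is only a minimal sketch of how an NMF source model, a TDOA-derived spatial model and a Wiener filter can fit together. It assumes the widely used local Gaussian formulation, in which each source has a time–frequency variance v_j(f,n) = [W_j H_j](f,n) and a spatial covariance R_j(f), and the source image is estimated as v_j R_j Σ_x⁻¹ x with Σ_x = Σ_j v_j R_j. The function names, the two-microphone steering vector, the regularization constant `eps` and the random test data are all illustrative assumptions and are not taken from the paper; the EM updates themselves are not shown.

```python
import numpy as np

def steering_covariance(tdoa, freqs, eps=1e-2):
    """Hypothetical initial spatial covariance for a two-microphone pair.

    tdoa  : time difference of arrival in seconds (assumed known)
    freqs : (F,) array of frequency-bin centres in Hz
    Returns an (F, 2, 2) array d d^H + eps * I, so each matrix is full rank.
    """
    d = np.stack([np.ones_like(freqs, dtype=complex),
                  np.exp(-2j * np.pi * freqs * tdoa)], axis=-1)  # (F, 2)
    R = d[:, :, None] * d[:, None, :].conj()                     # (F, 2, 2)
    return R + eps * np.eye(2)

def wiener_separate(X, V, R):
    """Multichannel Wiener filter under the assumed local Gaussian model.

    X : (F, N, I) complex mixture STFT
    V : (J, F, N) nonnegative source variances, e.g. NMF products W_j @ H_j
    R : (J, F, I, I) spatial covariance of each source
    Returns the (J, F, N, I) estimated source images.
    """
    # Mixture covariance: Sigma_x[f, n] = sum_j V[j, f, n] * R[j, f]
    Sigma = np.einsum('jfn,jfab->fnab', V, R)
    # Sigma_x^{-1} x for every time-frequency point (batched inverse)
    y = np.einsum('fnab,fnb->fna', np.linalg.inv(Sigma), X)
    # Source image estimate: V[j, f, n] * R[j, f] @ y[f, n]
    return np.einsum('jfn,jfab,fnb->jfna', V, R, y)

# Toy example: J = 3 sources, I = 2 microphones (underdetermined).
rng = np.random.default_rng(0)
F, N, J = 8, 5, 3
freqs = np.linspace(0.0, 8000.0, F)
tdoas = [-2e-4, 0.0, 2e-4]              # illustrative TDOAs in seconds
W = rng.random((J, F, 2)) + 0.1         # NMF spectral bases (2 components each)
H = rng.random((J, 2, N)) + 0.1         # NMF activations
V = W @ H                               # (J, F, N) source variances
R = np.stack([steering_covariance(t, freqs) for t in tdoas])
X = rng.standard_normal((F, N, 2)) + 1j * rng.standard_normal((F, N, 2))
C = wiener_separate(X, V, R)
```

Because no separate noise term is modelled in this sketch, the estimated source images sum back to the mixture at every time–frequency point, which gives a quick sanity check on the filter.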

Acknowledgements

The authors would like to thank the anonymous reviewers for their insightful comments and helpful critiques of the manuscript that helped improve this paper. This work was partially supported by the National Natural Science Foundation of China (Grants 613300032, 61773128, 61673126, U1701261). Additionally, this work was partially supported by the Postdoctoral Science Foundation of China, No. 2018M643022.

Author information

Authors and Affiliations

Guangdong University of Technology, Guangzhou, 510006, China
Yuan Xie, Kan Xie, Junjie Yang, Zongze Wu & Shengli Xie

Corresponding author

Correspondence to Shengli Xie.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Xie, Y., Xie, K., Yang, J. et al. Underdetermined Reverberant Audio-Source Separation Through Improved Expectation–Maximization Algorithm. Circuits Syst Signal Process 38, 2877–2889 (2019). https://doi.org/10.1007/s00034-018-1011-5

Received: 12 September 2018
Revised: 12 December 2018
Accepted: 16 December 2018
Published: 02 January 2019
Issue Date: 15 June 2019
DOI: https://doi.org/10.1007/s00034-018-1011-5
