Underdetermined blind source separation using CapsNet

Kumar, M.; Jayanthi, V. E.

doi:10.1007/s00500-019-04430-4

Methodologies and Application
Published: 15 October 2019

Soft Computing, Volume 24, pages 9011–9019 (2020)

Abstract

In this paper, we consider the problem of separating speech source signals from underdetermined convolutive mixtures using a capsule network (CapsNet). The objective is twofold: (1) to improve the underdetermined convolutive blind source separation algorithm in terms of signal-to-distortion ratio, signal-to-interference ratio and signal-to-artifact ratio; and (2) to reduce the computational burden of the algorithm so that it is useful for applications such as speech recognition systems. The time–frequency points of the observed mixture signals are the input to the first layer of the CapsNet. In the first layer, single-source active points (SSPs) are identified using the ratio of mixtures. These SSPs are the lower-level capsules in our system. In the second layer, cluster centers are found using a dynamic routing algorithm, and the resulting clusters are used to construct a binary mask. Finally, the algorithm solves the permutation problem by determining the correlation between the amplitudes of adjacent frequency bins. We test the algorithm on live-recorded mixture signals obtained in a real environment and on synthetically convolved mixture signals. The test results show the effectiveness of the proposed method compared with existing algorithms in terms of computational load, signal-to-distortion ratio and signal-to-interference ratio.
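The pipeline stages named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and all function names, parameters and thresholds are hypothetical. It assumes two mixtures given as complex short-time Fourier transforms, flags an SSP wherever the ratio of mixtures stays nearly constant over a few adjacent frames (one common reading of the ratio-of-mixtures idea), builds binary masks by nearest cluster center, and aligns permutations by correlating amplitude envelopes of adjacent frequency bins. The dynamic-routing clustering step is not reproduced here; the cluster centers are assumed to be given.

```python
import itertools
import numpy as np


def detect_ssp(X1, X2, win=3, tol=1e-2, eps=1e-12):
    """Flag time-frequency points where the ratio of mixtures is locally constant.

    X1, X2: complex STFTs of two mixtures, shape (freq_bins, frames).
    win, tol: hypothetical window length and variance threshold.
    Returns a boolean array of the same shape; the last win - 1 frames of
    every bin are left False because no full window starts there.
    """
    ratio = X2 / (X1 + eps)
    flags = np.zeros(ratio.shape, dtype=bool)
    for t in range(ratio.shape[1] - win + 1):
        # np.var of a complex array returns a real, non-negative spread.
        spread = np.var(ratio[:, t:t + win], axis=1)
        flags[:, t] = spread < tol
    return flags


def binary_masks(ratio_f, centres):
    """Assign each frame of one frequency bin to its nearest cluster center.

    ratio_f: complex mixture ratios for one bin, shape (frames,).
    centres: complex cluster centers, shape (n_sources,), assumed given.
    Returns boolean masks of shape (n_sources, frames).
    """
    dist = np.abs(ratio_f[None, :] - centres[:, None])
    labels = np.argmin(dist, axis=0)
    return labels[None, :] == np.arange(len(centres))[:, None]


def align_to_previous_bin(env_prev, env_cur):
    """Find the row order of env_cur that best matches env_prev.

    env_prev, env_cur: amplitude envelopes of the separated signals in two
    adjacent frequency bins, shape (n_sources, frames), rows non-constant.
    Returns a list perm such that env_cur[perm] lines up with env_prev.
    """
    n = env_prev.shape[0]
    # Cross-correlation block between rows of env_prev and rows of env_cur.
    corr = np.corrcoef(env_prev, env_cur)[:n, n:]
    best = max(
        itertools.permutations(range(n)),
        key=lambda p: sum(corr[i, p[i]] for i in range(n)),
    )
    return list(best)
```

The exhaustive search over permutations is only practical for a small number of sources, which is the usual case for speech mixtures; a larger problem would call for an assignment solver instead.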

Author information

Authors and Affiliations

Chettinad College of Engineering and Technology, Karur-Trichy Highways, Puliyur CF PO, Karur, Tamil Nadu, India
M. Kumar
PSNA College of Engineering and Technology, Kothandaraman Nagar, Dindigul, Tamil Nadu, India
V. E. Jayanthi

Corresponding author

Correspondence to M. Kumar.

Ethics declarations

Conflict of interest

Author M. Kumar declares that he has no conflict of interest. Author V. E. Jayanthi declares that she has no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Communicated by V. Loia.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Kumar, M., Jayanthi, V.E. Underdetermined blind source separation using CapsNet. Soft Comput 24, 9011–9019 (2020). https://doi.org/10.1007/s00500-019-04430-4

Published: 15 October 2019
Issue Date: June 2020
DOI: https://doi.org/10.1007/s00500-019-04430-4
