Single-Channel Source Separation of Audio Signals Using Bark Scale Wavelet Packet Decomposition

Litvin, Yevgeni; Cohen, Israel

doi:10.1007/s11265-010-0510-9

Published: 06 August 2010

Journal of Signal Processing Systems, Volume 65, pages 339–350 (2011)

Abstract

We address the problem of blind source separation from a single-channel audio signal using statistical models of the sources. We modify the Bark-scale-aligned wavelet packet decomposition to obtain an approximate shiftability property. We allow oversampling in some decomposition nodes so that all terminal nodes have the same sampling rate. A statistical model is trained for each source separately, from samples of that source, and the separation is performed using these models. The proposed psychoacoustically motivated non-uniform filterbank structure reduces the dimension of the signal space and simplifies the training procedure of the statistical models. In our experiments, the proposed algorithm outperforms a competing algorithm. We also study how the choice of wavelet family affects the performance of the proposed signal analysis in the single-channel source separation task.
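To make the dimension-reduction claim concrete, the following minimal sketch counts how many one-Bark-wide bands cover the audible range up to a given Nyquist frequency. It is an illustration only and not the authors' decomposition tree: the Zwicker-Terhardt approximation of the Bark scale, the 16 kHz sampling rate, and the helper names `hz_to_bark`, `count_bark_bands`, and `band_edges_hz` are all assumptions introduced here.

```python
import math


def hz_to_bark(f_hz):
    """Zwicker-Terhardt approximation of the Bark scale (assumed choice)."""
    return 13.0 * math.atan(0.00076 * f_hz) + 3.5 * math.atan((f_hz / 7500.0) ** 2)


def count_bark_bands(nyquist_hz):
    """Number of one-Bark-wide bands needed to cover 0..nyquist_hz."""
    return math.ceil(hz_to_bark(nyquist_hz))


def band_edges_hz(nyquist_hz, step_hz=1.0):
    """Approximate band edges in Hz by scanning for integer Bark crossings."""
    edges = [0.0]
    next_bark = 1
    f = 0.0
    while f <= nyquist_hz:
        if hz_to_bark(f) >= next_bark:
            edges.append(f)  # first scanned frequency at or above this Bark value
            next_bark += 1
        f += step_hz
    if edges[-1] < nyquist_hz:
        edges.append(nyquist_hz)  # close the last, partial band at Nyquist
    return edges


if __name__ == "__main__":
    nyquist = 8000.0  # hypothetical 16 kHz sampling rate
    edges = band_edges_hz(nyquist)
    print("Bark bands up to Nyquist:", count_bark_bands(nyquist))
    print("First band edges (Hz):", edges[:6])
```

Under these assumptions, the range up to 8 kHz is covered by a couple of dozen bands whose widths grow with frequency, which is far fewer than the number of bins in a typical uniform short-time spectrum. This is the sense in which a Bark-aligned filterbank yields a lower-dimensional signal representation.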

Author information

Authors and Affiliations

Department of Electrical Engineering, Technion—Israel Institute of Technology, Technion City, Haifa, 32000, Israel
Yevgeni Litvin & Israel Cohen

Corresponding author

Correspondence to Yevgeni Litvin.

Additional information

This work was supported by the Israel Science Foundation under Grant 1085/05 and by the European Commission under project Memories FP6-IST-035300.

About this article

Cite this article

Litvin, Y., Cohen, I. Single-Channel Source Separation of Audio Signals Using Bark Scale Wavelet Packet Decomposition. J Sign Process Syst 65, 339–350 (2011). https://doi.org/10.1007/s11265-010-0510-9

Received: 29 December 2009
Revised: 02 May 2010
Accepted: 20 July 2010
Published: 06 August 2010
Issue Date: December 2011
DOI: https://doi.org/10.1007/s11265-010-0510-9
