Monaural Source Separation

Jang, Gil-Jin; Lee, Te-Won

doi:10.1007/978-1-4020-6479-1_12

Part of the book series: Signals and Communication Technology (SCT)

This chapter discusses source separation methods for the case where only a single-channel observation is available. The problem is underdetermined, in that multiple source signals must be extracted from a single stream of observations. To overcome this mathematical intractability, prior information on the source characteristics is generally assumed and used to derive a source separation algorithm. This chapter describes one monaural source separation approach, which exploits a priori sets of time-domain basis functions learned by independent component analysis (ICA). The inherent time structure of sound sources is reflected in the ICA basis functions, which encode the sources in a statistically efficient manner. The source separation algorithm is derived in detail, given the observed single-channel data and the sets of basis functions. The prior knowledge provided by the basis functions and the associated coefficient densities makes it possible to infer the original source signals. A flexible model for density estimation allows accurate modeling of the observation, and the experimental results show a high level of separation performance for simulated mixtures as well as real-environment recordings of mixtures of two different sources.
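The idea of inferring two sources from one observation using per-source basis functions and coefficient densities can be illustrated with a toy sketch. This is not the chapter's algorithm: it assumes fixed orthonormal bases (an identity basis and a DCT basis) in place of learned ICA basis functions, a smoothed Laplacian prior in place of the chapter's flexible density model, and unit mixing gains. All function names (`dct_basis`, `log_prior`, `score`, `separate_frame`, `objective`) are hypothetical and chosen only for illustration.

```python
import numpy as np


def dct_basis(n):
    """Orthonormal DCT-II analysis matrix; each row is one filter."""
    idx = np.arange(n)
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (idx[None, :] + 0.5) * idx[:, None] / n)
    c[0, :] /= np.sqrt(2.0)
    return c


def log_prior(s, eps=1e-2):
    """Smoothed Laplacian log-density of coefficients, up to a constant.
    Assumption: stands in for a learned coefficient density."""
    return -np.sum(np.sqrt(s * s + eps))


def score(s, eps=1e-2):
    """Derivative of log_prior with respect to each coefficient."""
    return -s / np.sqrt(s * s + eps)


def objective(x1, y, w1, w2):
    """Joint log-prior of both sources under the constraint x1 + x2 = y."""
    return log_prior(w1 @ x1) + log_prior(w2 @ (y - x1))


def separate_frame(y, w1, w2, steps=500, lr=0.02):
    """Gradient ascent on x1; x2 is always the remainder y - x1."""
    x1 = 0.5 * y  # start by splitting the mixture evenly
    for _ in range(steps):
        x2 = y - x1
        # Chain rule: d/dx1 of log_prior(w1 x1) + log_prior(w2 (y - x1)).
        grad = w1.T @ score(w1 @ x1) - w2.T @ score(w2 @ x2)
        x1 = x1 + lr * grad
    return x1, y - x1


# Toy frame: source 1 is sparse in time, source 2 is sparse in the DCT domain.
n = 64
w1 = np.eye(n)
w2 = dct_basis(n)

true_x1 = np.zeros(n)
true_x1[[5, 30, 47]] = [1.0, -0.8, 0.6]

coeffs = np.zeros(n)
coeffs[[3, 9]] = [2.0, -1.5]
true_x2 = w2.T @ coeffs  # synthesis with the transpose of the analysis matrix

y = true_x1 + true_x2  # single-channel observation
x1_hat, x2_hat = separate_frame(y, w1, w2)

print("objective at even split:", objective(0.5 * y, y, w1, w2))
print("objective after ascent: ", objective(x1_hat, y, w1, w2))
```

Because the second source is defined as the remainder of the mixture, the two estimates always sum to the observation, and the search only has to decide how to split each sample between the two models. The step size is kept small relative to the curvature of the smoothed prior so that each update does not decrease the objective.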

Author information

Authors and Affiliations

Institute for Neural Computation, University of California, San Diego, 9500 Gilman Drive, 0523, 92093-0523, La Jolla, CA, USA
Gil-Jin Jang & Te-Won Lee

Editor information

Editors and Affiliations

NTT Corporation, 2-4 Hikaridai, 619-0237, Soraku-gun, Kyoto, Japan
Shoji Makino & Hiroshi Sawada
University of California, San Diego, 9500 Gilman Drive, 0523, 92093-0523, La Jolla, CA, USA
Te-Won Lee

About this chapter

Cite this chapter

Jang, GJ., Lee, TW. (2007). Monaural Source Separation. In: Makino, S., Sawada, H., Lee, TW. (eds) Blind Speech Separation. Signals and Communication Technology. Springer, Dordrecht. https://doi.org/10.1007/978-1-4020-6479-1_12

DOI: https://doi.org/10.1007/978-1-4020-6479-1_12
Publisher Name: Springer, Dordrecht
Print ISBN: 978-1-4020-6478-4
Online ISBN: 978-1-4020-6479-1
eBook Packages: Engineering, Engineering (R0)
