Probabilistic Decompositions of Spectra for Sound Separation

Smaragdis, Paris

doi:10.1007/978-1-4020-6479-1_13

Chapter

Part of the book series: Signals and Communication Technology (SCT)

In this chapter we present a decomposition algorithm within a probabilistic framework, along with extensions that directly manipulate sparsity and introduce invariances. We show that this decomposition admits probabilistic analyses that break mixtures of sounds into fundamental building components, which in turn facilitate separation. We present some of these analyses and demonstrate their utility in a variety of sound separation scenarios, ranging from the completely blind case to the case where models of the sources are available.

Author information

Authors and Affiliations

Mitsubishi Electric Research Laboratories, 02139, Cambridge, MA, USA
Paris Smaragdis

Editor information

Editors and Affiliations

NTT Corporation, 2-4 Hikaridai, 619-0237, Soraku-gun, Kyoto, Japan
Shoji Makino & Hiroshi Sawada
University of California, San Diego, 9500 Gilman Drive, 0523, 92093-0523, La Jolla, CA, USA
Te-Won Lee

About this chapter

Cite this chapter

Smaragdis, P. (2007). Probabilistic Decompositions of Spectra for Sound Separation. In: Makino, S., Sawada, H., Lee, TW. (eds) Blind Speech Separation. Signals and Communication Technology. Springer, Dordrecht. https://doi.org/10.1007/978-1-4020-6479-1_13

DOI: https://doi.org/10.1007/978-1-4020-6479-1_13
Publisher Name: Springer, Dordrecht
Print ISBN: 978-1-4020-6478-4
Online ISBN: 978-1-4020-6479-1
eBook Packages: Engineering, Engineering (R0)
