In this chapter we present a decomposition algorithm within a probabilistic framework, along with extensions that directly manipulate sparsity and introduce invariances. We show that this decomposition permits probabilistic analyses that break mixtures of sounds into fundamental building components, which in turn facilitate separation. We describe some of these analyses and demonstrate their utility on a variety of sound separation scenarios, ranging from the completely blind case to the case where models of the sources are available.
Copyright information
© 2007 Springer
About this chapter
Cite this chapter
Smaragdis, P. (2007). Probabilistic Decompositions of Spectra for Sound Separation. In: Makino, S., Sawada, H., Lee, TW. (eds) Blind Speech Separation. Signals and Communication Technology. Springer, Dordrecht. https://doi.org/10.1007/978-1-4020-6479-1_13
DOI: https://doi.org/10.1007/978-1-4020-6479-1_13
Publisher Name: Springer, Dordrecht
Print ISBN: 978-1-4020-6478-4
Online ISBN: 978-1-4020-6479-1
eBook Packages: Engineering (R0)