Probabilistic Decompositions of Spectra for Sound Separation

Part of the book series: Signals and Communication Technology (SCT)

In this chapter we present a decomposition algorithm within a probabilistic framework, along with some of its extensions that directly manipulate sparsity and introduce invariances. We show that this decomposition supports probabilistic analyses that break mixtures of sounds into fundamental building components, which in turn facilitate separation. We present some of these analyses and demonstrate their utility in a variety of sound separation scenarios, ranging from the completely blind case to the case where models of the sources are available.

Copyright information

© 2007 Springer

About this chapter

Cite this chapter

Smaragdis, P. (2007). Probabilistic Decompositions of Spectra for Sound Separation. In: Makino, S., Sawada, H., Lee, TW. (eds) Blind Speech Separation. Signals and Communication Technology. Springer, Dordrecht. https://doi.org/10.1007/978-1-4020-6479-1_13

  • DOI: https://doi.org/10.1007/978-1-4020-6479-1_13

  • Publisher Name: Springer, Dordrecht

  • Print ISBN: 978-1-4020-6478-4

  • Online ISBN: 978-1-4020-6479-1

  • eBook Packages: Engineering, Engineering (R0)
