Abstract
Non-negative spectrogram factorization has been proposed for single-channel source separation tasks. These methods operate on the magnitude or power spectrogram of the input mixture and estimate the magnitude or power spectrogram of source components. The usual assumption is that the mixture spectrogram is well approximated by the sum of source components. However, this relationship additionally depends on the unknown phase of the sources. Using a probabilistic representation of phase, we derive a cost function that incorporates this uncertainty. We compare this cost function against four standard approaches for a variety of spectrogram sizes, numbers of components, and component distributions. This phase-aware cost function reduces the estimation error but is more affected by detection errors.
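The dependence on phase mentioned above can be illustrated for a single time-frequency bin. For two complex components with magnitudes a and b and phase difference θ, the mixture power is a² + b² + 2ab·cos θ, so the additive assumption holds exactly only when the cross term vanishes. The sketch below is a minimal illustration of that identity, not the cost function derived in the paper. The function names are hypothetical, and drawing the phase difference uniformly from [-π, π) is an assumption made here for illustration only.

```python
import cmath
import math
import random


def mixture_power(mag_a, mag_b, phase_diff):
    """Power of the sum of two complex bins with the given magnitudes
    and phase difference (in radians)."""
    a = cmath.rect(mag_a, 0.0)
    b = cmath.rect(mag_b, phase_diff)
    return abs(a + b) ** 2


def additive_power(mag_a, mag_b):
    """Power predicted by the usual assumption that source powers add."""
    return mag_a ** 2 + mag_b ** 2


def cross_term(mag_a, mag_b, phase_diff):
    """Phase-dependent term that the additive assumption ignores."""
    return 2.0 * mag_a * mag_b * math.cos(phase_diff)


def mean_mixture_power(mag_a, mag_b, trials=20000, seed=0):
    """Average mixture power when the phase difference is drawn uniformly
    from [-pi, pi). The uniform prior is an illustrative assumption."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        theta = rng.uniform(-math.pi, math.pi)
        total += mixture_power(mag_a, mag_b, theta)
    return total / trials


if __name__ == "__main__":
    a, b = 3.0, 4.0
    for theta in (0.0, math.pi / 2, math.pi):
        print(theta, mixture_power(a, b, theta),
              additive_power(a, b) + cross_term(a, b, theta))
    print("mean over random phase:", mean_mixture_power(a, b))
```

Under this assumed uniform phase, the additive model is correct on average, but any individual bin can deviate from it by as much as ±2ab. That per-bin deviation is the kind of uncertainty a phase-aware cost function has to account for.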
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Parry, R.M., Essa, I. (2007). Phase-Aware Non-negative Spectrogram Factorization. In: Davies, M.E., James, C.J., Abdallah, S.A., Plumbley, M.D. (eds) Independent Component Analysis and Signal Separation. ICA 2007. Lecture Notes in Computer Science, vol 4666. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74494-8_67
DOI: https://doi.org/10.1007/978-3-540-74494-8_67
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-74493-1
Online ISBN: 978-3-540-74494-8
eBook Packages: Computer Science (R0)