Abstract
In this paper we present a method for separating polyphonic music sources from a monaural mixture. The underlying assumption is that the harmonic structure of a musical instrument remains roughly the same even when it is played at different pitches and recorded in different mixing environments. We incorporate nonnegativity, shift-invariance, and sparseness to select representative spectral basis vectors, which are then used to restore the music sources from their monaural mixture. Experimental results on a monaural instantaneous mixture of voice and cello, and on a monaural convolutive mixture of saxophone and viola, confirm the validity of the proposed method.
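The two ideas the abstract leans on, shift-invariance of a harmonic structure and nonnegative spectral bases, can be illustrated with a small sketch. This is not the authors' algorithm: it assumes a log-frequency axis (so that a pitch change becomes a translation of the spectrum), uses the standard multiplicative NMF updates of Lee and Seung for the Euclidean cost, and omits the sparseness-based basis selection entirely. The function names `harmonic_template`, `shift_up`, and `nmf`, and all parameter values, are hypothetical choices made for illustration.

```python
import numpy as np

def harmonic_template(n_bins, f0_bin, n_harmonics=5, bins_per_octave=12):
    """Nonnegative toy spectrum on a log-frequency axis (assumed layout).

    Harmonic h of a note sits log2(h) octaves above the fundamental,
    so its offset from f0_bin does not depend on the pitch itself.
    """
    spectrum = np.zeros(n_bins)
    for h in range(1, n_harmonics + 1):
        k = f0_bin + int(round(bins_per_octave * np.log2(h)))
        if k < n_bins:
            spectrum[k] += 1.0 / h  # decaying harmonic amplitudes (assumption)
    return spectrum

def shift_up(spectrum, n_shift):
    """Translate a spectrum toward higher bins, zero-filling at the bottom."""
    out = np.zeros_like(spectrum)
    if n_shift == 0:
        out[:] = spectrum
    else:
        out[n_shift:] = spectrum[:-n_shift]
    return out

def nmf(V, rank, n_iter=200, seed=0, eps=1e-9):
    """Plain NMF, V ~ W @ H, via multiplicative updates (Euclidean cost)."""
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + eps
    H = rng.random((rank, V.shape[1])) + eps
    errors = []
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
        errors.append(np.linalg.norm(V - W @ H))
    return W, H, errors

# Toy "spectrogram": one instrument template played at two pitches.
base = harmonic_template(64, f0_bin=5)
W_true = np.column_stack([base, shift_up(base, 7)])
H_true = np.random.default_rng(1).random((2, 30))
V = W_true @ H_true

W, H, errors = nmf(V, rank=2)
```

In this toy setting the second basis vector is just a translated copy of the first, which is the kind of redundancy a shift-invariant model can exploit: one harmonic template per instrument, plus a set of shifts, instead of one basis vector per pitch.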
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kim, M., Choi, S. (2006). Monaural Music Source Separation: Nonnegativity, Sparseness, and Shift-Invariance. In: Rosca, J., Erdogmus, D., Príncipe, J.C., Haykin, S. (eds) Independent Component Analysis and Blind Signal Separation. ICA 2006. Lecture Notes in Computer Science, vol 3889. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11679363_77
DOI: https://doi.org/10.1007/11679363_77
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-32630-4
Online ISBN: 978-3-540-32631-1
eBook Packages: Computer Science, Computer Science (R0)