Abstract
In this paper, we describe a method for recognizing sound sources in a mixture. While many audio-based content analysis methods focus on detecting or classifying target sounds in a discriminative manner, we approach this as a regression problem, in which we estimate the relative proportions of sound sources in the given mixture. Using source separation ideas based on probabilistic latent component analysis, we directly estimate these proportions from the mixture without actually separating the sources. We also introduce a method for learning a transition matrix to temporally constrain the problem. We demonstrate the proposed method on a mixture of five classes of sounds and show that it is quite effective in correctly estimating the relative proportions of the sounds in the mixture.
This is a preview of subscription content, log in via an institution.
Buying options
Tax calculation will be finalised at checkout
Purchases are for personal use only
Learn about institutional subscriptionsPreview
Unable to display preview. Download preview PDF.
References
Radhakrishnan, R., Xiong, Z., Otsuka, I.: A Content-Adaptive Analysis and Representation Framework for Audio Event Discovery from Unscripted Multimedia. EURASIP Journal on Applied Signal Processing, 1–24 (2006)
Li, Y., Dorai, C.: Instructional Video Content Analysis Using Audio Information. IEEE TASLP 14(6) (2006)
Tran, H.D., Li, H.: Sound Event Recognition With Probabilistic Distance SVMs. IEEE TASLP 19(6) (2011)
Smaragdis, P., Raj, B., Shashanka, M.: A probabilistic latent variable model for acoustic modeling. In: Advances in Models for Acoustic Processing, NIPS (2006)
Smaragdis, P., Raj, B., Shashanka, M.: Supervised and Semi-Supervised Separation of Sounds from Single-Channel Mixtures. In: Davies, M.E., James, C.J., Abdallah, S.A., Plumbley, M.D. (eds.) ICA 2007. LNCS, vol. 4666, pp. 414–421. Springer, Heidelberg (2007)
Author information
Authors and Affiliations
Editor information
Rights and permissions
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Nam, J., Mysore, G.J., Smaragdis, P. (2012). Sound Recognition in Mixtures. In: Theis, F., Cichocki, A., Yeredor, A., Zibulevsky, M. (eds) Latent Variable Analysis and Signal Separation. LVA/ICA 2012. Lecture Notes in Computer Science, vol 7191. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28551-6_50
Download citation
DOI: https://doi.org/10.1007/978-3-642-28551-6_50
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-28550-9
Online ISBN: 978-3-642-28551-6
eBook Packages: Computer ScienceComputer Science (R0)