Abstract
In this contribution we present a new method for identifying novelty and, more specifically, sound objects within texture sounds. We introduce the notions of texture sound and sound object and explain how the properties of a sound that is known to be textural may be exploited to detect deviations that suggest the presence of novelty or of a distinct sound event, which may then be called a sound object. The suggested approach is based on Gabor multipliers, which map the Gabor coefficients corresponding to certain time segments of the signal to each other. We present the results of simulations based on both synthetic and real audio signals.
This research was supported by the Vienna Science and Technology Fund (WWTF) through project VRG12-009 and the Austrian Science Fund (FWF):[V 312-N25].
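The idea stated in the abstract, a multiplier that maps the Gabor coefficients of one time segment to those of another, can be illustrated with a minimal sketch. It is not the estimation procedure of the paper: it assumes a pointwise mask in the coefficient domain and a simple quadratic penalty pulling the mask towards 1, in place of a sparsity-promoting estimate. The function names `gabor_coefficients` and `estimate_mask`, the window length, the hop size, the value of `lam`, and the toy signals (the same noise texture observed with and without a sinusoidal burst) are all hypothetical choices made for illustration.

```python
import numpy as np

def gabor_coefficients(x, win_len=256, hop=64):
    """Sampled short-time Fourier (Gabor) coefficients, shape (frames, bins)."""
    window = np.hanning(win_len)
    starts = range(0, len(x) - win_len + 1, hop)
    frames = np.stack([x[i:i + win_len] * window for i in starts])
    return np.fft.rfft(frames, axis=1)

def estimate_mask(c_ref, c_obs, lam=1.0):
    """Pointwise minimizer of |c_obs - m*c_ref|^2 + lam*|m - 1|^2 (assumed penalty)."""
    return (np.conj(c_ref) * c_obs + lam) / (np.abs(c_ref) ** 2 + lam)

# Toy data (assumption): one noise texture, observed once without and once
# with a sinusoidal burst between samples 2000 and 3000.
rng = np.random.default_rng(0)
n = np.arange(4096)
texture = rng.standard_normal(n.size)
burst = np.where((n >= 2000) & (n < 3000), np.sin(2 * np.pi * 40 * n / 256), 0.0)

c_ref = gabor_coefficients(texture)
c_obs = gabor_coefficients(texture + burst)

# Deviation of the mask from 1 serves as a crude novelty map.
novelty = np.abs(estimate_mask(c_ref, c_obs) - 1.0)
frame, freq_bin = np.unravel_index(np.argmax(novelty), novelty.shape)
print("largest deviation at frame", frame, "bin", freq_bin)
```

In this toy setting the mask equals 1 in every frame that does not overlap the burst, so the novelty map is non-zero only where the added event is present. With two independent texture realizations a pointwise ratio of this kind would fluctuate strongly, which is one reason to prefer a regularized or sparse estimate.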
Notes
- 1.
\(\langle g,h\rangle _{L^2}\) denotes the \(L^2\)-inner product, defined as \(\langle g,h\rangle _{L^2} = \int g(t)\overline{h(t)}\,dt\).
- 2.
Note that the resulting visual representation is also known as a spectrogram. However, for the reconstruction discussed in the previous section, the phase factors of the coefficients are crucial and cannot be omitted.
- 3.
We define the signal-to-noise ratio (SNR), given in dB, by \(SNR_{dB} = 10\log _{10} (\Vert s\Vert _2^2 /\Vert f\Vert ^2_2)\), where \(f\) is the background signal, which can be seen as “noise”, and \(s\) is the sound object to be traced in it.
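The SNR definition in note 3 can be computed directly for sampled signals. The following is a small sketch under the assumption that \(s\) and \(f\) are given as finite sequences of real samples; the function name `snr_db` and the toy signals are hypothetical and serve only as illustration.

```python
import math

def snr_db(sound_object, background):
    """Return 10*log10(||s||_2^2 / ||f||_2^2) for sample sequences s and f."""
    energy_s = sum(x * x for x in sound_object)
    energy_f = sum(x * x for x in background)
    if energy_s == 0 or energy_f == 0:
        raise ValueError("both signals must have non-zero energy")
    return 10.0 * math.log10(energy_s / energy_f)

# Toy signals (assumption): a background and an object with one tenth of its amplitude.
background = [math.sin(0.3 * n) for n in range(1000)]
sound_object = [0.1 * x for x in background]

# An amplitude ratio of 0.1 corresponds to an energy ratio of 0.01, i.e. -20 dB.
print(round(snr_db(sound_object, background), 6))
```

A negative value therefore means that the sound object carries less energy than the texture in which it is embedded.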
Acknowledgments
We would like to thank Anaïk Olivero for sharing code for computing Gabor masks, and Richard Kronland-Martinet and his team at LMA, CNRS Marseille, for giving us permission to use their software SPAD.
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Dörfler, M., Matusiak, E. (2014). Sparse Gabor Multiplier Estimation for Identification of Sound Objects in Texture Sound. In: Aramaki, M., Derrien, O., Kronland-Martinet, R., Ystad, S. (eds) Sound, Music, and Motion. CMMR 2013. Lecture Notes in Computer Science, vol 8905. Springer, Cham. https://doi.org/10.1007/978-3-319-12976-1_26
DOI: https://doi.org/10.1007/978-3-319-12976-1_26
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-12975-4
Online ISBN: 978-3-319-12976-1
eBook Packages: Computer Science, Computer Science (R0)