Abstract
We describe a new method for recognizing notes from monophonic audio, such as sung or whistled queries. Our method achieves results similar to those of known methods, but without any probabilistic models that would need to be trained. Instead, we define a distance function for audio frames that captures three criteria of closeness which usually coincide with frames belonging to the same note: a small pitch difference, small loudness fluctuations between the frames, and the absence of non-pitched frames between the compared frames. We use this distance function to cluster frames such that the total intra-cluster costs are minimized. Criteria for terminating the clustering include the uniformity of note costs. The method is fast, does not depend on any particular fundamental frequency estimation method, and is largely independent of the input mode (singing, whistling, playing an instrument). It is already being used successfully for the “query by humming/whistling/playing” search feature on the publicly available collaborative melody directory Musipedia.org.
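The abstract names the three ingredients of the frame distance but not its formula, so the following is only a minimal sketch under stated assumptions, not the paper's method. The `Frame` type, the weights `w_pitch`, `w_loud`, `w_gap`, the use of the loudness range as a fluctuation measure, and the threshold are all hypothetical. The grouping step is a simple greedy sweep swapped in for illustration; it does not minimize total intra-cluster cost and does not implement the termination criteria the abstract mentions.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Frame:
    """One analysis frame (hypothetical representation)."""
    pitch: Optional[float]  # pitch in semitones; None for a non-pitched frame
    loudness: float         # loudness in dB


def frame_distance(frames: List[Frame], i: int, j: int,
                   w_pitch: float = 1.0, w_loud: float = 0.1,
                   w_gap: float = 5.0) -> float:
    """Assumed distance between pitched frames i <= j.

    Combines the three criteria from the abstract as a weighted sum:
    pitch difference, loudness fluctuation across the span, and the
    number of non-pitched frames lying between the two frames.
    """
    a, b = frames[i], frames[j]
    if a.pitch is None or b.pitch is None:
        raise ValueError("both compared frames must be pitched")
    span = frames[i:j + 1]
    pitch_term = abs(a.pitch - b.pitch)
    loudness_values = [f.loudness for f in span]
    loud_term = max(loudness_values) - min(loudness_values)
    gap_term = sum(1 for f in span if f.pitch is None)
    return w_pitch * pitch_term + w_loud * loud_term + w_gap * gap_term


def segment_notes(frames: List[Frame],
                  threshold: float = 1.0) -> List[Tuple[int, int]]:
    """Greedy stand-in for the clustering step.

    Sweeps left to right and starts a new note whenever the distance from
    the current note's first frame exceeds the threshold. Returns inclusive
    (first_frame, last_frame) index pairs, one per note.
    """
    notes: List[Tuple[int, int]] = []
    start: Optional[int] = None
    prev = 0
    for k, frame in enumerate(frames):
        if frame.pitch is None:
            continue  # non-pitched frames never belong to a note
        if start is None:
            start = k
        elif frame_distance(frames, start, k) > threshold:
            notes.append((start, prev))
            start = k
        prev = k
    if start is not None:
        notes.append((start, prev))
    return notes
```

With weights like these, two frames separated by a non-pitched gap end up far apart even if their pitches match, which is what lets repeated notes of the same pitch be split.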
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Typke, R. (2011). Note Recognition from Monophonic Audio: A Clustering Approach. In: Detyniecki, M., García-Serrano, A., Nürnberger, A. (eds) Adaptive Multimedia Retrieval. Understanding Media and Adapting to the User. AMR 2009. Lecture Notes in Computer Science, vol 6535. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-18449-9_5
DOI: https://doi.org/10.1007/978-3-642-18449-9_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-18448-2
Online ISBN: 978-3-642-18449-9
eBook Packages: Computer Science (R0)