Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6535)

Abstract

We describe a new method for recognizing notes from monophonic audio, such as sung or whistled queries. Our method achieves results similar to known methods, but without any probabilistic models that would need to be trained. Instead, we define a distance function for audio frames that captures three criteria of closeness which usually coincide with frames belonging to the same note: small pitch difference, small loudness fluctuations between the frames, and the absence of non-pitched frames between the compared frames. We use this distance function for clustering frames such that the total intra-cluster costs are minimized. Criteria for clustering termination include the uniformity of note costs. This new method is fast, does not rely on any particular fundamental frequency estimation method being used, and it is largely independent of the input mode (singing, whistling, playing an instrument). It is already being used successfully for the “query by humming/whistling/playing” search feature on the publicly available collaborative melody directory Musipedia.org.
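
The abstract names the ingredients of the frame distance but not its formula, so the following is only a minimal sketch of the idea under stated assumptions, not the method from the paper. The function names (`frame_distance`, `group_notes`), the weights, the threshold, and the frame representation (pitch in semitones or `None` for a non-pitched frame, plus a loudness value) are all hypothetical. The grouping step is a simple greedy pass that stands in for the cost-minimizing clustering and its termination criteria, which the abstract does not specify.

```python
from typing import List, Optional, Sequence, Tuple

# Hypothetical frame representation: (pitch in semitones, or None if the
# frame is non-pitched; loudness in arbitrary units such as dB).
Frame = Tuple[Optional[float], float]


def frame_distance(frames: Sequence[Frame], i: int, j: int,
                   w_pitch: float = 1.0, w_loud: float = 0.1,
                   w_gap: float = 2.0) -> float:
    """Assumed distance between two pitched frames.

    Combines the three criteria listed in the abstract with made-up weights:
    pitch difference, loudness fluctuation between the frames, and the
    number of non-pitched frames lying between them.
    """
    if i > j:
        i, j = j, i
    p_i, p_j = frames[i][0], frames[j][0]
    if p_i is None or p_j is None:
        raise ValueError("distance is only defined for pitched frames")
    pitch_term = abs(p_i - p_j)
    loudness = [f[1] for f in frames[i:j + 1]]
    loud_term = max(loudness) - min(loudness)
    gap_term = sum(1 for f in frames[i + 1:j] if f[0] is None)
    return w_pitch * pitch_term + w_loud * loud_term + w_gap * gap_term


def group_notes(frames: Sequence[Frame],
                threshold: float = 1.5) -> List[List[int]]:
    """Greedy stand-in for the clustering step (an assumption).

    A pitched frame joins the current note while its distance to the note's
    first frame stays within the threshold; otherwise a new note starts.
    Non-pitched frames are never assigned to a note.
    """
    notes: List[List[int]] = []
    current: List[int] = []
    for idx, (pitch, _) in enumerate(frames):
        if pitch is None:
            continue
        if current and frame_distance(frames, current[0], idx) > threshold:
            notes.append(current)
            current = []
        current.append(idx)
    if current:
        notes.append(current)
    return notes


if __name__ == "__main__":
    example = [(60.0, 70), (60.1, 71), (60.0, 70),
               (None, 40), (62.0, 69), (62.1, 70)]
    print(group_notes(example))
```

In this toy input, the non-pitched frame and the two-semitone jump both push the distance above the threshold, so the frames before and after it end up in separate notes.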

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Typke, R. (2011). Note Recognition from Monophonic Audio: A Clustering Approach. In: Detyniecki, M., García-Serrano, A., Nürnberger, A. (eds) Adaptive Multimedia Retrieval. Understanding Media and Adapting to the User. AMR 2009. Lecture Notes in Computer Science, vol 6535. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-18449-9_5

  • DOI: https://doi.org/10.1007/978-3-642-18449-9_5

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-18448-2

  • Online ISBN: 978-3-642-18449-9

  • eBook Packages: Computer Science, Computer Science (R0)
