On Musical Performances Identification, Entropy and String Matching

  • Conference paper
MICAI 2006: Advances in Artificial Intelligence (MICAI 2006)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4293)

Abstract

In this paper we address the problem of matching different musical renditions of the same piece of music, also known as performances. We use an entropy-based audio fingerprint (AFP) that yields a framed, small-footprint representation and reduces the problem to string matching. The entropy AFP has very low resolution (750 ms per symbol), making it suitable for flexible string matching.
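To give a sense of scale, the following minimal sketch works out how many symbols a recording would produce at the stated resolution, and how large a dynamic-programming table comparing two such strings would be. It assumes one symbol per non-overlapping 750 ms frame, which the abstract does not confirm; the function names `symbols_for` and `dp_cells` are hypothetical.

```python
import math

# Resolution stated in the abstract: one symbol every 750 ms.
SYMBOL_SECONDS = 0.750


def symbols_for(duration_seconds: float) -> int:
    """Number of whole symbols for a recording of the given length.

    Assumption: frames do not overlap, so each symbol covers exactly
    SYMBOL_SECONDS of audio.
    """
    return math.floor(duration_seconds / SYMBOL_SECONDS)


def dp_cells(duration_a: float, duration_b: float) -> int:
    """Cells in a full (n + 1) x (m + 1) dynamic-programming table."""
    return (symbols_for(duration_a) + 1) * (symbols_for(duration_b) + 1)


if __name__ == "__main__":
    # Two three-minute recordings.
    print(symbols_for(180.0))      # 240
    print(dp_cells(180.0, 180.0))  # 58081
```

Under that assumption, two three-minute recordings become strings of 240 symbols each, so a quadratic-time comparison touches only tens of thousands of table cells.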

We show experimental results using dynamic time warping (DTW), the Levenshtein (edit) distance, and the Longest Common Subsequence (LCS) distance. We correctly identify (100%) different renditions of masterpieces as well as pop music in less than a second per comparison.
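The three measures named above can be sketched with textbook dynamic programming. This is an illustration, not the authors' implementation: the LCS distance is assumed here to be `len(a) + len(b) - 2 * LCS(a, b)`, and DTW is assumed to use the absolute difference between numeric symbols as its local cost. The abstract specifies neither.

```python
from typing import Sequence


def levenshtein(a: Sequence, b: Sequence) -> int:
    """Edit distance with unit-cost insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def lcs_length(a: Sequence, b: Sequence) -> int:
    """Length of the longest common subsequence of a and b."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcs_distance(a: Sequence, b: Sequence) -> int:
    """Assumed definition: symbols left unmatched by the LCS."""
    return len(a) + len(b) - 2 * lcs_length(a, b)


def dtw(a: Sequence[float], b: Sequence[float]) -> float:
    """Classic DTW cost with |x - y| as the (assumed) local cost."""
    inf = float("inf")
    prev = [0.0] + [inf] * len(b)
    for x in a:
        curr = [inf]
        for j, y in enumerate(b, start=1):
            step = min(prev[j], curr[j - 1], prev[j - 1])
            curr.append(abs(x - y) + step)
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))  # 3
    print(lcs_distance("ABCBDAB", "BDCABA"))  # 5
    # A held note (the repeated 2) costs nothing under DTW.
    print(dtw([1, 2, 3], [1, 2, 2, 3]))       # 0.0
```

Each function keeps only two rows of the table, so memory grows with the length of one string rather than with the product of both lengths.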

All three approaches are 100% effective, but LCS and Levenshtein can be computed online, which makes them suitable for monitoring applications (unlike DTW); since they are distances, a metric index could be used to speed up the recognition process.
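One way to read "computed online" is that the edit-distance table can be advanced one column at a time as fingerprint symbols arrive from a monitored stream. The sketch below uses the standard approximate-matching variant in which a match may begin anywhere in the stream. It is offered under that assumption and is not taken from the paper; the class name `OnlineMatcher` is hypothetical.

```python
from typing import Iterable, List


class OnlineMatcher:
    """Streams symbols and reports how closely the stream's recent
    symbols match a stored pattern, using unit-cost edit operations."""

    def __init__(self, pattern: Iterable) -> None:
        self.pattern = list(pattern)
        # column[i] = fewest edits aligning pattern[:i] with some
        # suffix of the stream seen so far (empty stream at the start).
        self.column: List[int] = list(range(len(self.pattern) + 1))

    def push(self, symbol) -> int:
        """Consume one stream symbol; return the current best score
        for an occurrence of the whole pattern ending at this symbol."""
        prev = self.column
        curr = [0]  # an empty pattern prefix matches anywhere for free
        for i, p in enumerate(self.pattern, start=1):
            cost = 0 if p == symbol else 1
            curr.append(min(prev[i] + 1,          # extra stream symbol
                            curr[i - 1] + 1,      # missing pattern symbol
                            prev[i - 1] + cost))  # match or substitute
        self.column = curr
        return curr[-1]


if __name__ == "__main__":
    matcher = OnlineMatcher("abc")
    scores = [matcher.push(s) for s in "xxabcxx"]
    print(scores)  # [3, 3, 2, 1, 0, 1, 2]
```

Each incoming symbol costs time proportional to the pattern length, and the score drops to zero at the position where the pattern ends in the stream. A monitoring application could report a match whenever the score falls below a chosen threshold.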

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Camarena-Ibarrola, A., Chávez, E. (2006). On Musical Performances Identification, Entropy and String Matching. In: Gelbukh, A., Reyes-Garcia, C.A. (eds) MICAI 2006: Advances in Artificial Intelligence. MICAI 2006. Lecture Notes in Computer Science, vol 4293. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11925231_91

  • DOI: https://doi.org/10.1007/11925231_91

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-49026-5

  • Online ISBN: 978-3-540-49058-6

  • eBook Packages: Computer Science, Computer Science (R0)
