On Musical Performances Identification, Entropy and String Matching

Camarena-Ibarrola, Antonio; Chávez, Edgar

doi:10.1007/11925231_91

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4293)

Included in the following conference series:

Mexican International Conference on Artificial Intelligence

Abstract

In this paper we address the problem of matching different renditions, also known as performances, of the same piece of music. We use an entropy-based audio fingerprint (AFP) that delivers a framed, small-footprint representation, which reduces the task to a string matching problem. The entropy AFP has very low resolution (750 ms per symbol), making it suitable for flexible string matching.
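The abstract does not spell out how the fingerprint is computed, so the following is only a rough sketch of the general idea of turning audio into a short symbol string, one symbol per 750 ms frame. Everything beyond the 750 ms figure is an assumption made for illustration: the amplitude-histogram entropy, the `bins` parameter, the one-bit "entropy rose or fell" symbol, and the function names `frame_entropy` and `entropy_string`. It should not be read as the authors' AFP.

```python
import math
from collections import Counter

def frame_entropy(samples, bins=16):
    """Shannon entropy (in bits) of an amplitude histogram.

    Assumption: samples are floats in [-1.0, 1.0]; values outside are clamped.
    """
    if not samples:
        return 0.0
    def bucket(s):
        # Map [-1, 1] onto histogram bins 0 .. bins-1.
        return max(0, min(int((s + 1.0) / 2.0 * bins), bins - 1))
    counts = Counter(bucket(s) for s in samples)
    n = len(samples)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def entropy_string(signal, sample_rate, frame_seconds=0.75, bins=16):
    """One symbol per frame boundary: '1' if entropy rose, '0' otherwise.

    Hypothetical symbolization chosen for illustration only.
    """
    frame_len = int(sample_rate * frame_seconds)
    if frame_len <= 0:
        return ""
    entropies = [
        frame_entropy(signal[i:i + frame_len], bins)
        for i in range(0, len(signal) - frame_len + 1, frame_len)
    ]
    return "".join("1" if b > a else "0" for a, b in zip(entropies, entropies[1:]))
```

With frames this coarse, a recording of a few minutes collapses to a string of a few hundred symbols, which is what makes quadratic-time string comparison affordable.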

We show experimental results using dynamic time warping (DTW), the Levenshtein (edit) distance, and the Longest Common Subsequence (LCS) distance. We correctly identify (100%) different renditions of masterpieces as well as pop music in less than a second per comparison.
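For readers unfamiliar with the three measures, a minimal textbook sketch on symbol strings follows. Several choices here are assumptions, since the abstract does not give the exact definitions used in the experiments: the LCS distance is taken as `len(a) + len(b) - 2 * LCS(a, b)`, and DTW uses a 0/1 mismatch cost between symbols.

```python
def levenshtein(a, b):
    """Edit distance with unit-cost insertion, deletion and substitution."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # match or substitute
        prev = cur
    return prev[-1]

def lcs_length(a, b):
    """Length of the longest common subsequence of a and b."""
    prev = [0] * (len(b) + 1)
    for ca in a:
        cur = [0]
        for j, cb in enumerate(b, 1):
            cur.append(prev[j - 1] + 1 if ca == cb else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]

def lcs_distance(a, b):
    """Assumed definition: symbols left unmatched by the LCS in both strings."""
    return len(a) + len(b) - 2 * lcs_length(a, b)

def dtw(a, b, cost=lambda x, y: 0 if x == y else 1):
    """Classic DTW cost; the 0/1 local cost is an assumption for symbol strings."""
    inf = float("inf")
    prev = [0.0] + [inf] * len(b)
    for x in a:
        cur = [inf]
        for j, y in enumerate(b, 1):
            cur.append(cost(x, y) + min(prev[j], cur[j - 1], prev[j - 1]))
        prev = cur
    return prev[-1]
```

A small example of why warping matters for performances: `dtw("aab", "ab")` is 0 because DTW can stretch the repeated symbol, as a slower rendition would, while `levenshtein("aab", "ab")` charges 1 for the extra symbol.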

All three approaches are 100% effective, but LCS and Levenshtein can be computed online, which makes them suitable for monitoring applications (unlike DTW). Since they are distances, a metric index could be used to speed up the recognition process.
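One way to read "computed online" is that each newly arrived symbol of a monitored stream updates a single dynamic-programming column, with no need to buffer the whole stream. The sketch below uses the standard approximate-matching variant of the edit-distance recurrence, in which a match may start anywhere in the stream. The class name `OnlineEditMatcher` and its interface are hypothetical, and this is not claimed to be the monitoring procedure used in the paper.

```python
class OnlineEditMatcher:
    """Streaming edit distance between a fixed pattern and the incoming text.

    After each feed(), the return value is the smallest edit distance between
    the whole pattern and any substring of the stream ending at that symbol.
    """

    def __init__(self, pattern):
        self.pattern = pattern
        # column[i]: cost of aligning pattern[:i] against an empty stream.
        self.column = list(range(len(pattern) + 1))

    def feed(self, symbol):
        prev = self.column
        cur = [0]  # a match may start at any stream position at zero cost
        for i, p in enumerate(self.pattern, 1):
            cur.append(min(prev[i] + 1,                  # extra symbol in the stream
                           cur[i - 1] + 1,               # pattern symbol missing
                           prev[i - 1] + (p != symbol))) # match or substitute
        self.column = cur
        return cur[-1]


# Usage: the reported distance drops to 0 where the stream contains the pattern.
matcher = OnlineEditMatcher("abc")
distances = [matcher.feed(s) for s in "xxabcxx"]
```

Each update costs time proportional to the pattern length and keeps only one column in memory, which is the property that makes this style of matching attractive for broadcast monitoring.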

Author information

Authors and Affiliations

Universidad Michoacana de San Nicolás de Hidalgo, Edif “B” Ciudad Universitaria, CP 58000, Morelia, Mich., México
Antonio Camarena-Ibarrola & Edgar Chávez

Editor information

Editors and Affiliations

Center for Computing Research, National Polytechnic Institute, 07738, Mexico City, México
Alexander Gelbukh
Instituto Nacional de Astrofísica, Óptica y Electrónica (INAOE), Luis Enrique Erro No. 1, Sta. Ma. Tonanzintla, 72840, Puebla, México
Carlos Alberto Reyes-Garcia

Cite this paper

Camarena-Ibarrola, A., Chávez, E. (2006). On Musical Performances Identification, Entropy and String Matching. In: Gelbukh, A., Reyes-Garcia, C.A. (eds) MICAI 2006: Advances in Artificial Intelligence. MICAI 2006. Lecture Notes in Computer Science, vol 4293. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11925231_91

DOI: https://doi.org/10.1007/11925231_91
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-49026-5
Online ISBN: 978-3-540-49058-6
eBook Packages: Computer Science, Computer Science (R0)
