
An algorithm that minimizes audio fingerprints using the difference of Gaussians

Published in Journal of Zhejiang University SCIENCE C

Abstract

Recently, many audio search services, led by Google, have used audio fingerprinting technology to find identical audio and to protect music copyright using only part of the audio data. However, when many fingerprints are generated per audio file, the amount of query data required for an audio search increases. In this paper, we propose a novel method that reduces the number of fingerprints while providing a level of search performance similar to that of existing methods. The proposed method uses the difference of Gaussians, which is often used for feature extraction in image signal processing. In experiments combining the proposed method with dynamic time warping, we searched for identical audio with a success rate of 90%. The proposed method can therefore be used for effective audio search.
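
To make the idea concrete, the sketch below is a minimal illustration (not the authors' implementation): it applies a difference-of-Gaussians (DoG) filter to a log spectrogram to obtain a small set of spectral keypoints per clip, and compares two such keypoint sequences with dynamic time warping. All function names, filter sigmas, and the frame subsampling step are illustrative assumptions.

import numpy as np
from scipy import signal
from scipy.ndimage import gaussian_filter

def dog_keypoints(audio, sr, sigma_small=1.0, sigma_large=3.0, top_k=32):
    """Return a short sequence of DoG-peak frequencies for one audio clip."""
    # Magnitude spectrogram of the input signal (rows: frequency bins, columns: frames).
    freqs, times, spec = signal.spectrogram(audio, fs=sr, nperseg=1024)
    spec = np.log1p(spec)
    # Difference of two Gaussian-smoothed spectrograms highlights salient spectral peaks.
    dog = gaussian_filter(spec, sigma_small) - gaussian_filter(spec, sigma_large)
    # Keep only the strongest DoG response per frame as a 1-D fingerprint sequence.
    peak_bins = dog.argmax(axis=0)
    # Subsample frames so at most top_k values represent the whole clip,
    # keeping the number of fingerprints small.
    idx = np.linspace(0, len(peak_bins) - 1, num=min(top_k, len(peak_bins))).astype(int)
    return freqs[peak_bins[idx]]

def dtw_distance(a, b):
    """Classic O(len(a)*len(b)) dynamic time warping distance between 1-D sequences."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return cost[n, m]

Under these assumptions, a query clip would be matched by computing dtw_distance between its keypoint sequence and each stored sequence, and accepting the closest match below a chosen threshold.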



Author information


Corresponding author

Correspondence to MyoungBeom Chung.


About this article

Cite this article

Chung, M., Ko, I. An algorithm that minimizes audio fingerprints using the difference of Gaussians. J. Zhejiang Univ. - Sci. C 12, 836–845 (2011). https://doi.org/10.1631/jzus.C1000396

