Abstract
We report results on audio copy detection for the TRECVID 2009 copy detection task. This task involves searching for transformed audio queries in over 385 h of test audio. The queries were transformed in seven different ways; three of these transformations mixed unrelated speech into the original query, which makes the task much more difficult. We give results with two different audio fingerprints and show that mapping each test frame to the nearest query frame (the nearest-neighbor fingerprint) results in robust audio copy detection. The most difficult task in TRECVID 2009 was to detect audio copies using predetermined thresholds computed from 2008 data. We show that the nearest-neighbor fingerprints were robust even on this task, giving an actual minimal normalized detection cost rate (NDCR) of around 0.06 for all the transformations. These results are close to those obtained with the optimal threshold for each transformation, which confirms the robustness of the nearest-neighbor fingerprints. The nearest-neighbor fingerprints can also be computed efficiently on a graphics processing unit, leading to a very fast search.
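The nearest-neighbor mapping described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the squared Euclidean distance, the toy two-dimensional frames, and the offset-voting step used to locate a copy are all assumptions made for illustration, since the abstract does not specify the acoustic features or the matching procedure.

```python
from collections import Counter


def nearest_query_frame(test_frame, query_frames):
    """Return the index of the query frame closest to test_frame.

    Assumption: frames are equal-length lists of floats and closeness
    is measured by squared Euclidean distance.
    """
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(range(len(query_frames)),
               key=lambda j: sq_dist(test_frame, query_frames[j]))


def nn_fingerprint(test_frames, query_frames):
    """Map every test frame to the index of its nearest query frame."""
    return [nearest_query_frame(f, query_frames) for f in test_frames]


def best_alignment(fingerprint):
    """Vote on the offset (test index minus query index).

    Hypothetical detection step: if the query occurs in the test audio
    starting at some offset, many test frames agree on that offset.
    Returns (offset, number_of_agreeing_frames); ties go to the
    smallest offset.
    """
    votes = Counter(t - q for t, q in enumerate(fingerprint))
    return max(votes.items(), key=lambda kv: (kv[1], -kv[0]))


if __name__ == "__main__":
    # Toy data: a 4-frame query and a 7-frame test stream that contains
    # a slightly distorted copy of the query starting at test frame 2.
    query = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    test = [[5.0, 5.0], [4.0, 6.0],
            [0.1, 0.0], [0.9, 0.1], [0.1, 1.1], [1.0, 0.9],
            [6.0, 4.0]]
    fp = nn_fingerprint(test, query)
    print(fp)                  # query-frame index for each test frame
    print(best_alignment(fp))  # (offset, votes)
```

Because each value is an index into the query, the same test audio yields a different sequence for every query. In a real system the vote count would be compared against a decision threshold, and the per-frame nearest-neighbor search is independent across test frames, which is what makes it a natural fit for parallel hardware such as a GPU.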
Notes
These features are not strictly fingerprints, as their value changes when we change the query.
Additional information
This work was supported in part by the Natural Sciences and Engineering Research Council of Canada (NSERC).
About this article
Cite this article
Gupta, V.N., Boulianne, G. & Cardinal, P. CRIM’s content-based audio copy detection system for TRECVID 2009. Multimed Tools Appl 60, 371–387 (2012). https://doi.org/10.1007/s11042-010-0608-x
DOI: https://doi.org/10.1007/s11042-010-0608-x