Abstract
Video copy detection is needed mainly to protect content owners against unauthorized use of their content. Content-based copy detection methods rely on evaluating the similarity between potential copies and the original videos. Scalability is the key issue in making these methods practical for very large video databases. To address this challenge, we put forward an optimized similarity-based search method that takes into account the local characteristics of the space of content signatures. First, refined models of the distortions undergone by the signatures during the copy creation process make it possible to search a more appropriately defined region of the description space, which increases query selectivity and improves detection quality. Second, identifying the regions of the description space where the local density of content signatures is high yields a significant additional reduction in computation cost. An evaluation on ground-truth databases shows that the proposed solution is reliable. Scalability is then demonstrated on larger databases of up to 280,000 hours of video.
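The first idea, searching a region shaped by a distortion model rather than a generic neighborhood, can be illustrated with a minimal sketch. It is not the method of the article: it assumes, purely for illustration, that each signature component is distorted independently with a known spread, and the names `isotropic_query`, `distortion_aware_query`, `sigmas` and `k` are hypothetical.

```python
from math import sqrt

def isotropic_query(db, q, radius):
    """Indices of signatures within Euclidean distance `radius` of q."""
    return [i for i, s in enumerate(db)
            if sqrt(sum((a - b) ** 2 for a, b in zip(s, q))) <= radius]

def distortion_aware_query(db, q, sigmas, k=2.0):
    """Indices of signatures whose deviation from q stays within
    k * sigma_j on every component j (assumed per-component spread)."""
    return [i for i, s in enumerate(db)
            if all(abs(a - b) <= k * sg for a, b, sg in zip(s, q, sigmas))]

# Toy 2-D signatures; the second component is assumed to be far more
# stable under distortion than the first (hypothetical values).
db = [(0.0, 0.0), (0.5, 0.05), (0.1, 0.5), (2.0, 2.0)]
q = (0.0, 0.0)
sigmas = (0.3, 0.05)

wide = isotropic_query(db, q, radius=0.6)
narrow = distortion_aware_query(db, q, sigmas, k=2.0)
print(wide, narrow)
```

In this toy case the sphere has to be large enough to cover the less stable component, so it also returns a signature that deviates strongly on the stable one. The distortion-shaped region excludes that signature, which is the kind of selectivity gain the abstract refers to.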

Acknowledgements
This work was partly supported by the French National Research Agency (ANR) within the Sigmund project.
Cite this article
Poullot, S., Buisson, O. & Crucianu, M. Scaling content-based video copy detection to very large databases. Multimed Tools Appl 47, 279–306 (2010). https://doi.org/10.1007/s11042-009-0323-7
DOI: https://doi.org/10.1007/s11042-009-0323-7