Aligning codebooks for near duplicate image detection

Battiato, Sebastiano; Farinella, Giovanni Maria; Puglisi, Giovanni; Ravì, Daniele

doi:10.1007/s11042-013-1470-4

Published: 21 April 2013

Multimedia Tools and Applications, volume 72, pages 1483–1506 (2014)

Abstract

The detection of near duplicate images in large databases, such as those of popular social networks, digital investigation archives, and surveillance systems, is an important task for a number of image forensics applications. In digital investigation, hashing techniques are commonly used to index large quantities of images in order to detect copies belonging to different archives. In the last few years, several image hashing techniques based on the Bags of Visual Features paradigm have appeared in the literature. Recently, this paradigm has been augmented by using multiple descriptors (e.g., Bags of Visual Phrases) in order to exploit the coherence between different feature spaces. In this paper we propose to further improve the Bags of Visual Phrases approach by considering the coherence between feature spaces not only at the level of image representation, but also during the codebook generation phase. We also introduce a novel image database specifically designed for the development and benchmarking of near duplicate image retrieval techniques. The dataset consists of more than 3,300 images depicting more than 500 different scenes, each having at least three real near duplicates. The dataset exhibits high variability in terms of geometric and photometric transformations between scenes and their corresponding near duplicates. Finally, we suggest a method to compress the proposed image representation for storage purposes. Experiments show the effectiveness of the proposed near duplicate retrieval technique, which outperforms the original Bags of Visual Phrases approach.
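The abstract does not describe how the codebooks are made coherent. As a rough illustration only, the sketch below assumes a simple post hoc alignment: each keypoint carries one visual word per feature space, co-occurrences between the two vocabularies are counted, and the words of the second codebook are relabelled by solving a linear assignment problem (the reference list cites Jonker and Volgenant for this class of problem). The function names, the brute-force solver, and the toy word lists are all hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import permutations


def cooccurrence(words_a, words_b, k):
    """counts[i][j] = number of keypoints labelled i in codebook A and j in codebook B."""
    counts = [[0] * k for _ in range(k)]
    for i, j in zip(words_a, words_b):
        counts[i][j] += 1
    return counts


def align_codebooks(counts):
    """Return mapping where mapping[j] is the word of A assigned to word j of B.

    Brute-force linear assignment (assumption: tiny k only); a real system
    would use a polynomial-time solver instead.
    """
    k = len(counts)
    best = max(
        permutations(range(k)),
        key=lambda p: sum(counts[p[j]][j] for j in range(k)),
    )
    return list(best)


def phrase_histogram(words_a, words_b, mapping):
    """Histogram of (word in A, aligned word in B) pairs, one pair per keypoint."""
    return Counter((a, mapping[b]) for a, b in zip(words_a, words_b))


# Toy data: six keypoints, each quantised in two hypothetical 3-word codebooks.
words_a = [0, 0, 1, 2, 2, 1]
words_b = [2, 2, 0, 1, 1, 0]

mapping = align_codebooks(cooccurrence(words_a, words_b, k=3))
histogram = phrase_histogram(words_a, words_b, mapping)
print(mapping)
print(sorted(histogram.items()))
```

In this toy example the aligned phrase histogram concentrates entirely on pairs with equal indices, because the two word lists differ only by a relabelling of the vocabulary.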

Notes

Note that, at this stage, other encoding methods can also be applied starting from the aligned vocabulary [7].
We consider a dataset synthetic when its near duplicates are generated from a set of images (or video frames) using transformations typically available in image manipulation software (e.g., ImageMagick, http://www.imagemagick.org), such as colorizing, contrast changing, cropping, despeckling, downsampling, format changing, framing, rotating, scaling, saturation changing, intensity changing, and shearing. To generate near duplicates, the basic transformations are usually applied while varying their parameters and/or combining them.
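The generation of synthetic near duplicates described in the second note can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used for any particular dataset: images are plain lists of grayscale rows rather than files processed with ImageMagick, only four of the listed transformations are shown, and every function name and parameter value is hypothetical.

```python
from itertools import product


def crop(img, top, left, height, width):
    """Keep a height x width window whose upper-left corner is (top, left)."""
    return [row[left:left + width] for row in img[top:top + height]]


def rotate90(img):
    """Rotate the image by 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def downsample(img, factor):
    """Keep every factor-th pixel along both axes (no smoothing)."""
    return [row[::factor] for row in img[::factor]]


def change_intensity(img, offset):
    """Add a constant to every pixel, clamped to the 0..255 range."""
    return [[min(255, max(0, p + offset)) for p in row] for row in img]


def synthetic_near_duplicates(img, offsets=(-20, 20), factors=(1, 2)):
    """Apply every combination of intensity offset and downsampling factor."""
    return [
        downsample(change_intensity(img, offset), factor)
        for offset, factor in product(offsets, factors)
    ]


# Toy 4x4 grayscale image with pixel values 0, 10, ..., 150.
image = [[10 * (r * 4 + c) for c in range(4)] for r in range(4)]
variants = synthetic_near_duplicates(image)
print(len(variants))
```

Varying the parameter tuples, or chaining further transformations such as `crop` and `rotate90`, corresponds to the combination of basic transformations mentioned in the note.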

References

Battiato S, Farinella GM, Gallo G, Ravì D (2010) Exploiting textons distributions on spatial hierarchy for scene classification. EURASIP J Image Video Process Article ID 919367:1–13. doi:10.1155/2010/919367
Battiato S, Farinella GM, Messina E, Puglisi G (2012) Robust image alignment for tampering detection. IEEE Trans Inf Forensics Secur 7(4):1105–1117
Battiato S, Farinella GM, Guarnera GC, Meccio T, Puglisi G, Ravì D, Rizzo R (2010) Bags of phrases with codebooks alignment for near duplicate image detection. In: Proceedings of the international ACM workshop on multimedia in forensics, security and intelligence (MiFor 2010), in conjunction with international ACM multimedia conference, pp 65–70
Bay H, Ess A, Tuytelaars T, Van Gool L (2008) Speeded-up robust features (SURF). Comput Vis Image Understand 110(3):346–359
Belongie S, Malik J, Puzicha J (2002) Shape matching and object recognition using shape contexts. IEEE Trans Pattern Anal Mach Intell 24(4):509–522
Belongie S, Malik J, Puzicha J (2002) Shape matching and object recognition using shape contexts. IEEE Trans Pattern Anal Mach Intell 24(4):509–522
Chatfield K, Lempitsky V, Vedaldi A, Zisserman A (2011) The devil is in the details: an evaluation of recent feature encoding methods. In: Proceedings of the British machine vision conference
Cheng X, Hu Y, Chia L-T (2011) Exploiting local dependencies with spatial-scale space (s-cube) for near-duplicate retrieval. Comput Vis Image Understand 115(6):750–758
Chum O, Philbin J, Zisserman A (2008) Near duplicate image detection: min-hash and tf-idf weighting. In: Proceeding of BMVC
Chum O, Perdoch M, Matas J (2009) Geometric min-hashing: finding a (thick) needle in a haystack. In: IEEE computer society conference on computer vision and pattern recognition, pp 17–24
De Oliveira R, Cherubini M, Oliver N (2010) Looking at near-duplicate videos from a human-centric perspective. ACM Trans Multimedia Comput Commun Appl 6(3):15:1–15:22
Eastlake D, Jones P (2001) RFC 3174. http://tools.ietf.org/html/rfc3174
Freeman W, Adelson E (1991) The design and use of steerable filters. IEEE Trans Pattern Anal Mach Intell 13(9):891–906
Grauman K, Darrell T (2005) The pyramid match kernel: discriminative classification with sets of image features. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV), pp 1458–1465
Hu Y, Cheng X, Chia L-T, Xie X, Rajan D, Tan A-H (2009) Coherent phrase model for efficient image near-duplicate retrieval. IEEE Trans Multimedia 11(8):1434–1445
Huiskes MJ, Lew MS (2008) The MIR Flickr retrieval evaluation. In: MIR ’08: proceedings of the 2008 ACM International conference on multimedia information retrieval. ACM, New York, NY
Johnson AE, Hebert M (1999) Using spin images for efficient object recognition in cluttered 3D scenes. IEEE Trans Pattern Anal Mach Intell 21(5):433–449
Jonker R, Volgenant A (1987) A shortest augmenting path algorithm for dense and sparse linear assignment problems. Computing 38(4):325–340
Ke Y, Sukthankar R, Huston L (2004) Efficient near-duplicate detection and sub-image retrieval. In: Proceeding of ACM multimedia, pp 869–876
Koenderink J, van Doorn A (1987) Representation of local geometry in the visual system. Biol Cybern 55:367–375
Lazebnik S, Schmid C, Ponce J (2006) Beyond bags of features: spatial pyramid matching for recognizing natural scene categories. In: Proceedings of the 2006 IEEE computer society conference on Computer Vision and Pattern Recognition, CVPR ’06, pp 2169–2178
Lazebnik S, Raginsky M (2009) Supervised learning of quantizer codebooks by information loss minimization. IEEE Trans Pattern Anal Mach Intell 31(7):1294–1309
Lejsek H, Þormóðsdóttir H, Ásmundsson F, Daðason K, Jóhannsson ÁÞ, Jónsson BÞ, Amsaleg L (2010) Videntifier forensic: large-scale video identification in practice. In: Proceeding of ACM workshop on multimedia in forensics, security and intelligence, pp 1–6
Leung T, Malik J (1999) Recognizing surfaces using three-dimensional textons. In: Proceedings of the IEEE international conference on computer vision, pp 1010–1017
Lowe D (2004) Distinctive image features from scale-invariant keypoints. Int J Comput Vis 60(2):91–110
Matas J, Chum O, Urban M, Pajdla T (2002) Robust wide-baseline stereo from maximally stable extremal regions. In: Proceedings of the British machine vision conference, pp 384–393
Mikolajczyk K, Schmid C (2004) Scale & affine invariant interest point detectors. Int J Comput Vis (IJCV) 60(1):63–86
Mikolajczyk K, Schmid C (2005) A performance evaluation of local descriptors. IEEE Trans Pattern Anal Mach Intell (PAMI) 27(10):1615–1630
Nistér D, Stewénius H (2006) Scalable recognition with a vocabulary tree. In: Proceedings of the IEEE conference on Computer Vision and Pattern Recognition (CVPR), pp 2161–2168
Papadimitriou CH, Steiglitz K (1982) Combinatorial optimization: algorithms and complexity. Prentice-Hall, Inc
Philbin J, Chum O, Isard M, Sivic J, Zisserman A (2007) Object retrieval with large vocabularies and fast spatial matching. In: Proceedings of the International conference on computer vision and pattern recognition
Rivest RL (1992) RFC 1321. http://tools.ietf.org/html/rfc1321
Rosten E, Drummond T (2006) Machine learning for high-speed corner detection. In: Proceedings of the European conference on computer vision, pp 430–443
Rongrong J, Hongxun Y, Wei L, Xiaoshuai S, Tian TQ (2012) Task-dependent visual-codebook compression. IEEE Trans Image Process 21(4):2282–2293
Rongrong J, Duan L-Y, Chen J, Xie L, Yao H, Gao W (2013) Learning to distribute vocabulary indexing for scalable visual search. IEEE Trans Multimedia 15(1):153–166
Saffari A, Bischof H (2007) Clustering in a boosting framework. In: Computer vision winter workshop, pp 75–82
Salton G, McGill M (1983) Introduction to modern information retrieval. McGraw-Hill
Salton G, Buckley C (1988) Term-weighting approaches in automatic text retrieval. Inf Process Manage 24(5):513–523
Sivic J, Russell BC, Efros AA, Zisserman A, Freeman WT (2005) Discovering object categories in image collections. In: Proceedings of the international conference on computer vision
Swain MJ, Ballard DH (1991) Color indexing. Int J Comput Vis 7(1):11–32
Szeliski R (2010) Computer vision: algorithms and applications. Springer Available at http://szeliski.org/Book
van Gemert JC, Veenman CJ, Smeulders AWM, Geusebroek JM (2010) Visual word ambiguity. IEEE Trans Pattern Anal Mach Intell 32(7):1271–1283
Wang Y, Hou Z, Leman K (2011) Keypoint-based near-duplicate images detection using affine invariant feature and color matching. In: Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2011), pp 1209–1212
Wu Z, Ke Q, Isard M, Sun J (2009) Bundling features for large scale partial-duplicate web image search. In: Proceedings of the international conference on computer vision and pattern recognition, pp 25–32
Xu D, Chang S-F (2007) Visual event recognition in news video using kernel methods with multi-level temporal alignment. In: Proceeding of IEEE international conference on computer vision and pattern recognition
Xu D, Cham TJ, Yan S, Duan L, Chang S-F (2010) Near duplicate identification with spatially aligned pyramid matching. IEEE Trans Circuits Syst Video Technol (TCSVT) 20(8):1068–1079
Zhang D-Q, Chang S-F (2004) Detecting image near-duplicate by stochastic attributed relational graph matching with learning. In: Proceedings of the ACM multimedia conference, pp 877–884
Zhao W-L, Ngo C-W, Tan H-K, Wu X (2007) Near-duplicate keyframe identification with interest point matching and pattern learning. IEEE Trans Multimedia 9(5):1037–1048
Zhao W-L, Ngo C-W (2009) Scale-rotation invariant pattern entropy for keypoint-based near-duplicate detection. IEEE Trans Image Process 18(2):412–423
Zhao WL, Wu X, Ngo CW (2010) On the annotation of web videos by efficient near-duplicate search. IEEE Trans Multimedia 12(5):448–461
Zhao W-L, Wu X, Ngo C-W (2011) SOTU: a toolkit for efficient near-duplicate image/video & retrieval/detection. Manual for SOTU Version 1.06. http://www.cs.cityu.edu.hk/~wzhao2/sotu.htm
Zhu J, Hoi SC, Lyu MR, Yan S (2008) Near-duplicate keyframe retrieval by nonrigid image matching. In: Proceedings of the ACM multimedia conference, pp 41–50

Acknowledgements

Part of this work was carried out within the PANORAMA project, co-funded by grants from Belgium, Italy, France, the Netherlands, the United Kingdom, and the ENIAC Joint Undertaking. The authors would like to thank Giuseppe Claudio Guarnera, Tony Meccio and Rosetta Rizzo for their help in the early stages of this work.

Author information

Authors and Affiliations

Department of Mathematics and Computer Science, Image Processing Laboratory, University of Catania, Viale A. Doria 6, Catania, 95125, Italy
Sebastiano Battiato, Giovanni Maria Farinella, Giovanni Puglisi & Daniele Ravì

Corresponding author

Correspondence to Giovanni Maria Farinella.

About this article

Cite this article

Battiato, S., Farinella, G.M., Puglisi, G. et al. Aligning codebooks for near duplicate image detection. Multimed Tools Appl 72, 1483–1506 (2014). https://doi.org/10.1007/s11042-013-1470-4

Issue Date: September 2014