Abstract
Images are among the most widely proliferated forms of digital information, owing to affordable imaging technologies and the Web. In such an environment, using digital watermarking to detect image copyright infringement is a challenge. For such tasks, near-duplicate image detection is increasingly attractive because it allows automated content analysis; moreover, its application domain extends to data management. The application of PCA-SIFT features and Locality-Sensitive Hashing (LSH), for indexing and retrieval, has been shown to be highly effective for this task. In this work, we prune the number of PCA-SIFT features and introduce a modified Redundant Bit Vector (RBV) index. This is the first application of the RBV index that shows near-perfect effectiveness. Using the best parameters of our RBV approach, we observe an average recall and precision of 91% and 98%, respectively, with a query response time of under 10 seconds on a collection of 20,000 images. Compared to the baseline (the LSH index), the RBV index answers queries 12 times faster and is 126 times smaller. Compared to a brute-force sequential scan, the RBV index rapidly reduces the search space to 1/80 of its original size.
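To make the indexing idea concrete, the following is a minimal sketch of a per-dimension bit-vector index, written under stated assumptions. It is not the modified RBV index introduced in this work, whose details the abstract does not give. It only illustrates the general principle such indexes rely on: each stored feature is surrounded by an axis-aligned box, each dimension is split into bins, each bin keeps a bit vector of the boxes that overlap it, and a query ANDs one bit vector per dimension to shrink the candidate set before exact verification. The class name `BitVectorIndex`, the parameters `half_width` and `edges`, and the toy data are all hypothetical.

```python
from bisect import bisect_right


class BitVectorIndex:
    """Toy per-dimension bit-vector index over axis-aligned boxes.

    Illustrative only: names, parameters, and binning are assumptions,
    not the index described in the paper.
    """

    def __init__(self, points, half_width, edges):
        # points: list of equal-length feature vectors (hypothetical data)
        # half_width: half the side length of the box around each point
        # edges: sorted bin boundaries, shared by every dimension
        self.points = points
        self.half_width = half_width
        self.edges = edges
        dims = len(points[0])
        n_bins = len(edges) + 1
        # bits[d][b] is an int whose i-th bit is set when the box around
        # point i overlaps bin b along dimension d.
        self.bits = [[0] * n_bins for _ in range(dims)]
        for i, p in enumerate(points):
            for d, x in enumerate(p):
                lo = bisect_right(edges, x - half_width)
                hi = bisect_right(edges, x + half_width)
                for b in range(lo, hi + 1):
                    self.bits[d][b] |= 1 << i

    def candidates(self, query):
        # AND one bit vector per dimension; surviving bits are candidates.
        mask = (1 << len(self.points)) - 1
        for d, x in enumerate(query):
            mask &= self.bits[d][bisect_right(self.edges, x)]
            if mask == 0:
                break
        return [i for i in range(len(self.points)) if mask >> i & 1]

    def search(self, query):
        # Verify each candidate exactly: the query must lie inside its box.
        return [
            i for i in self.candidates(query)
            if all(abs(q - x) <= self.half_width
                   for q, x in zip(query, self.points[i]))
        ]
```

The bitwise AND can only discard boxes that cannot contain the query, so the candidate list is a superset of the true matches and the final verification step runs over a much smaller set than a sequential scan would. Bin boundaries and box sizes are tuning choices in this sketch, and the values that suit PCA-SIFT features are not stated here.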
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Foo, J.J., Sinha, R. (2007). Using Redundant Bit Vectors for Near-Duplicate Image Detection. In: Kotagiri, R., Krishna, P.R., Mohania, M., Nantajeewarawat, E. (eds) Advances in Databases: Concepts, Systems and Applications. DASFAA 2007. Lecture Notes in Computer Science, vol 4443. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-71703-4_41
DOI: https://doi.org/10.1007/978-3-540-71703-4_41
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-71702-7
Online ISBN: 978-3-540-71703-4
eBook Packages: Computer Science (R0)