Encoding Spatial Context for Large-Scale Partial-Duplicate Web Image Retrieval

Zhou, Wen-Gang; Li, Hou-Qiang; Lu, Yijuan; Tian, Qi

doi:10.1007/s11390-014-1472-3

Regular Paper
Published: 12 September 2014

Journal of Computer Science and Technology, Volume 29, pages 837–848 (2014)

Abstract

Many recent state-of-the-art image retrieval approaches are based on the Bag-of-Visual-Words model and represent an image as a set of visual words obtained by quantizing local SIFT (scale-invariant feature transform) features. Feature quantization reduces the discriminative power of local features and unavoidably causes many false local matches between images, which degrades retrieval accuracy. To filter out those false matches, the geometric context among visual words has been widely explored for verifying geometric consistency. However, existing approaches based on global or local geometric verification are either computationally expensive or achieve limited accuracy. To address this issue, we focus on partial-duplicate Web image retrieval and propose a scheme that encodes the spatial context for visual matching verification. An efficient affine enhancement scheme is also proposed to refine the verification results. Experiments on partial-duplicate Web image search, using a database of one million images, demonstrate the effectiveness and efficiency of the proposed approach. Evaluation on a 10-million image database further demonstrates the scalability of our approach.
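
To make the problem described in the abstract concrete, the following is a minimal sketch, under stated assumptions, of how quantizing descriptors into visual words can produce a false match and how a simple check on relative positions can filter it. It is not the encoding scheme proposed in the paper, which this abstract does not specify. The function names (`quantize`, `word_matches`, `consistent_matches`), the toy 2-D descriptors, the three-word codebook, and the majority-vote rule are all hypothetical choices made for illustration; real SIFT descriptors are 128-dimensional and codebooks are far larger.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Match = Tuple[int, int]  # (feature index in image A, feature index in image B)


def quantize(descriptor: Sequence[float], codebook: Sequence[Sequence[float]]) -> int:
    """Return the index of the nearest codeword (squared Euclidean distance)."""
    return min(
        range(len(codebook)),
        key=lambda k: sum((d - c) ** 2 for d, c in zip(descriptor, codebook[k])),
    )


def word_matches(desc_a, desc_b, codebook) -> List[Match]:
    """Match every pair of features that fall into the same visual word."""
    words_a = [quantize(d, codebook) for d in desc_a]
    words_b = [quantize(d, codebook) for d in desc_b]
    return [(i, j) for i, wa in enumerate(words_a)
            for j, wb in enumerate(words_b) if wa == wb]


def _sign(v: float) -> int:
    return (v > 0) - (v < 0)


def _agree(m: Match, n: Match, pts_a: List[Point], pts_b: List[Point]) -> bool:
    """True if two matches keep the same left/right and above/below order."""
    (i, j), (k, l) = m, n
    same_x = _sign(pts_a[i][0] - pts_a[k][0]) == _sign(pts_b[j][0] - pts_b[l][0])
    same_y = _sign(pts_a[i][1] - pts_a[k][1]) == _sign(pts_b[j][1] - pts_b[l][1])
    return same_x and same_y


def consistent_matches(matches, pts_a, pts_b) -> List[Match]:
    """Keep a match only if it agrees with more than half of the other matches.

    Assumption: a plain majority vote on relative order, chosen for brevity.
    """
    kept = []
    for m in matches:
        others = [n for n in matches if n != m]
        votes = sum(_agree(m, n, pts_a, pts_b) for n in others)
        if others and votes > len(others) / 2:
            kept.append(m)
    return kept


# Toy data (hypothetical): image B is image A shifted by (10, 10),
# plus one extra feature that happens to quantize to visual word 0.
codebook = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
desc_a = [(0.5, 0.2), (9.5, 0.1), (0.3, 9.8)]
pts_a = [(10.0, 10.0), (50.0, 10.0), (10.0, 50.0)]
desc_b = [(0.1, 0.4), (9.9, 0.3), (0.2, 9.7), (0.4, 0.1)]
pts_b = [(20.0, 20.0), (60.0, 20.0), (20.0, 60.0), (90.0, 90.0)]

raw = word_matches(desc_a, desc_b, codebook)
verified = consistent_matches(raw, pts_a, pts_b)
print(raw)       # [(0, 0), (0, 3), (1, 1), (2, 2)]
print(verified)  # [(0, 0), (1, 1), (2, 2)]
```

In this toy case the match `(0, 3)` arises only because two unrelated features share a visual word, and it is discarded because its position disagrees with every other match. The pairwise check costs quadratic time in the number of matches, which is one reason efficient verification matters at the database sizes the abstract mentions.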

Author information

Authors and Affiliations

Chinese Academy of Sciences Key Laboratory of Technology in Geo-Spatial Information Processing and Application System, University of Science and Technology of China, Hefei, 230027, China
Wen-Gang Zhou & Hou-Qiang Li
Department of Electronic Engineering and Information Science, University of Science and Technology of China, Hefei, 230027, China
Wen-Gang Zhou & Hou-Qiang Li
Department of Computer Science, Texas State University, San Marcos, TX, 78666, U.S.A.
Yijuan Lu
Department of Computer Science, University of Texas at San Antonio, San Antonio, TX, 78249, U.S.A.
Qi Tian

Corresponding author

Correspondence to Wen-Gang Zhou.

Additional information

Dr. Wen-Gang Zhou was supported in part by the Fundamental Research Funds for the Central Universities of China under Grant Nos. WK2100060014 and WK2100060011, the Start-Up Funding from the University of Science and Technology of China under Grant No. KY2100000036, the Open Project of Beijing Multimedia and Intelligent Software Key Laboratory in Beijing University of Technology, and sponsorship from the Intel ICRI MNC project. Dr. Hou-Qiang Li was supported in part by the National Natural Science Foundation of China (NSFC) under Grant Nos. 61325009, 61390514, and 61272316. Dr. Yijuan Lu was supported in part by the Army Research Office (ARO) of USA under Grant No. W911NF-12-1-0057 and the National Science Foundation of USA under Grant No. CRI 1305302. Dr. Qi Tian was supported in part by ARO under Grant No. W911NF-12-1-0057 and the Faculty Research Award by NEC Laboratories of America. This work was also supported in part by NSFC under Grant No. 61128007.

Electronic supplementary material

ESM 1 (PDF 129 kb)

About this article

Cite this article

Zhou, WG., Li, HQ., Lu, Y. et al. Encoding Spatial Context for Large-Scale Partial-Duplicate Web Image Retrieval. J. Comput. Sci. Technol. 29, 837–848 (2014). https://doi.org/10.1007/s11390-014-1472-3

Received: 15 March 2014
Revised: 15 July 2014
Published: 12 September 2014
Issue Date: September 2014
DOI: https://doi.org/10.1007/s11390-014-1472-3
