Visual word expansion and BSIFT verification for large-scale image search

  • Regular Paper
Multimedia Systems

Abstract

Recently, great advances have been made in large-scale content-based image search. Most state-of-the-art approaches are based on the bag-of-visual-words model with local features, such as SIFT, for image representation. Visual matching between images is obtained by vector quantization of local features. Feature quantization is performed either with hierarchical k-NN, which introduces severe quantization loss, or with approximate nearest neighbor (ANN) search such as a k-d tree, which is computationally inefficient. Moreover, feature matching by quantization ignores the vector distance between features, which may cause many false-positive matches. In this paper, we propose constructing a supporting visual word table for all visual words by visual word expansion. Given the initial quantization result, multiple approximate nearest visual words are identified by looking up the supporting visual word table, which improves retrieval recall. In addition, we present a matching verification scheme based on binary SIFT (BSIFT) signatures. The L2 distance between original SIFT descriptors is shown to be well preserved by the Hamming distance between the corresponding binary SIFT signatures. With BSIFT verification, false-positive matches can be identified and removed effectively and efficiently, which greatly improves the precision of large-scale image search. We evaluate the proposed approach on two public datasets for large-scale image search. The experimental results demonstrate the effectiveness and efficiency of our scheme.
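
The two ideas in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the supporting visual word table is built or how a SIFT descriptor is binarized, so the sketch assumes the table stores the k nearest other visual words of each word (found by brute force), and that each descriptor dimension is thresholded at the descriptor's own median. All function names, the toy codebook, and the `max_dist` threshold are hypothetical.

```python
import math
from statistics import median


def l2(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_support_table(codebook, k):
    """Assumed construction: for each visual word, the ids of its k nearest
    other visual words, found offline by brute-force L2 comparison."""
    table = []
    for i, center in enumerate(codebook):
        others = sorted(
            (j for j in range(len(codebook)) if j != i),
            key=lambda j: l2(center, codebook[j]),
        )
        table.append(others[:k])
    return table


def expand(word_id, table):
    """Visual word expansion: the initial quantization result followed by
    its supporting visual words, obtained by a table lookup."""
    return [word_id] + table[word_id]


def binary_signature(desc):
    """Assumed binarization: one bit per dimension, set when the value
    exceeds the median of the descriptor. Returns the bits as an int."""
    threshold = median(desc)
    bits = 0
    for value in desc:
        bits = (bits << 1) | (1 if value > threshold else 0)
    return bits


def hamming(sig_a, sig_b):
    """Number of differing bits between two signatures."""
    return bin(sig_a ^ sig_b).count("1")


def verify(sig_query, sig_db, max_dist):
    """Keep a candidate match only if the signatures are close enough."""
    return hamming(sig_query, sig_db) <= max_dist


# Toy 2-D "codebook" standing in for 128-D SIFT cluster centers.
codebook = [(0, 0), (1, 0), (0, 3), (10, 10)]
table = build_support_table(codebook, k=2)
candidates = expand(0, table)  # [0, 1, 2]

# Toy 4-D "descriptors": similar shapes give identical signatures here.
sig_a = binary_signature([1, 5, 2, 8])
sig_b = binary_signature([2, 6, 1, 9])
sig_c = binary_signature([8, 1, 5, 2])
keep_ab = verify(sig_a, sig_b, max_dist=1)  # True: Hamming distance 0
keep_ac = verify(sig_a, sig_c, max_dist=1)  # False: Hamming distance 4
```

In this sketch the table is built once offline, so query-time expansion is a single list lookup, and verification costs one XOR and a bit count per candidate match.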

Acknowledgments

This work was supported as follows: Dr. Li was supported in part by NSFC under contract No. 61272316; Dr. Lu in part by the Research Enhancement Program (REP), start-up funding from Texas State University, and DoD HBCU/MI grant W911NF-12-1-0057; Dr. Tian in part by ARO grant W911NF-12-1-0057, NSF IIS 1052851, Faculty Research Awards by Google, NEC Laboratories of America, FXPAL, and UTSA START-R award.

Author information

Corresponding author

Correspondence to Wengang Zhou.

About this article

Cite this article

Zhou, W., Li, H., Lu, Y. et al. Visual word expansion and BSIFT verification for large-scale image search. Multimedia Systems 21, 245–254 (2015). https://doi.org/10.1007/s00530-013-0330-4

  • DOI: https://doi.org/10.1007/s00530-013-0330-4
