Tensor index for large scale image retrieval

Zheng, Liang; Wang, Shengjin; Guo, Peizhen; Liang, Hanyue; Tian, Qi

doi:10.1007/s00530-014-0415-8

Regular Paper
Published: 12 October 2014

Volume 21, pages 569–579, (2015)

Multimedia Systems

Liang Zheng¹,
Shengjin Wang¹,
Peizhen Guo¹,
Hanyue Liang¹ &
…
Qi Tian²

Abstract

Recently, the bag-of-words representation has been widely applied in image retrieval. In this model, the visual word is a core component. However, compared with text retrieval, one major problem in image retrieval is visual word ambiguity, i.e., a trade-off between precision and recall of visual matching. To address this problem, this paper proposes a tensor index structure that improves precision and recall simultaneously. Essentially, the tensor index is a multi-dimensional index structure. It combines the strengths of two state-of-the-art indexing strategies, i.e., the inverted multi-index [Babenko and Lempitsky (Computer vision and pattern recognition (CVPR), 2012 IEEE Conference, 3069–3076, 2012)] and the joint inverted index [Xia et al. (ICCV, 2013)], which were originally designed for approximate nearest neighbor search. This paper instead explores their use in image retrieval and provides insights into how to combine them effectively. We show that, on the one hand, the multi-index enhances the discriminative power of visual words, thus improving precision; on the other hand, the introduction of multiple codebooks corrects quantization artifacts, thus improving recall. Extensive experiments on two benchmark datasets demonstrate that the tensor index significantly improves on the baseline approach. Moreover, when methods such as Hamming embedding are incorporated, we achieve performance competitive with the state of the art.
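
The two effects described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the class name ToyTensorIndex, the helper nearest, the hand-picked one-dimensional codebooks, and the rule of taking the union of candidates across codebook pairs are all assumptions made for illustration. Each descriptor is split into two halves, each half is quantized with its own codebook, and the pair of word indices addresses one cell of a two-dimensional index (the multi-index idea). Several codebook pairs are kept side by side so that a match lost at one quantization boundary can be recovered by another (the multiple-codebook idea).

```python
from collections import defaultdict


def nearest(codebook, vec):
    """Return the index of the codeword closest to vec (squared Euclidean)."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((c - v) ** 2 for c, v in zip(codebook[i], vec)),
    )


class ToyTensorIndex:
    """Hypothetical sketch: one 2-D multi-index per codebook pair."""

    def __init__(self, codebook_pairs):
        # codebook_pairs: list of (codebook_for_first_half, codebook_for_second_half)
        self.codebook_pairs = codebook_pairs
        self.tables = [defaultdict(set) for _ in codebook_pairs]

    def _cell(self, pair, desc):
        # Quantize each half separately; the word pair addresses one cell.
        half = len(desc) // 2
        cb1, cb2 = pair
        return (nearest(cb1, desc[:half]), nearest(cb2, desc[half:]))

    def add(self, image_id, desc):
        # Index the descriptor once under every codebook pair.
        for pair, table in zip(self.codebook_pairs, self.tables):
            table[self._cell(pair, desc)].add(image_id)

    def query(self, desc):
        # Assumed merging rule: union of the matching cells of all pairs.
        candidates = set()
        for pair, table in zip(self.codebook_pairs, self.tables):
            candidates |= table.get(self._cell(pair, desc), set())
        return candidates


# Two hand-picked codebook pairs whose cell boundaries differ (0.5 vs 0.75).
pair_a = ([[0.0], [1.0]], [[0.0], [1.0]])
pair_b = ([[0.25], [1.25]], [[0.25], [1.25]])

single = ToyTensorIndex([pair_a])
multi = ToyTensorIndex([pair_a, pair_b])
for index in (single, multi):
    index.add("near", [0.45, 0.2])  # close to the query, across pair_a's boundary
    index.add("far", [0.1, 0.9])    # differs from the query in the second half

query = [0.55, 0.2]
print(single.query(query))  # the near neighbor is missed with one codebook pair
print(multi.query(query))   # the second codebook pair recovers it
```

In this toy setting, the two-dimensional cell keeps "far" out of the candidate set because its second half quantizes to a different word, while the second codebook pair recovers "near", which the first pair separates from the query at a cell boundary.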

References

Arandjelovic, R., Zisserman, A.: Three things everyone should know to improve object retrieval. In: Computer Vision and Pattern Recognition (CVPR), 2012 IEEE Conference, pp. 2911–2918. IEEE (2012)
Babenko, A., Lempitsky, V.: The inverted multi-index. In: Computer Vision and Pattern Recognition (CVPR), 2012 IEEE Conference, pp. 3069–3076. IEEE (2012)
Bai, S., Wang, X., Yao, C., Bai, X.: Multiple stage residual model for accurate image classification. In: Computer Vision-ACCV 2012. Springer (2014)
Boix, X., Roig, G., Leistner, C., Van Gool, L.: Nested sparse quantization for efficient feature coding. In: Computer Vision-ECCV 2012, pp. 744–758. Springer (2012)
Cai, J., Liu, Q., Chen, F., Joshi, D., Tian, Q.: Scalable image search with multiple index tables. In: Proceedings of International Conference on Multimedia Retrieval, p. 407. ACM (2014)
Cai, Y., Tong, W., Yang, L., Hauptmann, A.G.: Constrained keypoint quantization: towards better bag-of-words model for large-scale multimedia retrieval. In: Proceedings of the 2nd ACM International Conference on Multimedia Retrieval, p. 16. ACM (2012)
van Gemert, J.C., Veenman, C.J., Smeulders, A.W., Geusebroek, J.M.: Visual word ambiguity. Pattern Anal. Mach. Intell. IEEE Trans. 32(7), 1271–1283 (2010)
Huiskes, M.J., Thomee, B., Lew, M.S.: New trends and ideas in visual concept detection: the mir flickr retrieval evaluation initiative. In: Proceedings of the international conference on Multimedia information retrieval, pp. 527–536. ACM (2010)
Jégou, H., Chum, O.: Negative evidences and co-occurences in image retrieval: The benefit of pca and whitening. In: Computer Vision-ECCV 2012, pp. 774–787. Springer (2012)
Jegou, H., Douze, M., Schmid, C.: Hamming embedding and weak geometric consistency for large scale image search. In: Computer Vision-ECCV 2008, pp. 304–317. Springer (2008)
Jégou, H., Douze, M., Schmid, C.: On the burstiness of visual elements. In: Computer Vision and Pattern Recognition, 2009. CVPR 2009. IEEE Conference on, pp. 1169–1176. IEEE (2009)
Jégou, H., Douze, M., Schmid, C.: Improving bag-of-features for large scale image search. Int. J. Comput. Vis. 87(3), 316–336 (2010)
Jegou, H., Douze, M., Schmid, C.: Product quantization for nearest neighbor search. Pattern Anal. Mach. Intell. IEEE Trans. 33(1), 117–128 (2011)
Jegou, H., Schmid, C., Harzallah, H., Verbeek, J.: Accurate image search using the contextual dissimilarity measure. Pattern Anal. Mach. Intell. IEEE Trans. 32(1), 2–11 (2010)
LeCun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proc. IEEE 86(11), 2278–2324 (1998)
Liu, J., Wang, S.: Salient region detection via simple local and global contrast representation. Neurocomputing 147, 435–443 (2015)
Liu, S., Cui, P., Zhu, W., Yang, S., Tian, Q.: Social embedding image distance learning. In: Proceedings of the 20th ACM international conference on Multimedia (2014)
Liu, Z., Li, H., Zhou, W., Zhao, R., Tian, Q.: Contextual hashing for large-scale image search. Image Process. IEEE Trans. 23(4), 1606–1614 (2014)
Liu, Z., Wang, S., Zheng, L., Tian, Q.: Visual reranking with improved image graph. In: ICASSP, pp. 6889–6893. IEEE (2014)
Lowe, D.G.: Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vis. 60(2), 91–110 (2004)
Nister, D., Stewenius, H.: Scalable recognition with a vocabulary tree. In: Computer Vision and Pattern Recognition, 2006 IEEE Computer Society Conference, vol. 2, pp. 2161–2168. IEEE (2006)
Niu, Z., Hua, G., Gao, X., Tian, Q.: Context aware topic model for scene recognition. In: Computer Vision and Pattern Recognition (CVPR), 2012 IEEE Conference, pp. 2743–2750. IEEE (2012)
Philbin, J., Chum, O., Isard, M., Sivic, J., Zisserman, A.: Object retrieval with large vocabularies and fast spatial matching. In: Computer Vision and Pattern Recognition, 2007. CVPR’07. IEEE Conference, pp. 1–8. IEEE (2007)
Philbin, J., Chum, O., Isard, M., Sivic, J., Zisserman, A.: Lost in quantization: Improving particular object retrieval in large scale image databases. In: Computer Vision and Pattern Recognition, 2008. CVPR 2008. IEEE Conference, pp. 1–8. IEEE (2008)
Qin, D., Wengert, C., Van Gool, L.: Query adaptive similarity for large scale object retrieval. In: Computer Vision and Pattern Recognition (CVPR), 2013 IEEE Conference, pp. 1610–1617. IEEE (2013)
Shahbaz Khan, F., Anwer, R.M., van de Weijer, J., Bagdanov, A.D., Vanrell, M., Lopez, A.M.: Color attributes for object detection. In: Computer Vision and Pattern Recognition (CVPR), 2012 IEEE Conference, pp. 3306–3313. IEEE (2012)
Shen, X., Lin, Z., Brandt, J., Avidan, S., Wu, Y.: Object retrieval and localization with spatially-constrained similarity measure and k-nn re-ranking. In: Computer Vision and Pattern Recognition (CVPR), 2012 IEEE Conference, pp. 3013–3020. IEEE (2012)
Sivic, J., Zisserman, A.: Video google: A text retrieval approach to object matching in videos. In: Computer Vision, 2003. Proceedings. Ninth IEEE International Conference, pp. 1470–1477. IEEE (2003)
Su, B., Ding, X., Peng, L., Liu, C.: A novel baseline-independent feature set for arabic handwriting recognition. In: Document Analysis and Recognition (ICDAR), 2013 12th International Conference, pp. 1250–1254. IEEE (2013)
Su, Y., Fu, Y., Gao, X., Tian, Q.: Discriminant learning through multiple principal angles for visual recognition. Image Process. IEEE Trans. 21(3), 1381–1390 (2012)
Su, Y., Tao, D., Li, X., Gao, X.: Texture representation in aam using gabor wavelet and local binary patterns. In: Systems, Man and Cybernetics, 2009. SMC 2009. IEEE International Conference, pp. 3274–3279. IEEE (2009)
Wang, D., Lu, H., Yang, M.H.: Online object tracking with sparse prototypes. Image Process. IEEE Trans. 22(1), 314–325 (2013)
Wang, X., Yang, M., Cour, T., Zhu, S., Yu, K., Han, T.X.: Contextual weighting for vocabulary tree based image retrieval. In: Computer Vision (ICCV), 2011 IEEE International Conference, pp. 209–216. IEEE (2011)
Wang, Y., Liu, C., Ding, X.: Similar pattern discriminant analysis for improving chinese character recognition accuracy. In: Document Analysis and Recognition (ICDAR), 2013 12th International Conference, pp. 1056–1060. IEEE (2013)
Wengert, C., Douze, M., Jégou, H.: Bag-of-colors for improved image search. In: Proceedings of the 19th ACM international conference on Multimedia, pp. 1437–1440. ACM (2011)
Xia, Y., He, K., Wen, F., Sun, J.: Joint inverted index. In: ICCV (2013)
Xie, L., Tian, Q., Zhang, B.: Spatial pooling of heterogeneous features for image classification. Image Process. IEEE Trans. 23(5), 1994–2008 (2013)
Yang, J., Yu, K., Gong, Y., Huang, T.: Linear spatial pyramid matching using sparse coding for image classification. In: Computer Vision and Pattern Recognition, 2009. CVPR 2009. IEEE Conference, pp. 1794–1801. IEEE (2009)
Yang, Y., Liu, J.: Exploring the large-scale tdoa feature space for speaker diarization. In: HCI International 2014-Posters Extended Abstracts, pp. 551–556. Springer (2014)
Yuan, H., Qian, Y., Zhao, J., Liu, J.: Mispronunciation detection with an optimized detection network and multi-layer perception based features. J. Tsinghua Univ. (Sci. Technol.) 4, 027 (2012)
Zhang, S., Yang, M., Cour, T., Yu, K., Metaxas, D.N.: Query specific fusion for image retrieval. In: Computer Vision-ECCV 2012, pp. 660–673. Springer (2012)
Zhang, S., Yang, M., Wang, X., Lin, Y., Tian, Q.: Semantic-aware co-indexing for image retrieval. In: Computer Vision (ICCV), 2013 IEEE International Conference, pp. 1673–1680. IEEE (2013)
Zhang, Y., Jia, Z., Chen, T.: Image retrieval with geometry-preserving visual phrases. In: Computer Vision and Pattern Recognition (CVPR), 2011 IEEE Conference, pp. 809–816. IEEE (2011)
Zheng, L., Wang, S.: Visual phraselet: Refining spatial constraints for large scale image search. Signal Process. Lett. IEEE 20(4), 391–394 (2013)
Zheng, L., Wang, S., Tian, Q.: Coupled binary embedding for large-scale image retrieval. Image Process. IEEE Trans. 23(8), 3368–3380 (2014)
Zheng, L., Wang, S., Tian, Q.: Lp-norm idf for scalable image retrieval. Image Process. IEEE Trans. 23(8), 3604–3617 (2014)
Zheng, L., Wang, S., Zhou, W., Tian, Q.: Bayes merging of multiple vocabularies for scalable image retrieval. In: CVPR (2014)
Zhou, W., Lu, Y., Li, H., Tian, Q.: Scalar quantization for large scale image search. In: Proceedings of the 20th ACM international conference on Multimedia, pp. 169–178. ACM (2012)

Acknowledgements

This work was supported by the National High Technology Research and Development Program of China (863 program) under Grant No. 2012AA011004 and the National Science and Technology Support Program under Grant No. 2013BAK02B04. This work was supported in part to Dr. Qi Tian by ARO grant W911NF-12-1-0057 and Faculty Research Awards by NEC Laboratories of America. This work was supported in part by National Science Foundation of China (NSFC) 61429201.

Author information

Authors and Affiliations

Department of Electronic Engineering, Tsinghua University, Beijing, 100084, China
Liang Zheng, Shengjin Wang, Peizhen Guo & Hanyue Liang
University of Texas, San Antonio, TX, 78249, USA
Qi Tian

Corresponding authors

Correspondence to Shengjin Wang or Qi Tian.

Additional information

Communicated by F. Wu.

About this article

Cite this article

Zheng, L., Wang, S., Guo, P. et al. Tensor index for large scale image retrieval. Multimedia Systems 21, 569–579 (2015). https://doi.org/10.1007/s00530-014-0415-8

Received: 08 April 2014
Accepted: 26 August 2014
Published: 12 October 2014
Issue Date: November 2015
DOI: https://doi.org/10.1007/s00530-014-0415-8
