Multiple complementary inverted indexing based on multiple metrics

Multimedia Tools and Applications

Abstract

Inverted indexing based on vector quantization is a popular technique in large-scale information retrieval. Vector quantization under a given similarity metric partitions the sample space into Voronoi cells, and the samples in each cell are indexed by an inverted list. The nearest neighbors of a query are then identified efficiently by looking up the cell in which the query is located. To improve recall, the sample space has been partitioned multiple times with different k-means initializations to build multiple inverted indexes. However, with a single similarity metric, e.g., Euclidean distance, the multiple inverted indexes may be highly correlated, which limits the possible gain in recall. This paper presents a new multiple inverted indexing method that partitions the sample space multiple times using multiple different similarity metrics. Furthermore, several techniques for defining the multiple metrics are investigated empirically. Experiments are conducted on three representative datasets, namely million-scale SIFT and GIST feature sets and a feature set produced by deep learning, to evaluate the effectiveness of the proposed method. The experimental results show that the proposed method is competitive with state-of-the-art inverted indexing methods in terms of recall and retrieval time, and that the Latin hypercube weighting method generates more diverse metrics and yields a larger gain in recall.
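The idea summarized in the abstract can be illustrated with a small sketch. The abstract does not give the exact form of the metrics, so the code below assumes, purely for illustration, that each metric is a Euclidean distance with per-dimension weights and that the weight vectors are drawn by Latin hypercube sampling. All function names (`latin_hypercube`, `build_index`, `candidates`) and parameter values are hypothetical and are not taken from the paper.

```python
import random

def latin_hypercube(n_samples, n_dims, rng):
    """Sample n_samples points in [0, 1)^n_dims with one point per stratum in every dimension."""
    columns = []
    for _ in range(n_dims):
        strata = list(range(n_samples))
        rng.shuffle(strata)  # independent stratum order per dimension
        columns.append([(s + rng.random()) / n_samples for s in strata])
    return [[columns[d][i] for d in range(n_dims)] for i in range(n_samples)]

def weighted_sq_dist(a, b, w):
    """Squared Euclidean distance with per-dimension weights w (assumed metric form)."""
    return sum(wi * (ai - bi) ** 2 for ai, bi, wi in zip(a, b, w))

def nearest(point, centroids, w):
    """Index of the Voronoi cell (centroid) closest to point under the weighted metric."""
    return min(range(len(centroids)),
               key=lambda c: weighted_sq_dist(point, centroids[c], w))

def kmeans(data, k, w, rng, iters=10):
    """Plain Lloyd iterations; with positive diagonal weights the cell mean is still optimal."""
    centroids = [list(p) for p in rng.sample(data, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in data:
            groups[nearest(p, centroids, w)].append(p)
        for c, members in enumerate(groups):
            if members:  # keep the old centroid if a cell is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids

def build_index(data, k, w, rng):
    """One inverted index: centroids plus, per cell, the list of sample ids in it."""
    centroids = kmeans(data, k, w, rng)
    lists = [[] for _ in range(k)]
    for i, p in enumerate(data):
        lists[nearest(p, centroids, w)].append(i)
    return centroids, lists, w

def candidates(query, indexes):
    """Union of the inverted lists of the cells the query falls into, one cell per index."""
    found = set()
    for centroids, lists, w in indexes:
        found.update(lists[nearest(query, centroids, w)])
    return found

rng = random.Random(0)
dims, n_points, n_cells, n_indexes = 8, 500, 16, 4
data = [[rng.gauss(0, 1) for _ in range(dims)] for _ in range(n_points)]
# One weight vector per index; the 0.1 offset keeps every weight strictly positive.
weights = [[0.1 + u for u in row] for row in latin_hypercube(n_indexes, dims, rng)]
indexes = [build_index(data, n_cells, w, rng) for w in weights]

query = data[0]
single = candidates(query, indexes[:1])  # one index, one metric
multi = candidates(query, indexes)       # several indexes, several metrics
print(len(single), len(multi))
```

In this toy setting the candidate set from several indexes is always a superset of the set from one index. How much it grows depends on how differently the metrics partition the space, which is the kind of diversity the abstract attributes to Latin hypercube weighting.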

Notes

  1. http://caffe.berkeleyvision.org/

  2. https://github.com/convolutionROC/Inverted-index

Acknowledgements

This work is supported by the National Natural Science Foundation of China under Grant Nos. 61473271 and 61331015.

Author information

Corresponding author

Correspondence to Bin Li.

About this article

Cite this article

Zhang, K., Zhou, W., Sun, S. et al. Multiple complementary inverted indexing based on multiple metrics. Multimed Tools Appl 78, 7727–7747 (2019). https://doi.org/10.1007/s11042-018-6439-x

  • DOI: https://doi.org/10.1007/s11042-018-6439-x
