DOI: 10.1145/2393347.2393378

Query-driven iterated neighborhood graph search for large scale indexing

Published: 29 October 2012

Abstract

In this paper, we address the approximate nearest neighbor (ANN) search problem over large-scale visual descriptors. We investigate a simple but very effective approach, neighborhood graph search, which constructs a neighborhood graph to index the data points and answers ANN queries with a local search that expands neighborhoods in a best-first manner. Our empirical analysis shows that neighborhood expansion is very efficient, locating a new NN candidate at O(1) cost, and has a high chance of locating true NNs, so it usually performs well. However, it often yields sub-optimal solutions, because local search checks only the neighborhood of the current solution; otherwise it must conduct exhaustive, continuous neighborhood expansions to find better solutions, which degrades query efficiency.
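As a rough illustration of the baseline described above, the following is a minimal sketch of best-first search over a k-NN graph. It is not the authors' implementation: the brute-force graph builder, the function names `build_knn_graph` and `best_first_search`, the Euclidean distance, and the `max_expansions` budget are all assumptions made for this example.

```python
import heapq
import math


def build_knn_graph(points, k):
    """Brute-force k-NN graph, O(n^2); for illustration only (assumed helper)."""
    graph = []
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        others.sort(key=lambda j: math.dist(p, points[j]))
        graph.append(others[:k])
    return graph


def best_first_search(points, graph, query, start, max_expansions):
    """Expand the closest unexpanded candidate first, up to a fixed budget.

    Returns (index, distance) of the best point seen. The budget is an
    assumed stopping rule, not one taken from the paper.
    """
    def dist(i):
        return math.dist(points[i], query)

    visited = {start}
    frontier = [(dist(start), start)]  # min-heap keyed on distance to query
    best = frontier[0]
    expansions = 0
    while frontier and expansions < max_expansions:
        _, node = heapq.heappop(frontier)
        expansions += 1
        # Each neighbor is one more candidate reached by a single list lookup.
        for nb in graph[node]:
            if nb in visited:
                continue
            visited.add(nb)
            d = dist(nb)
            heapq.heappush(frontier, (d, nb))
            if d < best[0]:
                best = (d, nb)
    return best[1], best[0]
```

With points on a line and a query near one of them, the search walks along graph edges toward the query. If the graph has no path from the start point to the true nearest neighbor, the same procedure can only return the best point in the reachable region, which is the sub-optimality the abstract refers to.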
We propose a query-driven iterated neighborhood graph search approach to improve the performance. We follow the iterated local search (ILS) strategy, widely used in combinatorial optimization, to find a solution beyond a local optimum. We handle the key challenge in adapting neighborhood graph search to ILS: perturbation, which generates a new pivot from which to restart a local search. To this end, we present a criterion to check whether the local search over a neighborhood graph has arrived at a local solution. Moreover, we exploit the query and the search history to design the perturbation scheme, resulting in a more effective search. The major benefit is that unnecessary neighborhood expansions are avoided, so true NNs are found more efficiently. Experimental results on large-scale SIFT matching, similar image search, and shape retrieval with non-metric distance measures show that our approach performs much better than previous state-of-the-art ANN search approaches.
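The iterated strategy can be sketched as a loop of local searches with restarts. The abstract does not spell out the local-optimum criterion or the perturbation scheme, so both are stand-ins here and should be read as assumptions: the local search stops after `patience` consecutive expansions without improvement, and the new pivot is the point closest to the query among a small random sample of points not yet visited (which uses the query and the search history only in a very simple way). All names are hypothetical.

```python
import heapq
import math
import random


def build_knn_graph(points, k):
    """Brute-force k-NN graph, O(n^2); for illustration only (assumed helper)."""
    graph = []
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        others.sort(key=lambda j: math.dist(p, points[j]))
        graph.append(others[:k])
    return graph


def local_search(points, graph, query, pivot, visited, patience):
    """Best-first expansion from `pivot`, sharing `visited` across restarts.

    Stand-in criterion (an assumption): stop after `patience` consecutive
    expansions that do not improve the best distance.
    """
    def dist(i):
        return math.dist(points[i], query)

    visited.add(pivot)
    frontier = [(dist(pivot), pivot)]
    best = frontier[0]
    stale = 0
    while frontier and stale < patience:
        _, node = heapq.heappop(frontier)
        improved = False
        for nb in graph[node]:
            if nb in visited:
                continue
            visited.add(nb)
            d = dist(nb)
            heapq.heappush(frontier, (d, nb))
            if d < best[0]:
                best, improved = (d, nb), True
        stale = 0 if improved else stale + 1
    return best


def iterated_search(points, graph, query, first_pivot, restarts, patience,
                    rng, sample_size=8):
    """Local search plus up to `restarts` perturbation steps.

    Stand-in perturbation (an assumption): sample unvisited points and
    restart from the sampled point closest to the query.
    """
    visited = set()
    best = local_search(points, graph, query, first_pivot, visited, patience)
    for _ in range(restarts):
        unvisited = [i for i in range(len(points)) if i not in visited]
        if not unvisited:
            break
        sample = rng.sample(unvisited, min(sample_size, len(unvisited)))
        pivot = min(sample, key=lambda i: math.dist(points[i], query))
        best = min(best, local_search(points, graph, query, pivot,
                                      visited, patience))
    return best[1], best[0]
```

On a graph made of two disconnected clusters, a single local search started in the wrong cluster cannot leave it, whereas one restart from an unvisited pivot reaches the cluster containing the query's nearest neighbor. Sharing the `visited` set across restarts keeps later searches from re-expanding points already examined.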

Published In

MM '12: Proceedings of the 20th ACM international conference on Multimedia
October 2012
1584 pages
ISBN: 9781450310895
DOI: 10.1145/2393347

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. ann search
  2. iterated
  3. local neighborhood graph search
  4. query-driven

Qualifiers

  • Research-article

Conference

MM '12
MM '12: ACM Multimedia Conference
October 29 - November 2, 2012
Nara, Japan

Acceptance Rates

Overall Acceptance Rate 2,145 of 8,556 submissions, 25%
