
Space-time tradeoffs for approximate nearest neighbor searching

Published: 27 November 2009

Abstract

Nearest neighbor searching is the problem of preprocessing a set of n points in d-dimensional space so that, given any query point q, it is possible to report the closest point to q rapidly. In approximate nearest neighbor searching, a parameter ε > 0 is given, and a multiplicative error of (1 + ε) is allowed. We assume that the dimension d is a constant and treat n and ε as asymptotic quantities. Numerous solutions have been proposed, ranging from low-space solutions having space O(n) and query time O(log n + 1/ε^{d−1}) to high-space solutions having space roughly O((n log n)/ε^d) and query time O(log(n/ε)).
We show that there is a single approach to this fundamental problem, which both improves upon existing results and spans the spectrum of space-time tradeoffs. Given a tradeoff parameter γ, where 2 ≤ γ ≤ 1/ε, we show that there exists a data structure of space O(nγ^{d−1} log(1/ε)) that can answer queries in time O(log(nγ) + 1/(εγ)^{(d−1)/2}). When γ = 2, this yields a data structure of space O(n log(1/ε)) that can answer queries in time O(log n + 1/ε^{(d−1)/2}). When γ = 1/ε, it provides a data structure of space O((n/ε^{d−1}) log(1/ε)) that can answer queries in time O(log(n/ε)).
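To get a feel for how γ moves a structure along the tradeoff, the two bounds can be evaluated numerically. The following is only an illustrative sketch: the function name avd_tradeoff is hypothetical, all hidden constants are dropped, and base-2 logarithms are an arbitrary choice, so the returned values are growth terms rather than actual memory sizes or running times.

```python
import math


def avd_tradeoff(n, d, eps, gamma):
    """Growth terms of the stated bounds, with constant factors dropped.

    space ~ n * gamma^(d-1) * log(1/eps)
    query ~ log(n * gamma) + 1 / (eps * gamma)^((d-1)/2)
    """
    if not (2 <= gamma <= 1 / eps):
        raise ValueError("gamma must satisfy 2 <= gamma <= 1/eps")
    space = n * gamma ** (d - 1) * math.log2(1 / eps)
    query = math.log2(n * gamma) + 1 / (eps * gamma) ** ((d - 1) / 2)
    return space, query


# Sweep gamma between its two extremes for one (assumed) problem size.
n, d, eps = 1024, 3, 1 / 16
for gamma in (2, 4, 8, 16):
    space, query = avd_tradeoff(n, d, eps, gamma)
    print(f"gamma={gamma:>2}  space term={space:>9.0f}  query term={query:.1f}")
```

As γ grows from 2 to 1/ε, the space term grows by a factor of γ^{d−1} while the 1/(εγ)^{(d−1)/2} part of the query term shrinks to a constant, which matches the two extreme cases stated above.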
Our results are based on a data structure called a (t,ε)-AVD, which is a hierarchical quadtree-based subdivision of space into cells. Each cell stores up to t representative points of the set, such that for any query point q in the cell at least one of these points is an approximate nearest neighbor of q. We provide new algorithms for constructing AVDs and tools for analyzing their total space requirements. We also establish lower bounds on the space complexity of AVDs, and show that, up to a factor of O(log (1/ε)), our space bounds are asymptotically tight in the two extremes, γ = 2 and γ = 1/ε.
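The query side of such a structure can be sketched in a few lines. This is a toy illustration, not the construction from the article: the Cell class, the locate and query functions, and the hand-picked representatives are all hypothetical, the tree is a single-level quadtree in the plane, and nothing here guarantees a (1 + ε) approximation. Choosing representatives so that the guarantee holds, and locating the cell in logarithmic time, are exactly what the article's algorithms address.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Cell:
    lo: tuple                                      # lower corner of the cell's box
    hi: tuple                                      # upper corner of the cell's box
    reps: list = field(default_factory=list)       # up to t representatives (leaves)
    children: list = field(default_factory=list)   # empty for a leaf cell

    def contains(self, q):
        return all(l <= x <= h for l, x, h in zip(self.lo, q, self.hi))


def locate(cell, q):
    """Descend to a leaf cell containing q (assumes children cover the parent)."""
    while cell.children:
        cell = next(c for c in cell.children if c.contains(q))
    return cell


def query(root, q):
    """Return the closest of the representatives stored in q's leaf cell."""
    leaf = locate(root, q)
    return min(leaf.reps, key=lambda p: math.dist(p, q))


# Hand-built example: the unit square split into four quadrants, t = 2.
root = Cell(lo=(0.0, 0.0), hi=(1.0, 1.0), children=[
    Cell(lo=(0.0, 0.0), hi=(0.5, 0.5), reps=[(0.1, 0.1)]),
    Cell(lo=(0.5, 0.0), hi=(1.0, 0.5), reps=[(0.9, 0.2), (0.6, 0.4)]),
    Cell(lo=(0.0, 0.5), hi=(0.5, 1.0), reps=[(0.2, 0.8)]),
    Cell(lo=(0.5, 0.5), hi=(1.0, 1.0), reps=[(0.9, 0.9), (0.6, 0.6)]),
])
print(query(root, (0.7, 0.3)))
```

The point of the sketch is the cost split: one point location to find the cell, followed by at most t distance computations. The plain top-down descent used here is a simplification of the point-location step.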


    Published In

    Journal of the ACM  Volume 57, Issue 1
    November 2009
    149 pages
    ISSN:0004-5411
    EISSN:1557-735X
    DOI:10.1145/1613676
    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Accepted: 01 August 2009
    Revised: 01 August 2009
    Received: 01 November 2008

    Author Tags

    1. Nearest neighbor searching
    2. space-time tradeoffs

    Qualifiers

    • Research-article
    • Research
    • Refereed
