Fixed Queries Array: A Fast and Economical Data Structure for Proximity Searching

Multimedia Tools and Applications

Abstract

Pivot-based algorithms are effective tools for proximity searching in metric spaces. They allow trading space overhead for the number of distance evaluations performed at query time. With additional search structures (which incur extra space overhead) they can also reduce the amount of side computations. We introduce a new data structure, the Fixed Queries Array (FQA), whose novelties are that (1) it permits sublinear extra CPU time without any extra data structure, and (2) it permits trading the number of pivots for their precision so as to make better use of the available memory. We show experimentally that the FQA is an efficient tool for searching in metric spaces and that it compares favorably against other state-of-the-art approaches. Its simplicity makes it an effective tool for practitioners seeking a black-box method to plug into their applications.
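
To illustrate the space-for-distance-evaluations trade-off that pivot-based methods offer, the following is a minimal sketch of generic pivot filtering for range queries. It is not the FQA itself: it stores a plain table of precomputed distances and scans it linearly, which is exactly the side computation the FQA aims to make sublinear. The names build_pivot_table, range_query, and abs_dist are hypothetical and chosen only for this example.

```python
from typing import Callable, List, Sequence, Tuple


def build_pivot_table(db: Sequence, pivots: Sequence,
                      dist: Callable) -> List[List[float]]:
    """Precompute d(u, p) for every database element u and every pivot p."""
    return [[dist(u, p) for p in pivots] for u in db]


def range_query(q, r: float, db: Sequence, pivots: Sequence,
                table: List[List[float]],
                dist: Callable) -> Tuple[list, int]:
    """Return the elements within distance r of q, plus the number of
    distance evaluations performed at query time."""
    # One distance evaluation per pivot.
    dq = [dist(q, p) for p in pivots]
    evaluations = len(pivots)
    results = []
    for u, du in zip(db, table):
        # Triangle inequality: |d(q,p) - d(u,p)| <= d(q,u) for every pivot p.
        # If the lower bound exceeds r for some pivot, u cannot be an answer.
        if any(abs(a - b) > r for a, b in zip(dq, du)):
            continue
        # u survived the filter, so it must be compared with q directly.
        evaluations += 1
        if dist(q, u) <= r:
            results.append(u)
    return results, evaluations


def abs_dist(a: float, b: float) -> float:
    """Toy metric on numbers, used only for the usage example."""
    return abs(a - b)


db = [1, 5, 9, 14, 20]
pivots = [1, 20]
table = build_pivot_table(db, pivots, abs_dist)
found, evals = range_query(8, 2, db, pivots, table, abs_dist)
print(found, evals)
```

In this toy run the query costs three distance evaluations (two against the pivots, one against the single surviving candidate) instead of five for an exhaustive scan. Adding pivots sharpens the filter but enlarges the table, which is the memory trade-off the abstract refers to.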

Author information

Corresponding author

Correspondence to Gonzalo Navarro.

About this article

Cite this article

Chávez, E., Marroquín, J.L. & Navarro, G. Fixed Queries Array: A Fast and Economical Data Structure for Proximity Searching. Multimedia Tools and Applications 14, 113–135 (2001). https://doi.org/10.1023/A:1011343115154

  • DOI: https://doi.org/10.1023/A:1011343115154