D-Index: Distance Searching Index for Metric Data Sets

Multimedia Tools and Applications

Abstract

To speed up retrieval in large collections of data, index structures partition the data into subsets so that queries can be evaluated without examining the entire collection. As the complexity of modern data types grows, metric spaces have become a popular paradigm for similarity retrieval. We propose a new index structure, called D-Index, that combines a novel clustering technique with the pivot-based distance searching strategy to speed up the execution of similarity range and nearest neighbor queries on large files of objects stored on disk. We have qualitatively analyzed D-Index and verified its properties on an actual implementation. We have also compared D-Index with other index structures and demonstrated its superiority on several real-life data sets. Unlike tree organizations, the D-Index structure is suitable for dynamic environments with a high rate of delete/insert operations.
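The pivot-based distance searching strategy mentioned in the abstract can be illustrated with a generic sketch. This is not the D-Index itself, and it omits the clustering technique entirely. It only shows how precomputed distances to pivots, combined with the triangle inequality, let a range query discard objects without computing their distance to the query. The names `PivotTable`, `range_query`, and `euclidean` are hypothetical and chosen for illustration; the Euclidean metric and the in-memory layout are assumptions.

```python
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, ...]


def euclidean(a: Point, b: Point) -> float:
    """Euclidean distance, used here as an example metric (an assumption)."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


class PivotTable:
    """Hypothetical in-memory pivot table; a sketch, not the D-Index."""

    def __init__(
        self,
        objects: Sequence[Point],
        pivots: Sequence[Point],
        dist: Callable[[Point, Point], float] = euclidean,
    ) -> None:
        self.objects = list(objects)
        self.pivots = list(pivots)
        self.dist = dist
        # Precompute d(o, p) for every object o and pivot p at build time.
        self.table = [[dist(o, p) for p in self.pivots] for o in self.objects]

    def range_query(self, q: Point, r: float) -> Tuple[List[Point], int]:
        """Return objects within distance r of q and the number of
        object-to-query distances that actually had to be computed."""
        q_to_p = [self.dist(q, p) for p in self.pivots]
        hits: List[Point] = []
        computed = 0
        for obj, row in zip(self.objects, self.table):
            # Triangle inequality: |d(q, p) - d(o, p)| <= d(q, o) for any p,
            # so the largest such difference is a lower bound on d(q, o).
            lower = max(
                (abs(dq - do) for dq, do in zip(q_to_p, row)), default=0.0
            )
            if lower > r:
                continue  # obj cannot qualify; skip the distance computation
            computed += 1
            if self.dist(q, obj) <= r:
                hits.append(obj)
        return hits, computed


data = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0), (10.0, 10.0)]
index = PivotTable(data, pivots=[(0.0, 0.0)])
hits, computed = index.range_query((0.5, 0.0), 1.0)
print(hits, computed)
```

In this small example the single pivot is enough to exclude the two distant objects, so only two of the four object distances are evaluated. The filter never discards a true answer, because the bound it uses holds for any metric.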

Cite this article

Dohnal, V., Gennaro, C., Savino, P. et al. D-Index: Distance Searching Index for Metric Data Sets. Multimedia Tools and Applications 21, 9–33 (2003). https://doi.org/10.1023/A:1025026030880

  • DOI: https://doi.org/10.1023/A:1025026030880