Pivot Selection Method for Optimizing both Pruning and Balancing in Metric Space Indexes

  • Conference paper
Database and Expert Systems Applications (DEXA 2010)

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 6262))

Abstract

We investigate how to reduce the cost of nearest neighbor searches in metric spaces. Many similarity search indexes recursively divide a region into subregions by using pivots and build a tree-structured index. A problem with existing indexes is that they focus only on pruning objects and do not take tree balance into account. Their balance therefore depends on the data distribution, and they do not reduce the search cost for all data. We propose a similarity search index called the Partitioning Capacity Tree (PCTree). PCTree automatically optimizes pivot selection based on both the balance of the regions partitioned by a pivot and the estimated effectiveness of search pruning by that pivot. As a result, PCTree reduces the search cost for various data distributions. Our evaluations comparing PCTree with four indexes on three real datasets showed that it reduces the search cost and handles various data distributions well.
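
The abstract does not give PCTree's actual selection criterion, so the following is only a minimal sketch of the general idea of scoring a pivot on both balance and pruning. Everything in it is an assumption made for illustration: the function names, the use of the mean pivot distance as the partition boundary, the binary entropy of the split as the balance measure, the equal default weighting, and the use of the indexed objects as stand-ins for future query centers. The pruning test itself is the standard triangle-inequality argument: a range query with center q and radius r can skip one side of a ball partition around pivot p whenever |d(p, q) - boundary| > r.

```python
import math

def euclidean(a, b):
    """Euclidean distance; any metric obeying the triangle inequality works."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def binary_entropy(p):
    """Entropy in bits of a two-way split with fraction p on one side."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

def score_pivot(pivot, objects, radius, dist=euclidean):
    """Return (balance, prunable, boundary) for a ball partition around pivot.

    balance:  1.0 for an even split, 0.0 when all objects fall on one side.
    prunable: fraction of objects that, used as query centers with the given
              radius, would let the search skip one side of the partition.
    """
    ds = [dist(pivot, o) for o in objects]
    boundary = sum(ds) / len(ds)  # assumed boundary: mean distance to pivot
    inside = sum(1 for d in ds if d <= boundary)
    balance = binary_entropy(inside / len(ds))
    prunable = sum(1 for d in ds if abs(d - boundary) > radius) / len(ds)
    return balance, prunable, boundary

def select_pivot(candidates, objects, radius, weight=0.5, dist=euclidean):
    """Pick the candidate with the highest weighted balance/pruning score.

    Returns (combined_score, pivot, boundary). The linear weighting is a
    hypothetical combination rule, not the one used by PCTree.
    """
    best = None
    for c in candidates:
        balance, prunable, boundary = score_pivot(c, objects, radius, dist)
        combined = weight * balance + (1 - weight) * prunable
        if best is None or combined > best[0]:
            best = (combined, c, boundary)
    return best
```

A tree index would call something like `select_pivot` once per node, split the node's objects at the returned boundary, and recurse on each side. A pivot that prunes well but leaves almost all objects on one side yields a deep, skewed tree, which is the situation the abstract says existing indexes overlook.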



Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Kurasawa, H., Fukagawa, D., Takasu, A., Adachi, J. (2010). Pivot Selection Method for Optimizing both Pruning and Balancing in Metric Space Indexes. In: Bringas, P.G., Hameurlain, A., Quirchmayr, G. (eds) Database and Expert Systems Applications. DEXA 2010. Lecture Notes in Computer Science, vol 6262. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15251-1_10

  • DOI: https://doi.org/10.1007/978-3-642-15251-1_10

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-15250-4

  • Online ISBN: 978-3-642-15251-1

  • eBook Packages: Computer Science, Computer Science (R0)
