An eigenvalue-based pivot selection strategy for efficient indexing and searching in metric spaces

Kim, Sung-Hwan; Lee, Da-Young; Cho, Hwan-Gue

doi:10.1007/s10586-017-1153-4


Published: 12 September 2017

Volume 20, pages 3643–3655 (2017)

Cluster Computing

Sung-Hwan Kim¹,
Da-Young Lee¹ &
Hwan-Gue Cho¹


Abstract

Pivots are used widely during indexing and searching in metric spaces. We maintain the distances from pivots to data objects to be indexed so the pre-computed distances can be used to prune unpromising objects during the search process. The search efficiency depends on the pivots used, but choosing good pivots is a challenging task. In this paper, we propose a new pivot selection method that incrementally chooses pivots using an eigenvalue-based uncorrelatedness scoring function. We also present a GPU implementation for computing the uncorrelatedness score in order to accelerate the pivot selection process. Our experimental results demonstrated that the proposed method performed better than other previously described pivot selection methods.
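The pruning step mentioned in the abstract can be illustrated with a small sketch. It relies only on the triangle inequality: for any pivot p, |d(q, p) − d(o, p)| is a lower bound on d(q, o), so an object whose bound exceeds the query radius can be discarded without computing its distance to the query. The function names (`build_pivot_table`, `range_query`), the Euclidean metric, and the flat table layout are assumptions made for illustration; they are not taken from the paper, and the paper's eigenvalue-based pivot selection is not reproduced here.

```python
import math
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, ...]
Metric = Callable[[Point, Point], float]


def euclidean(a: Point, b: Point) -> float:
    """Example metric; any function satisfying the metric axioms would do."""
    return math.dist(a, b)


def build_pivot_table(data: Sequence[Point], pivots: Sequence[Point],
                      dist: Metric) -> List[List[float]]:
    """Pre-compute d(object, pivot) for every object and every pivot."""
    return [[dist(obj, p) for p in pivots] for obj in data]


def range_query(query: Point, radius: float, data: Sequence[Point],
                pivots: Sequence[Point], table: Sequence[Sequence[float]],
                dist: Metric) -> Tuple[List[Point], int]:
    """Return objects within `radius` of `query`, plus the number of
    object distances that actually had to be computed."""
    query_to_pivots = [dist(query, p) for p in pivots]
    results: List[Point] = []
    computed = 0
    for obj, row in zip(data, table):
        # Triangle inequality: |d(q,p) - d(o,p)| <= d(q,o) for every pivot p.
        lower_bound = max(abs(qp - op) for qp, op in zip(query_to_pivots, row))
        if lower_bound > radius:
            continue  # pruned without computing d(query, obj)
        computed += 1
        if dist(query, obj) <= radius:
            results.append(obj)
    return results, computed
```

How many objects survive the lower-bound test depends on which pivots are used, which is why pivot selection affects search cost.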

References

1. Beecks, C., Lokoč, J., Seidl, T., Skopal, T.: Indexing the signature quadratic form distance for efficient content-based multimedia retrieval. In: Proceedings of the 1st ACM International Conference on Multimedia Retrieval (ICMR), p. 24 (2011)
2. Böhm, C., Berchtold, S., Keim, D.A.: Searching in high-dimensional spaces-index structures for improving the performance of multimedia databases. ACM Comput. Surv. 33(3), 322–373 (2001)
3. Böhm, C., Braunmüller, B., Breunig, M., Kriegel, H.P.: High performance clustering based on the similarity join. In: Proceedings of the 9th International Conference on Information and Knowledge Management (CIKM), pp. 298–305 (2000)
4. Brin, S.: Near neighbor search in large metric spaces. In: Proceedings of the 21st Conference on Very Large Databases (VLDB), pp. 574–584 (1995)
5. Bustos, B., Navarro, G., Chavez, E.: Pivot selection techniques for proximity searching in metric spaces. Pattern Recognit. Lett. 24, 2357–2366 (2003)
6. Chavez, E., Navarro, G., Baeza-Yates, R., Marroquin, J.: Searching in metric spaces. ACM Comput. Surv. 33(3), 273–321 (2001)
7. Chen, L., Gao, Y., Li, X., Jensen, C.S., Chen, G.: Efficient metric indexing for similarity search. In: Proceedings of the 31st IEEE International Conference on Data Engineering (ICDE), pp. 591–602 (2015)
8. Coskun, B., Giura, P.: Mitigating SMS spam by online detection of repetitive near-duplicate messages. In: Proceedings of the IEEE International Conference on Communication (ICC), pp. 999–1004 (2012)
9. Farago, A., Linder, T., Lugosi, G.: Fast nearest-neighbor search in dissimilarity spaces. IEEE Trans. Pattern Anal. Mach. Intell. 15(9), 957–962 (1999)
10. Traina Jr., C., Filho, R.F.S., Traina, A.J.M., Vieira, M.R.: The omni-family of all purpose access method: a simple and effective way to make similarity search more efficient. VLDB J. 16, 483–505 (2007)
11. Kim, S.H., Lee, D.Y., Cho, H.G.: An eigenvalue-based pivot selection strategy for improving search efficiency in metric spaces. In: Proceedings of the 2016 International Conference on Big Data and Smart Computing (BigComp), pp. 207–214 (2016)
12. Mao, R., Miranker, L., Miranker, D.P.: Pivot selection: Dimension reduction for distance-based indexing. J. Discret. Algorithms 13, 32–46 (2012)
13. Mao, R., Liu, S., Xu, H., Zhang, D., Miranker, D.P.: On data partitioning in tree structure metric-space indexes. In: Lecture Notes in Computer Science: Database Systems for Advanced Applications, vol. 8421, pp. 141–155 (2014)
14. Micó, M.L., Oncina, J.: A new version of the nearest-neighbour approximating and eliminating search algorithm (AESA) with linear preprocessing time and memory requirements. Pattern Recognit. Lett. 15(1), 9–17 (1994)
15. Sarawagi, S., Bhamidipaty, A.: Interactive deduplication using active learning. In: Proceedings of the 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), pp. 269–278 (2002)
16. Savary, A.: Typographical nearest-neighbor search in a finite-state lexicon and its application to spelling correction. In: Proceedings of the 6th International Conference on Implementation and Application of Automata (CIAA), pp. 251–260 (2001)
17. Uhlmann, J.: Satisfying general proximity/similarity queries with metric trees. Inf. Process. Lett. 40, 175–179 (1991)
18. Uribe-Paredes, R., Valero-Lara, ., Arias, E., Sánchez, J.L., Cazorla, D.: A GPU-based implementation for range queries on spaghettis data structure. In: Proceedings of the 11th International Conference on Computational Science and Its Applications. Lecture Notes in Computer Science, vol. 6782, pp. 615–629 (2011)
19. Yoon, T., Park, S.Y., Cho, H.G.: A smart filtering system for newly coined profanities by using approximate string alignment. In: Proceedings of the 10th IEEE International Conference on Computer and Information Technology (CIT), pp. 643–650 (2010)
20. Zhou, X., Wang, G., Zhou, X., Yu, G.: Bm+-tree: A hyperplane-based index method for high-dimensional metric spaces. In: Lecture Notes in Computer Science: Database Systems for Advanced Applications, vol. 3453, pp. 398–409 (2005)

Acknowledgements

This research was supported by the Basic Research Laboratory program through the National Research Foundation of Korea, funded by the Ministry of Science, ICT and Future Planning (NRF-2015R1A4A1041584). A preliminary version of this paper appeared in [11].

Author information

Authors and Affiliations

Department of Electrical and Computer Engineering, Pusan National University, Busan, South Korea
Sung-Hwan Kim, Da-Young Lee & Hwan-Gue Cho

Corresponding author

Correspondence to Hwan-Gue Cho.

About this article

Cite this article

Kim, SH., Lee, DY. & Cho, HG. An eigenvalue-based pivot selection strategy for efficient indexing and searching in metric spaces. Cluster Comput 20, 3643–3655 (2017). https://doi.org/10.1007/s10586-017-1153-4

Received: 10 July 2016
Revised: 10 May 2017
Accepted: 28 August 2017
Published: 12 September 2017
Issue Date: December 2017
DOI: https://doi.org/10.1007/s10586-017-1153-4
