HGraph: A Connected-Partition Approach to Proximity Graphs for Similarity Search

Shimomura, Larissa Capobianco; Kaster, Daniel S.

doi:10.1007/978-3-030-27615-7_8

Larissa Capobianco Shimomura^14,15 &
Daniel S. Kaster¹⁵

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 11706))

Included in the following conference series:

International Conference on Database and Expert Systems Applications

1762 Accesses

Abstract

Similarity search is a common approach to support new applications that deal with complex data (e.g., images, videos, georeferenced data, etc.). As a consequence, appropriate indexing structures to support this task have been proposed in the literature. Recently, graph-based methods have shown to be very efficient for approximate similarity search. However, some of the main types of graphs used still suffer from two main drawbacks: (i) slow construction, and (ii) inaccurate retrieval. To reduce these drawbacks, in this paper, we propose the HGraph method. HGraph is a divide-and-conquer method for building graphs for similarity search that recursively partitions the input dataset and connect vertices across partitions at different levels. The method can be used with different types of graphs proposed in the literature to speed up the graph construction time as well as to increase the approximate search results quality through long-range edges connecting pivots of different partitions. We present experimental results using real datasets that show that HGraph applied to the k-NNG graph was able to decrease the construction time while increasing the approximate search recall when compared to the k-NNG. Regarding the application of HGraph to the NSW graph, the query recall also increased, however with a higher computational cost. An analysis of different combinations of the tested methods demonstrated HGraph query times given a recall rate were always among the top results regarding different setups.

This work has been supported by the Brazilian agencies CAPES and CNPq.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Subscribe and save

Springer+ Basic

$34.99 /Month

Get 10 units per month
Download Article/Chapter or eBook
1 Unit = 1 Article or 1 Chapter
Cancel anytime

Buy Now

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 39.99; Price excludes VAT (USA)

Softcover Book: USD 54.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Fast Graph Similarity Search via Locality Sensitive Hashing

Performance Analysis of Graph-Based Methods for Exact and Approximate Similarity Search in Metric Spaces

Fast graph similarity search via hashing and its application on image retrieval

Article 14 September 2017

References

Amato, G., Esuli, A., Falchi, F.: A comparison of pivot selection techniques for permutation-based indexing. Inf. Syst. 52(C), 176–188 (2015)
Article Google Scholar
Barioni, M.C.N., Kaster, D.D.S., Razente, H.L., Traina, A.J., Júnior, C.T.: Advanced Database Query Systems. IGI Global (2011)
Google Scholar
Boytsov, L., Naidan, B.: Engineering efficient and effective non-metric space library. In: Brisaboa, N., Pedreira, O., Zezula, P. (eds.) SISAP 2013. LNCS, vol. 8199, pp. 280–293. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-41062-8_28
Chapter Google Scholar
Bustos, B., Navarro, G., Chavez, E.: Pivot selection techniques for proximity searching in metric spaces. In: SCCC, pp. 33–40, November 2001
Google Scholar
Chávez, E., Navarro, G., Baeza-Yates, R., Marroquín, J.L.: Searching in metric spaces. ACM Comput. Surv. 33(3), 273–321 (2001)
Article Google Scholar
Chen, J., Fangand, H.R., Saad, Y.: Fast approximate KNN graph construction for high dimensional data via recursive Lanczos bisection. J. Mach. Learn. Res. 10, 1989–2012 (2009)
MathSciNet MATH Google Scholar
Hajebi, K., Abbasi-Yadkori, Y., Shahbazi, H., Zhang, H.: Fast approximate nearest-neighbor search with k-nearest neighbor graph. In: IJCAI, pp. 1312–1317 (2011)
Google Scholar
Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proc. IEEE 86(11), 2278–2324 (1998)
Article Google Scholar
Malkov, Y., et al.: Approximate nearest neighbor algorithm based on navigable small world graphs. Inf. Syst. 45, 61–68 (2014)
Article Google Scholar
Ocsa, A., Bedregal, C., Cuadros-Vargas, E.: A new approach for similarity queries using neighborhood graphs. In: Brazilian Symposium on Databases, pp. 131–142 (2007)
Google Scholar
Ortega, M., Rui, Y., Chakrabarti, K., Porkaew, K., Mehrotra, S., Huang, T.S.: Supporting ranked boolean similarity queries in MARS. TKDE 10(6), 905–925 (1998)
Google Scholar
Paredes, R., Chávez, E.: Using the k-nearest neighbor graph for proximity searching in metric spaces. In: Consens, M., Navarro, G. (eds.) SPIRE 2005. LNCS, vol. 3772, pp. 127–138. Springer, Heidelberg (2005). https://doi.org/10.1007/11575832_14
Chapter Google Scholar
Park, H.S., Jun, C.H.: A simple and fast algorithm for k-medoids clustering. Expert Syst. Appl. 36(2, Part 2), 3336–3341 (2009)
Article Google Scholar
Shimomura, L.C., Vieira, M.R., Kaster, D.S.: Performance analysis of graph-based methods for exact and approximate similarity search in metric spaces. In: Marchand-Maillet, S., Silva, Y.N., Chávez, E. (eds.) SISAP 2018. LNCS, vol. 11223, pp. 18–32. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-02224-2_2
Chapter Google Scholar
Traina Jr., C., Filho, R.F., Traina, A.J., Vieira, M.R., Faloutsos, C.: The Omni-family of all-purpose access methods: a simple and effective way to make similarity search more efficient. VLDB J. 16(4), 483–505 (2007)
Article Google Scholar
Uhlmann, J.K.: Satisfying general proximity/similarity queries with metric trees. Inf. Process. Lett. 40(4), 175–179 (1991)
Article Google Scholar

Download references

Author information

Authors and Affiliations

Eindhoven University of Technology, Eindhoven, The Netherlands
Larissa Capobianco Shimomura
University of Londrina, Londrina, PR, Brazil
Larissa Capobianco Shimomura & Daniel S. Kaster

Authors

Larissa Capobianco Shimomura
View author publications
You can also search for this author in PubMed Google Scholar
Daniel S. Kaster
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Larissa Capobianco Shimomura .

Editor information

Editors and Affiliations

Clausthal University of Technology, Clausthal-Zellerfeld, Germany
Sven Hartmann
Johannes Kepler University of Linz, Linz, Austria
Josef Küng
The University of Texas at Arlington, Arlington, TX, USA
Sharma Chakravarthy
Johannes Kepler University of Linz, Linz, Austria
Gabriele Anderst-Kotsis
Software Competence Center Hagenberg, Hagenberg im Mühlkreis, Austria
A Min Tjoa
Johannes Kepler University of Linz, Linz, Austria
Ismail Khalil

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Shimomura, L.C., Kaster, D.S. (2019). HGraph: A Connected-Partition Approach to Proximity Graphs for Similarity Search. In: Hartmann, S., Küng, J., Chakravarthy, S., Anderst-Kotsis, G., Tjoa, A., Khalil, I. (eds) Database and Expert Systems Applications. DEXA 2019. Lecture Notes in Computer Science(), vol 11706. Springer, Cham. https://doi.org/10.1007/978-3-030-27615-7_8

Download citation

DOI: https://doi.org/10.1007/978-3-030-27615-7_8
Published: 03 August 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-27614-0
Online ISBN: 978-3-030-27615-7
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics