HGraph: A Connected-Partition Approach to Proximity Graphs for Similarity Search

  • Conference paper
Database and Expert Systems Applications (DEXA 2019)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11706)

Abstract

Similarity search is a common approach to supporting applications that deal with complex data (e.g., images, videos, and georeferenced data). Consequently, a number of indexing structures to support this task have been proposed in the literature. Recently, graph-based methods have been shown to be very efficient for approximate similarity search. However, some of the main types of graphs in use still suffer from two drawbacks: (i) slow construction and (ii) inaccurate retrieval. To reduce these drawbacks, we propose the HGraph method. HGraph is a divide-and-conquer method for building graphs for similarity search that recursively partitions the input dataset and connects vertices across partitions at different levels. The method can be used with different types of graphs proposed in the literature to reduce graph construction time and to improve the quality of approximate search results through long-range edges connecting pivots of different partitions. We present experimental results on real datasets showing that HGraph applied to the k-NNG decreased construction time while increasing approximate search recall compared with the plain k-NNG. When HGraph was applied to the NSW graph, query recall also increased, although at a higher computational cost. An analysis of different combinations of the tested methods showed that, for a given recall rate, HGraph query times were always among the best results across the tested setups.
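
The abstract only outlines the idea, so the following is a minimal Python sketch of a recursive partition-and-connect construction, not the authors' implementation. Everything beyond "recursively partition, build a base graph, link pivots across partitions" is an assumption made for illustration: pivots are picked uniformly at random, each vertex joins the partition of its nearest pivot, the base graph is a brute-force k-NNG over Euclidean points, and the names `knn_edges`, `build_partitioned_graph`, `leaf_size`, and `num_pivots` are hypothetical.

```python
import math
import random


def knn_edges(points, ids, k):
    """Brute-force directed k-NN edges among the vertices listed in `ids`."""
    edges = {}
    for i in ids:
        others = sorted(
            (j for j in ids if j != i),
            key=lambda j: math.dist(points[i], points[j]),
        )
        edges[i] = set(others[:k])
    return edges


def _build(points, ids, k, leaf_size, num_pivots, rng, graph):
    if len(ids) <= leaf_size:
        # Small enough: build the base graph (here a k-NNG) directly.
        for v, nbrs in knn_edges(points, ids, k).items():
            graph[v] |= nbrs
        return

    # Assumption: pivots are drawn uniformly at random.
    pivots = rng.sample(ids, num_pivots)

    # Long-range edges: connect the pivots of sibling partitions.
    for p in pivots:
        graph[p].update(q for q in pivots if q != p)

    # Assumption: each vertex joins the partition of its nearest pivot.
    parts = {p: [] for p in pivots}
    for i in ids:
        nearest = min(pivots, key=lambda p: math.dist(points[i], points[p]))
        parts[nearest].append(i)

    for part in parts.values():
        if len(part) == len(ids):
            # Degenerate split (e.g. duplicate points): stop recursing.
            for v, nbrs in knn_edges(points, part, k).items():
                graph[v] |= nbrs
        else:
            _build(points, part, k, leaf_size, num_pivots, rng, graph)


def build_partitioned_graph(points, k=3, leaf_size=8, num_pivots=3, seed=0):
    """Return a directed adjacency map {vertex id: set of neighbor ids}."""
    if not 2 <= num_pivots <= leaf_size:
        raise ValueError("need 2 <= num_pivots <= leaf_size")
    graph = {i: set() for i in range(len(points))}
    _build(points, list(range(len(points))), k, leaf_size, num_pivots,
           random.Random(seed), graph)
    return graph
```

In this sketch the brute-force k-NN step only ever runs on small leaf partitions, which is one plausible reading of how partitioning could cut construction time, while the pivot-to-pivot edges give a search a way to move between partitions.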

This work has been supported by the Brazilian agencies CAPES and CNPq.

Author information

Corresponding author

Correspondence to Larissa Capobianco Shimomura.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Shimomura, L.C., Kaster, D.S. (2019). HGraph: A Connected-Partition Approach to Proximity Graphs for Similarity Search. In: Hartmann, S., Küng, J., Chakravarthy, S., Anderst-Kotsis, G., Tjoa, A., Khalil, I. (eds) Database and Expert Systems Applications. DEXA 2019. Lecture Notes in Computer Science, vol 11706. Springer, Cham. https://doi.org/10.1007/978-3-030-27615-7_8

  • DOI: https://doi.org/10.1007/978-3-030-27615-7_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-27614-0

  • Online ISBN: 978-3-030-27615-7

  • eBook Packages: Computer Science, Computer Science (R0)
