Heuristics for Computing k-Nearest Neighbors Graphs

  • Conference paper
Computer Science – CACIC 2019 (CACIC 2019)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1184)

Abstract

The k-Nearest Neighbors Graph (kNNG) consists of links from each object to its k nearest neighbors. This graph is of interest in diverse applications, ranging from statistics, machine learning, clustering and outlier detection to computational biology and even indexing. Obtaining the kNNG is challenging because intrinsically high-dimensional spaces are known to be unindexable, even in the approximate case. Moreover, the cost of building an index is not well amortized when the only queries are the objects of the database itself. In this paper, we introduce a method to compute the kNNG without building an index. While our approach is sequential, we show experimental evidence that the number of distance computations is a fraction of the \(n^2/2\) used in the naïve solution. We make heavy use of the notion of pivots, that is, database objects whose distances to all other database objects are known. From a group of pivots, it is possible to infer upper bounds on the distance between other objects.
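
To make the two quantities in the abstract concrete, the following is a minimal sketch, not the authors' algorithm. It shows the naïve baseline, which evaluates each unordered pair once (n(n-1)/2 distances, roughly \(n^2/2\)), and the triangle-inequality upper bound that a set of pivots yields, d(u, v) ≤ min over pivots p of d(p, u) + d(p, v). The function names, the Euclidean distance, and the sample points are all assumptions chosen for illustration.

```python
import heapq
import math
from itertools import combinations


def euclidean(a, b):
    """Assumed metric for this sketch; any metric distance would do."""
    return math.dist(a, b)


def naive_knng(points, k, dist=euclidean):
    """Brute-force kNNG: one distance evaluation per unordered pair."""
    n = len(points)
    # heaps[u] stores (-distance, neighbor), so the farthest kept
    # neighbor of u is always at the top of its heap.
    heaps = [[] for _ in range(n)]
    evaluations = 0
    for i, j in combinations(range(n), 2):
        d = dist(points[i], points[j])
        evaluations += 1
        # Symmetry: the same distance updates both endpoints.
        for u, v in ((i, j), (j, i)):
            if len(heaps[u]) < k:
                heapq.heappush(heaps[u], (-d, v))
            elif d < -heaps[u][0][0]:
                heapq.heapreplace(heaps[u], (-d, v))
    # For each object, return (distance, neighbor) pairs, nearest first.
    graph = [sorted((-nd, v) for nd, v in h) for h in heaps]
    return graph, evaluations


def pivot_upper_bound(pivot_rows, u, v):
    """Upper bound on d(u, v) from rows of known pivot distances.

    pivot_rows[p][x] is the distance from pivot p to object x, so the
    triangle inequality gives d(u, v) <= d(p, u) + d(p, v) for every p.
    """
    return min(row[u] + row[v] for row in pivot_rows)


# Hypothetical data: five points in the plane, objects 0 and 3 as pivots.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0)]
graph, evaluations = naive_knng(points, k=2)
pivot_rows = [[euclidean(points[p], q) for q in points] for p in (0, 3)]
bound = pivot_upper_bound(pivot_rows, 1, 2)
```

With five points the baseline performs 10 distance evaluations. A bound such as `bound` costs no new evaluation once the pivot rows exist, which suggests how pivots could help discard candidates, although how the paper actually uses them is not described in the abstract.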

Notes

  1. At http://www.dimacs.rutgers.edu/Challenges/Sixth/software.html.

  2. At http://www.dbs.informatik.uni-muenchen.de/~seidl/DATA/histo112.112682.gz.

Corresponding author

Correspondence to Verónica Ludueña.

Copyright information

© 2020 Springer Nature Switzerland AG

Cite this paper

Chávez, E., Ludueña, V., Reyes, N. (2020). Heuristics for Computing k-Nearest Neighbors Graphs. In: Pesado, P., Arroyo, M. (eds) Computer Science – CACIC 2019. CACIC 2019. Communications in Computer and Information Science, vol 1184. Springer, Cham. https://doi.org/10.1007/978-3-030-48325-8_16

  • DOI: https://doi.org/10.1007/978-3-030-48325-8_16

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-48324-1

  • Online ISBN: 978-3-030-48325-8

  • eBook Packages: Computer Science (R0)
