Abstract
Algorithms for answering the k Nearest-Neighbor (k-NN) query are widely used for queries in spatial databases and for distance-based classification, where a group of query points is compared against a reference dataset to derive the dominating feature class. GPU devices have far more processing cores than CPUs, and their device memory is faster than the main memory accessed by CPUs, so they provide higher computing power for demanding queries such as k-NN. However, device memory and/or main memory may not be able to host an entire, rather big, reference dataset. In many practical cases, storing this dataset on a fast secondary device, such as a Solid State Disk (SSD), is a feasible solution. We propose and implement the first GPU-based algorithms for processing the k-NN query on big reference data stored on SSDs. Using big synthetic 3D data, we experimentally compare these algorithms and highlight the most efficient algorithmic variation.
The work of M. Vassilakopoulos and A. Corral was funded by the MINECO research project [TIN2017-83964-R].
Notes
1. We used an SSD, so the rest of the text uses “SSD” instead of “disk”.
2. Data are read from the SSD in large sequences of consecutive pages, which exploits the internal parallelism of SSDs. Our experiments nevertheless showed that reading from the SSD does not contribute significantly to the performance cost of our algorithms.
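As a rough illustration of the access pattern described in note 2, the following Python sketch reads a reference dataset in large sequential chunks and keeps, for each query point, the k nearest reference points seen so far. It is not the authors' implementation: the distance computation runs on the CPU here, whereas the paper performs it on the GPU. The record layout (three little-endian float32 coordinates per point), the function name knn_from_stream, and the parameter points_per_chunk are all assumptions made for this example.

```python
import heapq
import io
import struct

# Assumed on-disk record layout: three little-endian float32 coordinates.
POINT = struct.Struct("<3f")


def knn_from_stream(stream, queries, k, points_per_chunk=4096):
    """For each query, return its k nearest reference points as a list of
    (squared_distance, point) pairs sorted by ascending distance."""
    # One max-heap per query, emulated by storing negated squared distances.
    heaps = [[] for _ in queries]
    chunk_bytes = POINT.size * points_per_chunk
    while True:
        # One large sequential read per partition of the reference data.
        buf = stream.read(chunk_bytes)
        if not buf:
            break
        # Ignore a trailing partial record, if the stream ends with one.
        usable = len(buf) - len(buf) % POINT.size
        for point in POINT.iter_unpack(buf[:usable]):
            for heap, q in zip(heaps, queries):
                d2 = sum((a - b) ** 2 for a, b in zip(point, q))
                if len(heap) < k:
                    heapq.heappush(heap, (-d2, point))
                elif -d2 > heap[0][0]:
                    # Closer than the current k-th neighbor: replace it.
                    heapq.heapreplace(heap, (-d2, point))
    return [sorted((-neg, p) for neg, p in heap) for heap in heaps]
```

Calling knn_from_stream(open(path, "rb"), queries, k) on a file with the assumed layout processes the data one chunk at a time, so only a single chunk and the per-query heaps need to fit in memory. The path here is a placeholder.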
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Velentzas, P., Vassilakopoulos, M., Corral, A. (2021). GPU-Based Algorithms for Processing the k Nearest-Neighbor Query on Disk-Resident Data. In: Attiogbé, C., Ben Yahia, S. (eds) Model and Data Engineering. MEDI 2021. Lecture Notes in Computer Science, vol 12732. Springer, Cham. https://doi.org/10.1007/978-3-030-78428-7_21
DOI: https://doi.org/10.1007/978-3-030-78428-7_21
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-78427-0
Online ISBN: 978-3-030-78428-7
eBook Packages: Computer Science, Computer Science (R0)