Abstract
A reverse k-nearest neighbour (RkNN) query determines the objects from a database that have the query as one of their k-nearest neighbors. Processing such a query has received plenty of attention in research. However, the effect of running multiple RkNN queries at once (join) or within a short time interval (bulk/group query) has only received little attention so far. In this paper, we analyze different types of RkNN joins and provide a classification of existing RkNN join algorithms. We discuss possible solutions for solving the non-trivial variants of the problem in vector spaces, including self and mutual pruning strategies. Further, we generalize the developed algorithms to general metric spaces. During an extensive performance analysis we provide evaluation results showing the IO and CPU performance of the compared algorithms for a wide range of different setups and suggest appropriate query algorithms for specific scenarios.
Similar content being viewed by others
Notes
The check, whether the new domination count of \({e_{j}^{S}}\) exceeds k will be performed in Line 21 of Algorithm 1
Each point will have itself in its kNN set, thus we need a k+1NN-join to find the kNNs of each point in S.
References
Bernecker T, Emrich T, Kriegel H-P, Mamoulis N, Renz M, Zhang S, Züfle A (2011) Inverse queries for multidimensional spaces. In: Proc. SSTD, pp 330–347
Jarvis RA, Patrick EA (1973) Clustering using a similarity measure based on shared near neighbors. IEEETC C-22(11):1025–1034
Ankerst M, Breunig MM, Kriegel H-P, Sander J (1999) OPTICS: ordering points to identify the clustering structure. In: Proc. SIGMOD, pp 49–60
Hautamäki V, Kärkkäinen I, Fränti P (2004) Outlier detection using k-nearest neighbor graph. In: Proc. ICPR, pp 430–433
Jin W, Tung AKH, Han J, Wang W (2006) Ranking outliers using symmetric neighborhood relationship. In: Proc. PAKDD, pp 577–593
Ester M, Kriegel H-P, Sander J, Xu X (1996) A density-based algorithm for discovering clusters in large spatial databases with noise. In: Proc. KDD
Korn F, Muthukrishnan S (2000) Influenced sets based on reverse nearest neighbor queries. In: Proc. SIGMOD, pp 201–212
Yang C, Lin K-I (2001) An index structure for efficient reverse nearest neighbor queries. In: Proc. ICDE, pp 485–492
Achtert E, Böhm C, Kröger P, Kunath P, Pryakhin A, Renz M (2006) Efficient reverse k-nearest neighbor search in arbitrary metric spaces. In: Proc. SIGMOD, pp 515–526
Stanoi I, Agrawal D, Abbadi AE (2000) Reverse nearest neighbor queries for dynamic databases. In: Proc. DMKD, pp 44–53
Singh A, Ferhatosmanoglu H, Tosun AS (2003) High dimensional reverse nearest neighbor queries. In: Proc. CIKM, pp 91–98
Tao Y, Papadias D, Lian X (2004) Reverse kNN search in arbitrary dimensionality. In: Proc. VLDB, pp 744–755
Emrich T, Kriegel H-P, Kröger P, Niedermayer J, Renz M, Züfle A (2013) A mutual-pruning approach for rknn join processing. In: Proc. BTW, pp 21–35
Emrich T, Kriegel H-P, Kröger P, Niedermayer J, Renz M, Züfle A (2013) Reverse-k-nearest-neighbor join processing. In: Proc. SSTD, pp 277–294
Wu W, Yang F, Chan C-Y, Tan K (2008) FINCH: evaluating reverse k-nearest-neighbor queries on location data. In: Proc. VLDB, pp 1056–1067
Böhm C, Krebs F (2004) The k-nearest neighbor join: turbo charging the KDD process. KAIS 6(6):728–749
Zhang J, Mamoulis N, Papadias D, Tao Y (2004) All-nearest-neighbors queries in spatial databases. In: Proc. SSDBM, pp 297–306
Yu C, Zhang R, Huang Y, Xiong H (2010) High-dimensional knn joins with incremental updates. Geoinformatica 14(1):55–82
Venkateswaran JG (2007) Indexing techniques for metric databases with costly searches, Ph.D. dissertation, University of Florida, Gainesville, FL, USA, aAI3300799
Tao Y, Yiu ML, Mamoulis N (2006) Reverse nearest neighbor search in metric spaces. IEEE TKDE 18(9):1239–1252
Cheema MA, Lin X, Zhang W, Zhang Y (2011) Influence zone: efficiently processing reverse k nearest neighbors queries. In: ICDE, pp 577–588
Achtert E, Kriegel H-P, Kröger P, Renz M, Züfle A (2009) Reverse k-nearest neighbor search in dynamic and general metric databases. In: Proc. EDBT, pp 886–897
Kriegel H-P, Kröger P, Renz M, Züfle A, Katzdobler A (2009) Reverse k-nearest neighbor search based on aggregate point access methods. In: Proc. SSDBM, pp 444–460
Xia C, Hsu W, Lee ML (2005) ERkNN: efficient reverse k-nearest neighbors retrieval with local kNN-distance estimation. In: Proc. CIKM, pp 533–540
Papadias D, Kalnis P, Zhang J, Tao Y (2001) Efficient olap operations in spatial data warehouses. In: Proc. SSTD, pp 443–459
Kriegel H-P, Kröger P, Renz M, Züfle A, Katzdobler A (2009) Incremental reverse nearest neighbor ranking. In: Proc. ICDE, pp 1560–1567
Emrich T, Kriegel H-P, Kröger P, Renz M, Züfle A (2010) Boosting spatial pruning: on optimal pruning of mbrs. In: Proc. SIGMOD, pp 39–50
Achtert E, Hettab A, Kriegel H-P, Schubert E, Zimek A (2011) Spatial outlier detection: data, algorithms, visualizations. In: Proc. SSTD, pp 512–516
Ciaccia P, Patella M, Zezula P (1997) M-Tree: an efficient access method for similarity search in metric spaces. In: Proc. VLDB, pp 426–435
Emrich T (2013) Coping with distance and location dependencies in spatial, temporal and uncertain data, Ph.D. dissertation, Ludwig-Maximilians University Munich
Emrich T, Graf F, Kriegel H-P, Schubert M, Thoma M (2010) On the impact of flash ssds on spatial indexing. In: Proc. DaMoN, pp 3–8
Author information
Authors and Affiliations
Corresponding author
Rights and permissions
About this article
Cite this article
Emrich, T., Kriegel, HP., Kröger, P. et al. On reverse-k-nearest-neighbor joins. Geoinformatica 19, 299–330 (2015). https://doi.org/10.1007/s10707-014-0215-5
Received:
Revised:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s10707-014-0215-5