Reverse k nearest neighbors queries and spatial reverse top-k queries

  • Regular Paper
The VLDB Journal

Abstract

Given a set of facilities and a set of users, a reverse k nearest neighbors (RkNN) query q returns every user for which the query facility is one of the k closest facilities. Almost all existing techniques for answering RkNN queries adopt a pruning-and-verification framework. Regions-based pruning and half-space pruning are the two most notable pruning strategies. The half-space-based approach prunes a larger area and is generally believed to be superior. Influenced by this perception, almost all existing RkNN algorithms use and improve the half-space pruning strategy. We examine the weaknesses and strengths of both strategies and find that regions-based pruning has certain strengths that have not been exploited in the past. Motivated by this, we present a new regions-based pruning algorithm called Slice that exploits the strengths of regions-based pruning and overcomes its limitations. We also study spatial reverse top-k (SRTk) queries, which return every user u for which the query facility is one of the top-k facilities according to a given linear scoring function. We first extend half-space-based pruning to answer SRTk queries. Then, we propose a novel regions-based pruning algorithm following the Slice framework to solve the problem. Our extensive experimental study on synthetic and real data sets demonstrates that Slice is significantly more efficient than all existing RkNN and SRTk algorithms.
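
To make the RkNN definition concrete, the following is a minimal brute-force sketch in Python. It is not the Slice algorithm and performs no pruning; it only restates the definition as code. The function name, the use of Euclidean distance on 2D tuples, and the tie rule (a facility at exactly the same distance as the query does not count against it) are assumptions made for illustration.

```python
from math import dist  # Euclidean distance, Python 3.8+


def rknn_brute_force(query, facilities, users, k):
    """Return the users for which `query` is one of the k closest facilities.

    `facilities` holds the competing facilities and must not contain `query`.
    Assumption: ties in distance are resolved in favour of the query facility.
    """
    result = []
    for u in users:
        d_q = dist(u, query)
        # Count the facilities that are strictly closer to u than the query is.
        closer = sum(1 for f in facilities if dist(u, f) < d_q)
        if closer < k:  # fewer than k closer facilities: query is in u's top k
            result.append(u)
    return result


# Hypothetical toy data: one query facility, two competitors, three users.
query = (0.0, 0.0)
facilities = [(4.0, 0.0), (0.0, 5.0)]
users = [(1.0, 0.0), (3.5, 0.0), (0.0, 3.0)]
print(rknn_brute_force(query, facilities, users, k=1))
```

This baseline compares every user with every facility, which is the cost that a pruning-and-verification framework aims to avoid.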

Notes

  1. Although partitions of different sizes can be used, in this paper we use equally sized partitions so that the partition containing a point p can be identified in O(1) time.

  2. Lemma 20 in the “Appendix” section shows that \(dist(u,f)\) can never be equal to \(dist(f,p) + dist(u,p)\), even when f lies on the ray \(L^\theta \).

  3. Observation 1 in [5] can be summarized as follows. For a circle C centered at q, let x be the point such that \(dist(f,x)= mindist(f,C)\), where \(mindist(f,C)\) is the minimum distance between f and any point of the circle C. Let u be a user located on the perimeter of this circle. Then, \(dist(f,u)\) monotonically increases as u moves along the circle in the clockwise or counterclockwise direction away from x. Since f lies outside the partition, x also lies outside the partition, which implies that \(dist(u,f)\) is minimum when u is at one of the end points of the arc.
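
The constant-time lookup mentioned in Note 1 can be sketched as follows. This assumes that the partitions are equal-angle wedges around the query point q, which the notes do not state explicitly; the function name `partition_index` and the parameter `t` (number of partitions) are hypothetical.

```python
from math import atan2, pi


def partition_index(q, p, t):
    """Index in [0, t) of the equal-angle partition around q that contains p.

    Assumption: partition i covers angles [i * 2*pi/t, (i + 1) * 2*pi/t),
    measured counterclockwise from the positive x-axis through q.
    """
    # atan2 returns a value in (-pi, pi]; the modulo maps it to [0, 2*pi).
    angle = atan2(p[1] - q[1], p[0] - q[0]) % (2 * pi)
    # One division and one truncation, so the cost is independent of t.
    # The min() guards against floating-point rounding right at 2*pi.
    return min(int(angle * t / (2 * pi)), t - 1)


# With 12 partitions of 30 degrees each, a point at 45 degrees is in partition 1.
print(partition_index((0.0, 0.0), (1.0, 1.0), 12))  # 1
```

With partitions of unequal size, the same lookup would need a search over the partition boundaries, for example a binary search, rather than a single division.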

References

  1. Achtert, E., Kriegel, H.-P., Kröger, P., Renz, M., Züfle, A.: Reverse k-nearest neighbor search in dynamic and general metric databases. In: EDBT, pp. 886–897 (2009)

  2. Benetis, R., Jensen, C.S., Karciauskas, G., Saltenis, S.: Nearest neighbor and reverse nearest neighbor queries for moving objects. In: IDEAS, pp. 44–53 (2002)

  3. Bernecker, T., Emrich, T., Kriegel, H.-P., Mamoulis, N., Renz, M., Züfle, A.: A novel probabilistic pruning approach to speed up similarity queries in uncertain databases. In: ICDE, pp. 339–350 (2011)

  4. Bernecker, T., Emrich, T., Kriegel, H.-P., Renz, M., Zankl, S., Züfle, A.: Efficient probabilistic reverse nearest neighbor query processing on uncertain data. Proc. VLDB Endow. 4(10), 669–680 (2011)

  5. Cheema, M.A., Lin, X., Zhang, Y., Wang, W., Zhang, W.: Lazy updates: an efficient technique to continuously monitoring reverse knn. Proc. VLDB Endow. 2(1), 1138–1149 (2009)

  6. Cheema, M.A., Brankovic, L., Lin, X., Zhang, W., Wang, W.: Multi-guarded safe zone: an effective technique to monitor moving circular range queries. In: ICDE, pp. 189–200 (2010)

  7. Cheema, M.A., Lin, X., Wang, W., Zhang, W., Pei, J.: Probabilistic reverse nearest neighbor queries on uncertain data. IEEE Trans. Knowl. Data Eng. 22(4), 550–564 (2010)

  8. Cheema, M.A., Lin, X., Zhang, W., Zhang, Y.: Influence zone: efficiently processing reverse k nearest neighbors queries. In: ICDE, pp. 577–588 (2011)

  9. Cheema, M.A., Zhang, W., Lin, X., Zhang, Y.: Efficiently processing snapshot and continuous reverse k nearest neighbors queries. VLDB J. 21(5), 703–728 (2012)

  10. Cheema, M.A., Zhang, W., Lin, X., Zhang, Y., Li, X.: Continuous reverse k nearest neighbors queries in euclidean space and in spatial networks. VLDB J. 21, 69–95 (2012)

  11. Cheema, M.A., Shen, Z., Lin, X., Zhang, W.: A unified framework for efficiently processing ranking related queries. In: Proceedings of the 17th International Conference on Extending Database Technology (EDBT), pp. 427–438, Athens, 24–28 Mar (2014)

  12. Chester, S., Thomo, A., Venkatesh, S., Whitesides, S.: Indexing reverse top-k queries in two dimensions. In: Meng, W., Feng, L., Bressan, S., Winiwarter, W., Song, W. (eds.) Database Systems for Advanced Applications, pp. 201–208. Springer, Berlin (2013)

  13. Emrich, T., Kriegel, H.-P., Kröger, P., Renz, M., Züfle, A.: Incremental reverse nearest neighbor ranking in vector spaces. In: SSTD (2009)

  14. Gkorgkas, O., Vlachou, A., Doulkeridis, C., Nørvåg, K.: Discovering influential data objects over time. In: Nascimento, M.A., Sellis, T., Cheng, R., Sander, J., Zheng, Y., Kriegel, H.-P., Renz, M., Sengstock, C. (eds.) Advances in Spatial and Temporal Databases, pp. 110–127. Springer, Berlin (2013)

  15. https://international.ipums.org/international/

  16. https://www.census.gov/geo/maps-data/data/tiger-line.html

  17. Jin, C., Zhang, R., Kang, Q., Zhang, Z., Zhou, A.: Probabilistic reverse top-k queries. In: Bhowmick, S.S., Dyreson, C.E., Jensen, C.S., Lee, M.L., Muliantara, A., Thalheim, B. (eds.) Database Systems for Advanced Applications, pp. 406–419. Springer, Switzerland (2014)

  18. Kang, J.M., Mokbel, M.F., Shekhar, S., Xia, T., Zhang, D.: Continuous evaluation of monochromatic and bichromatic reverse nearest neighbors. In: ICDE, pp. 806–815 (2007)

  19. Korn, F., Muthukrishnan, S.: Influence sets based on reverse nearest neighbor queries. In: SIGMOD, pp. 201–212 (2000)

  20. Kriegel, H.-P., Kröger, P., Renz, M., Züfle, A., Katzdobler, A.: Incremental reverse nearest neighbor ranking. In: ICDE, pp. 1560–1567 (2009)

  21. Lian, X., Chen, L.: Efficient processing of probabilistic reverse nearest neighbor queries over uncertain data. VLDB J. 18(3), 787–808 (2009)

  22. Lin, K.-I., Nolen, M., Yang, C.: Applying bulk insertion techniques for dynamic reverse nearest neighbor problems. In: IDEAS, pp. 290–297 (2003)

  23. Park, J.-H., Chung, C.-W., Kang, U.: Reverse nearest neighbor search with a non-spatial aspect. Inf. Syst. 54, 92–112 (2015)

  24. Preparata, F.P., Shamos, M.I.: Computational Geometry: An Introduction. Springer, Berlin (1985)

  25. Roussopoulos, N., Kelley, S., Vincent, F.: Nearest neighbor queries. In: Proceedings of the 1995 ACM SIGMOD International Conference on Management of Data, pp. 71–79, San Jose, 22–25 May (1995)

  26. Safar, M., Ebrahimi, D., Taniar, D.: Voronoi-based reverse nearest neighbor query processing on spatial networks. Multimed. Syst. 15(5), 295–308 (2009)

  27. Sharifzadeh, M., Shahabi, C.: Vor-tree: R-trees with voronoi diagrams for efficient processing of spatial nearest neighbor queries. Proc. VLDB Endow. 3(1), 1231–1242 (2010)

  28. Stanoi, I., Agrawal, D., Abbadi, A.E.: Reverse nearest neighbor queries for dynamic databases. In: ACM SIGMOD Workshop, pp. 44–53 (2000)

  29. Stanoi, I., Riedewald, M., Agrawal, D., Abbadi, A.E.: Discovery of influence sets in frequently updated databases. In: PVLDB, pp. 99–108 (2001)

  30. Tao, Y., Papadias, D., Lian, X.: Reverse knn search in arbitrary dimensionality. In: VLDB, pp. 744–755 (2004)

  31. Tao, Y., Yiu, M.L., Mamoulis, N.: Reverse nearest neighbor search in metric spaces. IEEE Trans. Knowl. Data Eng. 18(9), 1239–1252 (2006)

  32. Tsirogiannis, D., Harizopoulos, S., Shah, M.A., Wiener, J.L., Graefe, G.: Query processing techniques for solid state drives. In: SIGMOD, pp. 59–72 (2009)

  33. Vlachou, A., Doulkeridis, C., Kotidis, Y., Nørvåg, K.: Reverse top-k queries. In: ICDE, pp. 365–376 (2010)

  34. Vlachou, A., Doulkeridis, C., Kotidis, Y., Nørvåg, K.: Reverse top-k queries. In: 2010 IEEE 26th International Conference on Data Engineering (ICDE), pp. 365–376. IEEE (2010)

  35. Vlachou, A., Doulkeridis, C., Kotidis, Y., Nørvåg, K.: Monochromatic and bichromatic reverse top-k queries. IEEE Trans. Knowl. Data Eng. 23(8), 1215–1229 (2011)

  36. Vlachou, A., Doulkeridis, C., Nørvåg, K.: Monitoring reverse top-k queries over mobile devices. In: Proceedings of the 10th ACM International Workshop on Data Engineering for Wireless and Mobile Access, pp. 17–24. ACM (2011)

  37. Vlachou, A., Doulkeridis, C., Nørvåg, K., Kotidis, Y.: Branch-and-bound algorithm for reverse top-k queries. In: Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data, pp. 481–492. ACM (2013)

  38. Wu, W., Yang, F., Chan, C.Y., Tan, K.-L.: Continuous reverse k-nearest-neighbor monitoring. In: MDM, pp. 132–139 (2008)

  39. Wu, W., Yang, F., Chan, C.Y., Tan, K.-L.: Finch: evaluating reverse k-nearest-neighbor queries on location data. Proc. VLDB Endow. 1(1), 1056–1067 (2008)

  40. Xia, T., Zhang, D.: Continuous reverse nearest neighbor monitoring. In: ICDE, pp. 77–86 (2006)

  41. www.cs.fsu.edu/%7Elifeifei/SpatialDataset.htm

  42. Yang, C., Lin, K.-I.: An index structure for efficient reverse nearest neighbor queries. In: ICDE, pp. 485–492 (2001)

  43. Yang, S., Cheema, M.A., Lin, X., Zhang, Y.: SLICE: reviving regions-based pruning for reverse k nearest neighbors queries. In: ICDE, pp. 760–771 (2014)

  44. Yang, S., Cheema, M.A., Lin, X., Wang, W.: Reverse k nearest neighbors query processing: experiments and analysis. Proc. VLDB Endow. 8(5), 605–616 (2015)

  45. Yiu, M.L., Mamoulis, N.: Reverse nearest neighbors search in ad hoc subspaces. IEEE Trans. Knowl. Data Eng. 19(3), 412–426 (2007)

  46. Yiu, M.L., Papadias, D., Mamoulis, N., Tao, Y.: Reverse nearest neighbors in large graphs. IEEE Trans. Knowl. Data Eng. 18(4), 540–553 (2006)

  47. Yu, A.W., Mamoulis, N., Su, H.: Reverse top-k search using random walk with restart. Proc. VLDB Endow. 7(5), 401–412 (2014)

Acknowledgments

Muhammad Aamir Cheema is supported by ARC DE130101002 and DP130103405. Xuemin Lin is supported by NSFC61232006, NSFC61021004, ARC DP120104168 and DP110102937. The research of Ying Zhang is supported by ARC DP130103245 and DP110104880. Wenjie Zhang is supported by ARC DP150103071 and DP150102728.

Author information

Corresponding author

Correspondence to Muhammad Aamir Cheema.

Appendix

Lemma 20

Consider a facility f, a ray \(L^\theta \), a critical point p on \(L^\theta \), and a user u that lies on \(L^\theta \) with \(dist(u,q) > d^\theta \). Then, \(dist(u,f)\ne dist(f,p) + dist(p,u)\).

Proof

Note that \(dist(u,f)\le dist(f,p) + dist(p,u)\) by the triangle inequality. Since p and u lie on \(L^\theta \), \(dist(u,f) = dist(f,p) + dist(p,u)\) if and only if f also lies on \(L^\theta \) (i.e., \(\theta =0^\circ \)) and p lies between u and f (see Fig. 30). We prove by contradiction that p cannot lie between u and f.

Assume that p lies between u and f as shown in Fig. 30. Since \(\theta =0^\circ \), \(dist(p,q)=\frac{dist(q,f)^2 - \Delta ^2}{2(\Delta + dist(q,f)\cos {0})}=\frac{dist(q,f) - \Delta }{2}\). As \(dist(q,f) > |\Delta |\), we have \(dist(p,q) < \frac{dist(q,f) + dist(q,f)}{2}\). In other words, \(dist(p,q) < dist(q,f)\), so f is farther from q than p is. Furthermore, \(dist(u,q) > d^\theta \) (i.e., \(dist(u,q) > dist(p,q)\)), so u is also farther from q than p is. Both f and u therefore lie beyond p on \(L^\theta \), which implies that p cannot be between f and u. This contradicts the assumption. \(\square \)
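
The simplification used in the proof can be written out step by step. The shorthand \(a = dist(q,f)\) is introduced here only for brevity and is not notation from the paper. The cancellation is valid because \(a > |\Delta |\) guarantees \(a + \Delta \ne 0\).

```latex
\begin{align*}
dist(p,q) &= \frac{a^2 - \Delta^2}{2(\Delta + a\cos 0)}
           = \frac{(a - \Delta)(a + \Delta)}{2(a + \Delta)}
           = \frac{a - \Delta}{2}, \\
\frac{a - \Delta}{2} &\le \frac{a + |\Delta|}{2} < \frac{a + a}{2} = a .
\end{align*}
```

The second line is the bound \(dist(p,q) < dist(q,f)\) that the proof relies on.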

Fig. 30 Lemma 20

Cite this article

Yang, S., Cheema, M.A., Lin, X. et al. Reverse k nearest neighbors queries and spatial reverse top-k queries. The VLDB Journal 26, 151–176 (2017). https://doi.org/10.1007/s00778-016-0445-2

  • DOI: https://doi.org/10.1007/s00778-016-0445-2
