Efficient processing of probabilistic reverse nearest neighbor queries over uncertain data

  • Regular Paper
  • The VLDB Journal

Abstract

Reverse nearest neighbor (RNN) search is crucial in many real applications. Given a database and a query object, an RNN query retrieves all the data objects in the database that have the query object as their nearest neighbor. Often, due to limitations of measurement devices, environmental disturbance, or characteristics of applications (for example, monitoring moving objects), data obtained from the real world are uncertain (imprecise). Therefore, previous approaches proposed for answering an RNN query over exact (precise) databases cannot be directly applied to the uncertain scenario. In this paper, we redefine the RNN query in the context of uncertain databases as the probabilistic reverse nearest neighbor (PRNN) query, which retrieves the data objects whose probabilities of being RNNs are greater than or equal to a user-specified threshold. Since answering a PRNN query directly requires accessing all the objects in the database, which is quite costly, we also propose an effective pruning method, called geometric pruning (GP), that significantly reduces the PRNN search space without introducing any false dismissals. Furthermore, we present an efficient PRNN query procedure that seamlessly integrates our pruning method. Extensive experiments demonstrate the efficiency and effectiveness of our proposed GP-based PRNN query processing approach under various experimental settings.
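
To make the two query definitions in the abstract concrete, the following Python sketch shows a brute-force RNN query over exact points and a naive sampling-based estimate of PRNN probabilities. It is not the geometric pruning (GP) method of the paper; it is the kind of exhaustive baseline that pruning is meant to avoid. Several choices here are assumptions made only for illustration: each uncertain object is represented by a finite list of equally likely sample positions, objects are independent of each other, distance ties count in favor of the query object, and the names `rnn_exact` and `prnn_monte_carlo` are hypothetical.

```python
import math
import random


def rnn_exact(points, q):
    """Brute-force RNN over exact points.

    Returns the indices i for which no other point in `points` is strictly
    closer to points[i] than the query q is (ties favor q; an assumption).
    """
    result = []
    for i, p in enumerate(points):
        d_q = math.dist(p, q)
        others = (o for j, o in enumerate(points) if j != i)
        if all(math.dist(p, o) >= d_q for o in others):
            result.append(i)
    return result


def prnn_monte_carlo(objects, q, alpha, trials=2000, seed=0):
    """Naive sampling estimate of a PRNN query (illustrative baseline only).

    objects: list of uncertain objects, each a list of equally likely sample
             positions (an assumed uncertainty model, not the paper's).
    alpha:   user-specified probability threshold.
    Returns (index, estimated probability) pairs with probability >= alpha.
    """
    rng = random.Random(seed)
    hits = [0] * len(objects)
    for _ in range(trials):
        # Draw one possible world: one position per object, independently.
        world = [rng.choice(samples) for samples in objects]
        for i in rnn_exact(world, q):
            hits[i] += 1
    probs = [h / trials for h in hits]
    return [(i, p) for i, p in enumerate(probs) if p >= alpha]


if __name__ == "__main__":
    q = (0.0, 0.0)
    print(rnn_exact([(1, 0), (5, 0), (6, 0)], q))  # -> [0]
    uncertain = [[(1, 0), (10, 0)], [(9, 0)]]
    print(prnn_monte_carlo(uncertain, q, alpha=0.3))
```

Every trial touches every object and compares all pairs, so the cost grows quickly with the database size. This is the expense the abstract refers to when it motivates a pruning step that discards objects without false dismissals.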

Author information

Corresponding author

Correspondence to Lei Chen.

About this article

Cite this article

Lian, X., Chen, L. Efficient processing of probabilistic reverse nearest neighbor queries over uncertain data. The VLDB Journal 18, 787–808 (2009). https://doi.org/10.1007/s00778-008-0123-0

  • DOI: https://doi.org/10.1007/s00778-008-0123-0
