Effectiveness of NAQ-tree in handling reverse nearest-neighbor queries in high-dimensional metric space

  • Regular Paper
  • Knowledge and Information Systems

Abstract

Reverse nearest-neighbor (RNN) query processing is important for many applications such as decision-support systems, profile-based marketing and molecular biology; consequently, it has attracted considerable attention in the research community in recent years. Most existing approaches for RNN query processing either rely on nearest-neighbor pre-computation or work only for a specific data space (e.g., the Euclidean space). The only method for RNN query processing in metric space is based on the M-tree. In this paper, we propose an approach for RNN query processing in high-dimensional metric space using a distance-based index structure (in particular, the NAQ-tree, which outperforms the other distance-based index structures, as we have already verified in a previous study). In high-dimensional space, the properties of a distance-based index structure provide stronger pruning rules than the M-tree. In addition, unlike the previous work, our approach integrates the filtering and verification steps and uses the information obtained in the verification stage to further improve the filtering rate. Our approach delivers results incrementally and hence serves real-time applications well. The reported experimental results demonstrate the applicability and effectiveness of the proposed NAQ-tree-based RNN approach.
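
For readers unfamiliar with the query type, the following is a minimal brute-force sketch of a reverse nearest-neighbor query under an arbitrary metric. It is not the NAQ-tree method proposed in the paper: it uses no index, no pruning and no filter/verification split. The function names (`manhattan`, `reverse_nearest_neighbors`), the choice of the Manhattan distance and the sample points are illustrative assumptions. The sketch only encodes the definition: an object p belongs to the RNN set of a query q if no other object is closer to p than q is.

```python
from typing import Callable, List, Sequence

Point = Sequence[float]
Metric = Callable[[Point, Point], float]


def manhattan(a: Point, b: Point) -> float:
    """L1 distance; any function satisfying the metric axioms would do."""
    return sum(abs(x - y) for x, y in zip(a, b))


def reverse_nearest_neighbors(
    query: Point, data: List[Point], dist: Metric
) -> List[Point]:
    """Return every p in data whose nearest neighbor is the query.

    Brute force: O(n^2) distance computations, no index, no pruning.
    Ties are resolved in favor of the query (assumption of this sketch).
    """
    result = []
    for i, p in enumerate(data):
        d_query = dist(p, query)
        # p qualifies if no other data object is strictly closer to p
        # than the query is.
        if all(
            dist(p, other) >= d_query
            for j, other in enumerate(data)
            if j != i
        ):
            result.append(p)
    return result


# Hypothetical sample data: two small clusters in the plane.
points = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0), (6.0, 5.0)]
rnn = reverse_nearest_neighbors((2.0, 0.0), points, manhattan)
```

Because the function only calls `dist`, any metric can be substituted without touching the query logic, which is the sense in which the problem is posed in a general metric space. An index-based method aims to produce the same result set while avoiding most of the pairwise distance computations.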

Author information

Correspondence to Reda Alhajj.

About this article

Cite this article

Zhang, M., Alhajj, R. Effectiveness of NAQ-tree in handling reverse nearest-neighbor queries in high-dimensional metric space. Knowl Inf Syst 31, 307–343 (2012). https://doi.org/10.1007/s10115-011-0405-5

  • DOI: https://doi.org/10.1007/s10115-011-0405-5
