Data-independent vantage point selection for range queries

The Journal of Supercomputing

Abstract

Vantage point-based indexing is a popular technique for implementing range queries in main-memory databases. Vantage points are reference points that are used to improve the performance of range queries. In the past, vantage points have been derived from the data points in the database using various heuristics. These approaches are therefore data dependent and cannot easily handle dynamic databases, which allow insertions and deletions. Further, the time needed to derive vantage points with these approaches is very high for larger databases. We propose a data-independent technique for creating vantage points. A constraint of our approach is that the values in each dimension of the feature vectors must be bounded. Extensive experiments with real and synthetic data show that the proposed technique is superior to existing methods.
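To make the idea concrete, the following is a minimal Python sketch of how vantage points can prune a range query through the triangle inequality. It is not the technique proposed in the article: the function names, the Euclidean metric, and the choice of bounding-box corners as data-independent vantage points are all assumptions made for illustration. The corner choice relies only on the bounds of each dimension, which is the constraint the abstract mentions.

```python
import math
from itertools import islice, product


def euclidean(a, b):
    """Euclidean distance between two equal-length tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def corner_vantage_points(lower, upper, count):
    """Assumed, illustrative choice: take the first `count` corners of the
    bounding box. No data points are consulted, only the per-dimension bounds."""
    corners = product(*zip(lower, upper))
    return [tuple(c) for c in islice(corners, count)]


def build_index(points, vantage_points):
    """Precompute the distance from every data point to every vantage point."""
    return [[euclidean(p, v) for v in vantage_points] for p in points]


def range_query(points, index, vantage_points, query, radius):
    """Return (matches, checked), where `checked` counts the points that
    survived pruning and needed an exact distance computation."""
    q_dists = [euclidean(query, v) for v in vantage_points]
    matches, checked = [], 0
    for p, p_dists in zip(points, index):
        # Triangle inequality: |d(q, v) - d(p, v)| <= d(q, p).
        # If the left side exceeds the radius, p cannot be within range.
        if any(abs(qd - pd) > radius for qd, pd in zip(q_dists, p_dists)):
            continue
        checked += 1
        if euclidean(query, p) <= radius:
            matches.append(p)
    return matches, checked


points = [(0.1, 0.1), (0.9, 0.9), (0.5, 0.5), (0.15, 0.12)]
vps = corner_vantage_points((0.0, 0.0), (1.0, 1.0), count=2)
index = build_index(points, vps)
matches, checked = range_query(points, index, vps, query=(0.1, 0.1), radius=0.1)
```

Because the vantage points in this sketch depend only on the bounds, inserting or deleting a point changes one row of the precomputed index and never forces the vantage points themselves to be recomputed.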

Acknowledgements

This research was partially supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT & Future Planning (Grant No. 2012R1A1A2042552).

Author information

Corresponding author

Correspondence to Sungwon Jung.

About this article

Cite this article

Watve, A., Pramanik, S., Jung, S. et al. Data-independent vantage point selection for range queries. J Supercomput 75, 7952–7978 (2019). https://doi.org/10.1007/s11227-018-2384-8

  • DOI: https://doi.org/10.1007/s11227-018-2384-8
