Abstract
Vantage point-based indexing is a popular technique for implementing range queries in main-memory databases. Vantage points are reference points used to improve the performance of range queries. In the past, vantage points have been derived from the data points in the database using various heuristics. These approaches are therefore data dependent and cannot easily handle dynamic databases (those allowing insertions and deletions). Further, the time needed to derive vantage points with these approaches is very high for larger databases. We propose a data-independent technique for creating vantage points. A constraint of our approach is that the values in each dimension of the feature vectors must be bounded. Extensive experiments with real and synthetic data show that the proposed technique is superior to existing methods.
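To make the general idea concrete, the following is a minimal sketch of vantage point-based filtering for a range query. It is not the technique proposed in the article: the abstract does not say where the vantage points are placed, so the sketch assumes, purely for illustration, that they sit at corners of the bounding box of the feature space. The names `corner_vantage_points`, `build_index`, and `range_query` are hypothetical, and Euclidean distance is an assumption.

```python
import math
from itertools import product


def dist(a, b):
    """Euclidean distance between two equal-length points (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def corner_vantage_points(lower, upper, count):
    """Illustrative assumption: take the first `count` corners of the
    bounding box as vantage points. This depends only on the bounds of
    each dimension, never on the stored data."""
    corners = product(*zip(lower, upper))
    return [tuple(c) for _, c in zip(range(count), corners)]


def build_index(points, vantage_points):
    """Precompute the distance from every data point to every vantage point."""
    return [[dist(p, v) for v in vantage_points] for p in points]


def range_query(points, index, vantage_points, q, r):
    """Return all points within distance r of q.

    By the triangle inequality, |d(q, v) - d(p, v)| <= d(p, q) for any
    vantage point v, so a point whose difference exceeds r for some v
    cannot be in range and is skipped without computing d(p, q)."""
    q_dists = [dist(q, v) for v in vantage_points]
    result = []
    for p, p_dists in zip(points, index):
        if any(abs(qd - pd) > r for qd, pd in zip(q_dists, p_dists)):
            continue  # pruned by a vantage point
        if dist(p, q) <= r:  # verify the surviving candidate
            result.append(p)
    return result


# Example with two-dimensional feature vectors bounded in [0, 1].
points = [(0.1, 0.1), (0.5, 0.5), (0.9, 0.9), (0.52, 0.48)]
vps = corner_vantage_points((0.0, 0.0), (1.0, 1.0), count=2)
index = build_index(points, vps)
hits = range_query(points, index, vps, q=(0.5, 0.5), r=0.05)
```

Because the vantage points in this sketch depend only on the bounds of each dimension, inserting a point requires computing just that point's distances to the vantage points, and deleting a point never invalidates them. This is consistent with the motivation stated above for handling dynamic databases.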
Acknowledgements
This research was partially supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT & Future Planning (Grant No. 2012R1A1A2042552).
Cite this article
Watve, A., Pramanik, S., Jung, S. et al. Data-independent vantage point selection for range queries. J Supercomput 75, 7952–7978 (2019). https://doi.org/10.1007/s11227-018-2384-8
DOI: https://doi.org/10.1007/s11227-018-2384-8