Abstract
A top-k spatial-keyword query returns the k best spatio-textual objects, ranked by their proximity to the query location and their relevance to the query keywords. Various index schemes have been proposed for top-k spatial-keyword queries; however, no unified framework covering all of these schemes has been proposed. In this paper, we present a generic model of index schemes for top-k spatial-keyword queries, which we call the G-Index Model. First, the G-Index Model is a unified framework that exhaustively investigates all possible index schemes for top-k spatial-keyword queries. To this end, we conjecture that data clustering is the key element in composing index schemes, and we generate index schemes as combinations of clusterings. The result shows that all existing methods map to schemes generated by the G-Index Model. Using the G-Index Model, we also discover two new methods that have not been reported before. Second, we show that the G-Index Model is generic, i.e., it can generate index schemes for a class of queries integrating multiple arbitrary data types. To demonstrate this, we show that the G-Index Model can enumerate index schemes for two classes of queries: the spatial-keyword query (without the top-k constraint) and the top-k spatial-keyword-relational query, which adds the relational data type to the top-k spatial-keyword query. Third, we propose a cost model of the generated methods for the top-k spatial-keyword query. The cost model enables physical database design, i.e., finding an optimal index scheme for a given usage pattern (a set of query loads and their frequencies). We validate the cost model through extensive experiments.
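To make the query definition concrete, the following is a minimal brute-force sketch of a top-k spatial-keyword query with no index at all, i.e., the baseline that index schemes of the kind discussed here are meant to speed up. The abstract does not specify a scoring function, so everything about the ranking is an assumption made for illustration: the weighted sum controlled by `alpha`, Euclidean distance normalized by `max_dist`, and keyword overlap as the text relevance. The names `SpatioTextualObject`, `score`, and `top_k` are hypothetical.

```python
import heapq
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class SpatioTextualObject:
    """Hypothetical object type: an id, a 2-D location, and a keyword set."""
    oid: str
    x: float
    y: float
    keywords: frozenset


def score(obj, qx, qy, q_keywords, alpha=0.5, max_dist=1.0):
    """Assumed ranking function (not from the paper); higher is better.

    Combines normalized spatial proximity and keyword overlap as a
    weighted sum with weight alpha on the spatial part.
    """
    dist = math.hypot(obj.x - qx, obj.y - qy)
    proximity = 1.0 - min(dist / max_dist, 1.0)
    if q_keywords:
        relevance = len(obj.keywords & q_keywords) / len(q_keywords)
    else:
        relevance = 0.0
    return alpha * proximity + (1.0 - alpha) * relevance


def top_k(objects, qx, qy, q_keywords, k, alpha=0.5, max_dist=1.0):
    """Scan every object and keep the k highest-scoring ones."""
    q = frozenset(q_keywords)
    return heapq.nlargest(
        k, objects, key=lambda o: score(o, qx, qy, q, alpha, max_dist)
    )


objects = [
    SpatioTextualObject("a", 0.0, 0.0, frozenset({"cafe", "wifi"})),
    SpatioTextualObject("b", 3.0, 4.0, frozenset({"cafe"})),
    SpatioTextualObject("c", 0.0, 1.0, frozenset({"bar"})),
]
result = top_k(objects, 0.0, 0.0, {"cafe", "wifi"}, k=2, max_dist=10.0)
print([o.oid for o in result])
```

The scan touches every object on every query. An index that clusters objects by location, by keyword, or by both would let a query skip most of them, which is the design space the abstract describes.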
Cite this article
Kwon, HY., Wang, H. & Whang, KY. G-Index Model: A generic model of index schemes for top-k spatial-keyword queries. World Wide Web 18, 969–995 (2015). https://doi.org/10.1007/s11280-014-0294-0
DOI: https://doi.org/10.1007/s11280-014-0294-0