ABSTRACT
With the proliferation of Internet-connected, location-aware mobile devices such as smartphones, map-based services that serve information about relevant Points of Interest (PoIs) to their users are becoming more numerous and more widely used.
We provide an efficient and practical foundation for processing queries that take a keyword and a spatial region as arguments and return the k most relevant PoIs within the region, which may be the part of the map covered by the user's screen. The paper proposes a novel technique that encodes the spatio-textual part of a PoI as a compact bit string. This technique extends an existing spatial encoding to also encode the textual aspect of a PoI in compressed form. The resulting bit strings may then be indexed using structures such as B-trees or hash indexes that are standard in DBMSs and key-value stores. As a result, it is straightforward to support the proposed functionality using existing data management systems. The paper also proposes a novel top-k query algorithm that merges partial results while still returning an exact result.
An empirical study with real-world data indicates that the proposed techniques enable excellent indexing and query execution performance on a standard DBMS.
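The general idea of a spatio-textual key over a standard ordered index can be illustrated with a minimal sketch. The abstract does not spell out the encoding, so everything below is an assumption made for illustration: a Z-order (Morton) code stands in for the spatial encoding, a hypothetical integer keyword id placed in the high bits stands in for the compressed textual part, a sorted Python list searched with `bisect` stands in for a B-tree, and the relevance scores are made up. The names `interleave`, `make_key`, and `PoiIndex` are hypothetical and are not the paper's method.

```python
import bisect
import heapq

BITS = 16  # bits per coordinate (assumed grid resolution)


def interleave(x: int, y: int) -> int:
    """Z-order (Morton) code: interleave the bits of x and y."""
    code = 0
    for i in range(BITS):
        code |= ((x >> i) & 1) << (2 * i)
        code |= ((y >> i) & 1) << (2 * i + 1)
    return code


def make_key(keyword_id: int, x: int, y: int) -> int:
    """Assumed layout: keyword id in the high bits, spatial code in the low bits."""
    return (keyword_id << (2 * BITS)) | interleave(x, y)


class PoiIndex:
    """Sorted list of (key, score, name, x, y); a stand-in for a B-tree."""

    def __init__(self, pois):
        # pois: iterable of (name, x, y, {keyword_id: score})
        self.entries = sorted(
            (make_key(kw, x, y), score, name, x, y)
            for name, x, y, kws in pois
            for kw, score in kws.items()
        )
        self.keys = [e[0] for e in self.entries]

    def scan(self, keyword_id, x1, y1, x2, y2, k):
        """Top-k inside one rectangle using a single key-range scan."""
        # Z-order is monotone in both coordinates, so every point in the
        # rectangle has a key between the keys of its two corners.
        lo = bisect.bisect_left(self.keys, make_key(keyword_id, x1, y1))
        hi = bisect.bisect_right(self.keys, make_key(keyword_id, x2, y2))
        hits = (
            (score, name)
            for _, score, name, x, y in self.entries[lo:hi]
            if x1 <= x <= x2 and y1 <= y <= y2  # drop false positives
        )
        return heapq.nlargest(k, hits)

    def top_k(self, keyword_id, x1, y1, x2, y2, k):
        """Split the region, collect partial top-k lists, merge them exactly."""
        mid = (x1 + x2) // 2
        parts = [(x1, y1, mid, y2), (mid + 1, y1, x2, y2)]
        partial = [self.scan(keyword_id, *p, k) for p in parts if p[0] <= p[2]]
        # The top-k of the union is contained in the union of partial top-k lists.
        return heapq.nlargest(k, (hit for part in partial for hit in part))


pois = [
    ("cafe_a", 2, 3, {1: 0.9}),
    ("cafe_b", 5, 5, {1: 0.7, 2: 0.4}),
    ("cafe_c", 6, 2, {1: 0.8}),
    ("bar_d", 3, 3, {2: 0.95}),
    ("cafe_far", 100, 100, {1: 1.0}),
]
index = PoiIndex(pois)
print(index.top_k(1, 0, 0, 7, 7, 2))  # [(0.9, 'cafe_a'), (0.8, 'cafe_c')]
```

Because each key is a plain integer, the same layout could in principle be stored in any ordered index that supports range scans. The merge step stays exact because each sub-region contributes its own k best candidates, so no member of the global top k can be missed.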
Index Terms
- Top-k point of interest retrieval using standard indexes