Time-aware spatial keyword cover query
Introduction
With the rapid development of wireless networks and geographical positioning technologies, location-based services are widely used. In recent years, massive numbers of geo-textual objects have become available, each of which has both a geographical location and a textual description. Spatial keyword queries have been studied extensively [1]. Spatial keyword cover queries are also beginning to receive significant attention from the spatial database research community and industry [2], [3], [4]. A spatial keyword cover query finds a group of objects that together cover all the query keywords while minimizing the inter-object distance. Because of their practical value and the diversity of people’s needs, several variants of spatial keyword cover queries have been studied. In [5], the best keyword cover search was proposed. It observes that keyword ratings are both available and important in object evaluation, so when looking for a set of objects covering all query keywords, the search considers two factors: the inter-object distance and the keyword rating of the objects. In [6], a query for finding the minimum spatial keyword cover was proposed. When searching for a set of objects covering all query keywords, that query takes into account the inter-object distance and the number of objects.
We observe that geo-tagged objects are not valid at all times, and that temporal information plays an important role in spatial keyword cover queries. To our knowledge, temporal information has never been considered in spatial keyword cover queries. This paper proposes a new form of spatial keyword cover query, the time-aware spatial keyword cover query (TSKCQ), which considers three factors: positional relevance, textual relevance, and the valid time of objects. Because it takes the objects’ temporal information into account, the solution of a TSKCQ can be very different from that of a spatial keyword cover query. An example is illustrated in Fig. 1. Visitors may plan their trips according to the opening times of attractions. Suppose a tourist would like to visit a “museum” from 7:00 to 8:00, a “cafeteria” from 10:00 to 12:00, and a “zoo” from 14:00 to 17:00. A spatial keyword cover query returns the set of objects that lie closest together, since it considers only the distance between the objects. TSKCQ returns a different set, which includes o2, because it considers not only the distance between objects but also their valid time.
In this paper, we define a new cost function and create an efficient index structure, the Time-aware R-tree (TR-tree), which indexes the objects of the spatial dataset. We then propose an exact algorithm, the Principal-keyword based Query Algorithm (PQA), to answer TSKCQ. PQA first selects the principal query keyword from the query keywords: the query keyword associated with the fewest objects is the principal query keyword, and the objects associated with it are the principal objects. For each principal object, a local optimal solution (LOS) is computed. The LOS with the highest cost function value is the final result. Using the LOS and the current best solution, we propose effective pruning strategies that greatly improve query efficiency. As a baseline, we obtain the Best keyword Cover Algorithm (BCA) by modifying the exact algorithm in [5] to answer TSKCQ. PQA has a shorter query time than BCA.
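The outer structure of PQA described above can be sketched as follows. This is a minimal Python sketch under stated assumptions, not the authors' implementation: the names `GeoObject`, `select_principal_keyword`, and `best_local_solution` are hypothetical, and both the LOS computation and the cost function are passed in as caller-supplied callables because they are not specified in this section. The TR-tree and the pruning strategies are omitted.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class GeoObject:
    """A geo-textual object with exactly one keyword (field names are assumed)."""
    oid: int
    location: Tuple[float, float]    # (x, y) in a two-dimensional space
    keyword: str
    valid_time: Tuple[float, float]  # (start, end), e.g. hours of the day


def select_principal_keyword(
    objects: Iterable[GeoObject], query_keywords: Sequence[str]
) -> Tuple[str, List[GeoObject]]:
    """Return the query keyword with the fewest objects, and those objects."""
    by_keyword: Dict[str, List[GeoObject]] = defaultdict(list)
    for obj in objects:
        if obj.keyword in query_keywords:
            by_keyword[obj.keyword].append(obj)
    # Ties go to the keyword listed first in query_keywords; this tie-break
    # is an arbitrary choice made for the sketch.
    principal = min(query_keywords, key=lambda kw: len(by_keyword[kw]))
    return principal, by_keyword[principal]


def best_local_solution(
    principal_objects: Iterable[GeoObject],
    local_optimal_solution: Callable[[GeoObject], Optional[List[GeoObject]]],
    cost: Callable[[List[GeoObject]], float],
) -> Optional[List[GeoObject]]:
    """Compute one LOS per principal object and keep the highest-valued one."""
    best: Optional[List[GeoObject]] = None
    best_value = float("-inf")
    for principal in principal_objects:
        los = local_optimal_solution(principal)  # placeholder for the LOS step
        if los is None:  # no feasible cover around this principal object
            continue
        value = cost(los)  # placeholder for the paper's cost function
        if value > best_value:
            best, best_value = los, value
    return best
```

Choosing the keyword with the fewest objects keeps the outer loop short, since one LOS is computed per principal object. If some query keyword has no objects at all, it becomes the principal keyword with an empty object list, and `best_local_solution` returns `None`, meaning no cover exists.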
To summarize, the main contributions of this paper are:
(1) We formally define the problem of TSKCQ and propose a new cost function for it.
(2) We design an index structure, called TR-tree, and propose effective pruning strategies and an exact algorithm to tackle TSKCQ.
(3) We conduct extensive experiments to demonstrate the efficiency of our algorithm.
The rest of this paper is organized as follows. Section 2 discusses related work. The problem is defined in Section 3. Section 4 introduces the TR-tree index structure. Section 5 elaborates the exact algorithm for answering TSKCQ. Then, Section 6 reports the experimental results. Finally, we conclude our paper in Section 7.
Section snippets
Spatial keyword query with a query location
Early work on spatial keyword queries with a query location mainly retrieves individual objects that are related to the query keywords and close to the query location [7], [8], [9], [10], [11]. As application requirements have diversified, variants of the spatial keyword query have gradually emerged. Li et al. [12] proposed the direction-based spatial keyword search, which takes as arguments a spatial point, a direction, and a set of keywords. It finds k nearest neighbors of the query,
Problem definition
In a spatial dataset, each object may be associated with one or multiple keywords. We convert each object with multiple keywords into multiple objects at the same location, so that every object has exactly one keyword. An object o is in the form (id, l, k, t), where id is the unique identifier of o, l indicates the location of o in a two-dimensional geographical space, k represents the keyword of o, and t is the valid time of o in the form of (st, et), with st and et being the
TR-tree index structure
The R-tree [25], one of the most popular spatial index structures, is used to index all the spatial objects. Because the R-tree is widely used and easy to extend, we process TSKCQ by augmenting it with one additional dimension that indexes the valid time of objects. The resulting three-dimensional R-tree is called the Time-aware R-tree (TR-tree for short). We now detail the non-leaf nodes and leaf nodes of the TR-tree:
Non-leaf nodes of the TR-tree contain entries of the form N(ptrs, mbr, UT), where
Pruning strategies
Given a TSKCQ Q (K, T, UDist), we use the principal query keyword to find the LOS for each principal object, and the TSKCQ returns the LOS with the largest cost function value as the result.
Definition 5 (Principal Query Keyword). Given a TSKCQ Q (K, T, UDist), a principal query keyword is selected from K according to a certain scheme, and the remaining query keywords in K are referred to as non-principal query keywords. Objects with the principal query keyword are called principal objects, and objects with non-principal query keywords are called non-principal
Experiments
In this section, we evaluate three algorithms, namely PQA, PQA algorithm without pruning strategy (no-PQA), and Best keyword Cover Algorithm (BCA).
PQA: The query algorithm proposed in this paper.
no-PQA: The algorithm obtained by removing the pruning strategies from PQA.
BCA: The algorithm is obtained by modifying the algorithms in [5] to tackle TSKCQ. The specific modifications are as follows: (1) Rating of objects is replaced by valid time of objects; (2) KRR*-tree is
Conclusions
We present a new type of query in this paper, namely, the time-aware spatial keyword cover query (TSKCQ). Compared with existing spatial keyword cover queries, TSKCQ considers not only the spatial and textual information of geospatial objects but also their valid time, which better reflects users’ needs. For TSKCQ, we define a new cost function, create the TR-tree index structure, and propose effective pruning strategies. We also propose an exact
Acknowledgment
This work was supported by the Key Research and Development Program of Hebei Province, China (No. 18270307D).
Declaration of competing interest
None.
Zijun Chen received the BS degree from the Northeast Heavy Machinery Institute, China, the MS degree from Yanshan University, and the PhD degree from Fudan University in 2002, all in computer science. Since 1995, he has been with the School of Information Science and Engineering, Yanshan University, Qinhuangdao, China, where he is currently a professor. His current research interests include moving object databases and spatio-temporal databases.
References (28)
- et al. SKQAI: A novel air index for spatial keyword query processing in road networks. Inform. Sci. (2018)
- et al. Group-based collective keyword querying in road networks. Inform. Process. Lett. (2017)
- et al. Level-aware collective spatial keyword queries. Inform. Sci. (2017)
- et al. Processing spatial keyword (SK) queries in geographic information retrieval (GIR) systems
- et al. Keyword search in spatial databases: Towards searching by document
- et al. Locating mapped resources in web 2.0
- et al. Efficient algorithms for answering the m-closest keywords query
- et al. Best keyword cover search. IEEE Trans. Knowl. Data Eng. (2015)
- et al. Finding the minimum spatial keyword cover
- et al. Keyword search on spatial databases
- Efficient and scalable method for processing top-k spatial Boolean queries
- IR-Tree: An efficient index for geographic document search. IEEE Trans. Knowl. Data Eng.
- Efficient processing of top-k spatial keyword queries
- Fast nearest neighbor search with keywords. IEEE Trans. Knowl. Data Eng.
Tingting Zhao received the BS degree in computer science and technology from North China University of Science and Technology, China, in 2016. She is currently working toward the MS degree in the School of Information Science and Engineering, Yanshan University, China. Her research interests include spatio-temporal databases.
Wenyuan Liu received the BS and MS degrees from the Northeast Heavy Machinery Institute, China, and the PhD degree from the Harbin Institute of Technology in 2000, all in computer science. Since 1996, he has been with the School of Information Science and Engineering, Yanshan University, Qinhuangdao, China, where he is currently a professor. He is also a specially invited researcher with the IoT of TNlist, Tsinghua University. His research interests include wireless sensor networks and mobile networks.