Abstract
Owing in part to the proliferation of GPS-equipped mobile devices, massive volumes of geo-tagged streaming text messages are becoming available on social media. Discovering the most frequent nearby terms in such large data streams is of great interest. In this paper, we present novel indexing, updating, and query processing techniques for discovering the top-k most frequent nearby terms over a sliding window. Specifically, given a query location and a set of geo-tagged messages within a sliding window, we study the problem of searching for the top-k terms by considering term frequency, spatial proximity, and term freshness. We develop a novel and efficient mechanism to solve the problem, comprising a quad-tree based indexing structure, an index update technique, and a best-first search algorithm. An empirical study that varies a number of parameters shows that the proposed techniques are efficient and meet users' requirements.
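To make the problem statement concrete, the following is a minimal brute-force sketch of a top-k term query over a sliding window. It is not the paper's method: it uses a linear scan instead of the quad-tree index and best-first search, and the scoring function (each occurrence of a term weighted by an inverse-distance proximity and an exponential freshness decay) is an assumption chosen for illustration, since the abstract does not give the actual formula. The names `Message`, `SlidingWindowTermSearch`, `window_length`, and `decay` are hypothetical.

```python
import heapq
import math
from collections import defaultdict, deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    """A geo-tagged message (hypothetical structure for this sketch)."""
    x: float
    y: float
    timestamp: float
    terms: tuple


class SlidingWindowTermSearch:
    """Brute-force baseline; assumes messages arrive in timestamp order."""

    def __init__(self, window_length, decay=0.1):
        self.window_length = window_length  # length of the sliding window
        self.decay = decay                  # assumed freshness decay rate
        self.messages = deque()             # oldest message at the left

    def insert(self, message):
        self.messages.append(message)
        self._expire(message.timestamp)

    def _expire(self, now):
        # Drop messages that have fallen out of the window.
        cutoff = now - self.window_length
        while self.messages and self.messages[0].timestamp <= cutoff:
            self.messages.popleft()

    def top_k(self, qx, qy, now, k):
        """Return the k highest-scoring (term, score) pairs for a query point."""
        self._expire(now)
        scores = defaultdict(float)
        for m in self.messages:
            # Assumed weights: closer and newer messages contribute more.
            proximity = 1.0 / (1.0 + math.hypot(m.x - qx, m.y - qy))
            freshness = math.exp(-self.decay * (now - m.timestamp))
            for term in set(m.terms):
                # Summing over messages captures term frequency.
                scores[term] += proximity * freshness
        return heapq.nlargest(k, scores.items(), key=lambda kv: kv[1])
```

A query scans every message in the window, so its cost grows linearly with the window size; an index with a pruning search, as the abstract describes, aims to avoid exactly this full scan.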








Additional information
This article belongs to the Topical Collection: Special Issue on Web Information Systems Engineering 2017
Guest Editors: Lu Chen and Yunjun Gao
About this article
Cite this article
Chen, L., Shang, S., Yao, B. et al. Spatio-temporal top-k term search over sliding window. World Wide Web 22, 1953–1970 (2019). https://doi.org/10.1007/s11280-018-0606-x
DOI: https://doi.org/10.1007/s11280-018-0606-x