Skip to main content
Log in

AP-Tree: efficiently support location-aware Publish/Subscribe

  • Regular Paper
  • Published:
The VLDB Journal Aims and scope Submit manuscript

Abstract

We investigate the problem of efficiently supporting location-aware Publish/Subscribe (Pub/Sub for short), which is essential in many applications such as location-based recommendation and advertising, thanks to the proliferation of geo-equipped devices and the ensuing location-based social media applications. In a location-aware Pub/Sub system (e.g., an e-coupon system), subscribers can register their interest as spatial-keyword subscriptions (e.g., interest in nearby iphone discount); each incoming geo-textual message (e.g., geo-tagged e-coupon) will be delivered to all the relevant subscribers immediately. While there are several prior approaches aiming at providing efficient processing techniques for this problem, their approaches belong to spatial-prioritized indexing method which cannot well exploit the keyword distribution. In addition, their textual filtering techniques are built upon simple variants of traditional inverted indexes, which do not perform well for the textual constraint imposed by the problem. In this paper, we address the above limitations and provide a highly efficient solution based on a novel adaptive index, named AP-Tree. AP-Tree adaptively groups registered subscriptions using keyword and spatial partitions, guided by a cost model. AP-Tree also naturally indexes ordered keyword combinations. Furthermore, we show that our techniques can be extended to process moving spatial-keyword subscriptions, where subscribers can continuously update their locations. We present efficient algorithms to process both stationary and moving subscriptions, which can seamlessly and effectively integrate keyword and spatial partitions. Our extensive experiments demonstrate that AP-Tree and its variant AP \(^{+}\) -Tree can achieve up to an order of magnitude improvement on efficiency compared with prior state-of-the-art methods.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Subscribe and save

Springer+ Basic
$34.99 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6
Fig. 7
Fig. 8
Fig. 9
Fig. 10
Fig. 11
Fig. 12
Fig. 13
Fig. 14
Fig. 15
Fig. 16
Fig. 17
Fig. 18
Fig. 19
Fig. 20
Fig. 21
Fig. 22
Fig. 23
Fig. 24
Fig. 25
Fig. 26
Fig. 27

Similar content being viewed by others

Notes

  1. http://advertising.apple.com.

  2. Active messages refer to the messages which are currently alive and not expired in the system.

  3. We assume the location of message is a point while our techniques can be immediately extended to support a spatial region, e.g., circle, rectangle.

  4. RQ-Tree is a keyword-prioritized indexing structure derived from IQ-Tree and compared in our experiments.

  5. Note that there are about 1.7 million distinct keywords in the tweet dataset.

  6. We should not be confused with the same “s” in s-list and s-node. s-list refers to subscription list, while s-node refers to spatial node. There is no relationship between them.

  7. For sake of clarity, we remind that the subscript in each s-node (e.g., s \(_{1}\) -node) in the figure has nothing to do with subscription s (e.g., \(s_1\)).

  8. The l-nodes in current \(node_{list}{(s)}\) are those covered by s.r, i.e., case 1, case 2 and case 3 in Fig. 8, while case 4 indicates the new l-nodes which are only covered by \(r_n\).

  9. \(T'\) here is essentially an amortized factor, such that the value of \(\zeta _s\) will not depend on the length of one timestamp.

  10. Note that the subscriptions are assigned to cut or cell only once. Thus the coefficient of \(|S| \cdot \log {f}\) is 1.

  11. As we assume indexes are fit in the main memory, we use the number of verifications to evaluate the goodness of the subscription decomposition, instead of the number of I/Os.

  12. http://geonames.usgs.gov.

  13. http://www.chorochronos.org.

  14. http://people.csail.mit.edu/jrennie/20Newsgroups.

  15. Thus, we could have four dataset combinations for following experiments: TWEETS-ODB, TWEETS-SAN, GN-ODB and GN-SAN.

  16. This is also the default velocity in Network-based Generator for Moving Objects [4].

  17. The light-weighted version regards the distance from the subscription to its nearest matching message minus the search radius of the subscription as the safe region radius. The impact region is derived by extending the safe region with the search radius of subscription.

References

  1. Aguilera, M.K., Strom, R.E., Sturman, D.C., Astley, M., Chandra, T.D.: Matching events in a content-based subscription system. In: PODC, pp. 53–61 (1999)

  2. Bao, J., Mokbel, M.F., Chow, C.Y.: Geofeed: A location aware news feed system. In: ICDE, pp. 54–65 (2012)

  3. Bentley, J.L.: Solutions to Klees Rectangle Problems. Technical report, Carnegie-Mellon University, Pittsburgh, PA (1977)

  4. Brinkhoff, T.: A framework for generating network-based moving objects. GeoInformatica 6(2), 153–180 (2002)

    Article  MATH  Google Scholar 

  5. Cheema, M.A., Brankovic, L., Lin, X., Zhang, W., Wang, W.: Multi-guarded safe zone: an effective technique to monitor moving circular range queries. In: ICDE (2010)

  6. Chen, L., Cong, G., Cao, X.: An efficient query indexing mechanism for filtering geo-textual data. In: SIGMOD, pp. 749–760 (2013)

  7. Chen, L., Cong, G., Cao, X., Tan, K.: Temporal spatial-keyword top-k publish/subscribe. In: ICDE, pp. 255–266 (2015)

  8. Chen, L., Cong, G., Jensen, C.S., Wu, D.: Spatial keyword query processing: an experimental evaluation. In: PVLDB, pp. 217–228. VLDB Endowment (2013)

  9. Chen, X., Chen, Y., Rao, F.: An efficient spatial publish/subscribe system for intelligent location-based services. In: DEBS (2003)

  10. Christoforaki, M., He, J., Dimopoulos, C., Markowetz, A., Suel, T.: Text vs. space: efficient geo-search query processing. In: CIKM (2011)

  11. Cong, G., Jensen, C.S., Wu, D.: Efficient retrieval of the top-k most relevant spatial web objects. PVLDB 2(1), 337–348 (2009)

    Google Scholar 

  12. De Berg, M., Van Kreveld, M., Overmars, M., Schwarzkopf, O.C.: Computational Geometry. Springer, Berlin (2000)

    Book  MATH  Google Scholar 

  13. De Felipe, I., Hristidis, V., Rishe, N.: Keyword search on spatial databases. In: ICDE, pp. 656–665 (2008)

  14. Du, X., Li, Y., Huang, Q., Shu, L.: Search continuous spatial keyword range queries over moving objects in road networks. J. Comput. Inf. Syst. 11(2), 759–767 (2015)

    Google Scholar 

  15. Enderton, H., Enderton, H.B.: A mathematical Introduction to Logic. Academic Press, Waltham (2001)

    MATH  Google Scholar 

  16. Fabret, F., Jacobsen, H.A., Llirbat, F., Pereira, J., Ross, K.A., Shasha, D.: Filtering algorithms and implementation for very fast publish/subscribe. In: SIGMOD, pp. 115–126 (2001)

  17. Fabret, F., Jacobsen, H.A., Llirbat, F., Pereira, J., Ross, K.A., Shasha, D.: Filtering algorithms and implementation for very fast publish/subscribe systems. In: ACM SIGMOD Record, vol. 30, pp. 115–126. ACM (2001)

  18. Grigni, M., Manne, F.: On the complexity of the generalized block distribution. In: Parallel Algorithms for Irregularly Structured Problems, pp. 319–326 (1996)

  19. Guo, L., Shao, J., Aung, H.H., Tan, K.: Efficient continuous top-k spatial keyword queries on road networks. GeoInformatica 19(1), 29–60 (2015)

    Article  Google Scholar 

  20. Guo, L., Zhang, D., Li, G., Tan, K., Bao, Z.: Location-aware pub/sub system: when continuous moving queries meet dynamic event streams. In: SIGMOD, pp. 843–857 (2015)

  21. Helmer, S., Moerkotte, G.: A performance study of four index structures for set-valued attributes of low cardinality. VLDB J. 12, 244–261 (2003)

  22. Hmedeh, Z., Kourdounakis, H., Christophides, V., Du Mouza, C., Scholl, M., Travers, N.: Subscription indexes for web syndication systems. In: EDBT, pp. 312–323 (2012)

  23. Hu, H., Liu, Y., Li, G., Feng, J., Tan, K.: A location-aware publish/subscribe framework for parameterized spatio-textual subscriptions. In: ICDE, pp. 711–722 (2015)

  24. Huang, W., Li, G., Tan, K.L., Feng, J.: Efficient safe-region construction for moving top-k spatial keyword queries. In: CIKM (2012)

  25. Konig, A., Church, K., Markov, M.: A data structure for sponsored search. In: ICDE, pp. 90–101 (2009)

  26. Kullback, S.: Information Theory and Statistics. Courier Dover Publications, Mineola (1997)

    MATH  Google Scholar 

  27. Li, C., Gu, Y., Qi, J., Yu, G., Zhang, R., Yi, W.: Processing moving k NN queries using influential neighbor sets. Proc. VLDB 8(2), 113–124 (2014)

    Article  Google Scholar 

  28. Li, G., Wang, Y., Wang, T., Feng, J.: Location-aware publish/subscribe. In: SIGKDD, pp. 802–810 (2013)

  29. Mokbel, M.F., Xiong, X., Aref, W.G.: Sina: Scalable incremental processing of continuous queries in spatio-temporal databases. In: SIGMOD, pp. 623–634 (2004)

  30. Mouratidis, K., Pang, H.: Efficient evaluation of continuous text search queries. TKDE 23(10), 1469–1482 (2011)

    Google Scholar 

  31. Mouratidis, K., Papadias, D., Hadjieleftheriou, M.: Conceptual partitioning: an efficient method for continuous nearest neighbor monitoring. In: SIGMOD (2005)

  32. Nutanong, S., Zhang, R., Tanin, E., Kulik, L.: The v*-diagram: a query-dependent approach to moving KNN queries. Proc. VLDB 1(1), 1095–1106 (2008)

    Article  Google Scholar 

  33. Park, M.H., Hong, J.H., Cho, S.B.: Location-based recommendation system using bayesian users preference model in mobile devices. In: Ubiquitous Intelligence and Computing, pp. 1130–1139. Springer (2007)

  34. Rocha-Junior, J.B., Gkorgkas, O., Jonassen, S., Nørvåg, K.: Efficient processing of top-k spatial keyword queries. In: SSTD (2011)

  35. Sadoghi, M., Jacobsen, H.A.: Be-tree: an index structure to efficiently match boolean expressions over high-dimensional discrete space. In: SIGMOD (2011)

  36. Samet, H.: The quadtree and related hierarchical data structures. ACM Comput. Surv. 16(2), 187–260 (1984)

  37. Shraer, A., Gurevich, M., Fontoura, M., Josifovski, V.: Top-k publish-subscribe for social annotation of news. Proc. VLDB 6, 385–396 (2013)

  38. Swami, A.N.: Optimization of large join queries: combining heuristic and combinatorial techniques. In: SIGMOD, pp. 367–376 (1989)

  39. Terrovitis, M., Bouros, P., Vassiliadis, P., Sellis, T.K., Mamoulis, N.: Efficient answering of set containment queries for skewed item distributions. In: EDBT, pp. 225–236 (2011)

  40. Wang, X., Zhang, Y., Zhang, W., Lin, X., Wang, W.: Ap-tree: Efficiently support continuous spatial-keyword queries over stream. In: ICDE (2015)

  41. Whang, S.E., Garcia-Molina, H., Brower, C., Shanmugasundaram, J., Vassilvitskii, S., Vee, E., Yerneni, R.: Indexing boolean expressions. Proc. VLDB 2(1), 37–48 (2009)

    Article  Google Scholar 

  42. Wu, D., Yiu, M.L., Jensen, C.S., Cong, G.: Efficient continuously moving top-k spatial keyword query processing. In: ICDE (2011)

  43. Xiong, X., Mokbel, M.F., Aref, W.G.: Sea-cnn: Scalable processing of continuous k-nearest neighbor queries in spatio-temporal databases. In: ICDE, pp. 643–654 (2005)

  44. Xu, W., Chow, C.Y., Yiu, M.L., Li, Q., Poon, C.K.: Mobifeed: a location-aware news feed framework for moving users. GeoInformatica 19, 1–37 (2014)

  45. Yan, T.W., García-Molina, H.: Index structures for selective dissemination of information under the boolean model. TODS (1994)

  46. Yan, T.W., Garcia-Molina, H.: Duplicate removal in information system dissemination. In: PVLDB (1995)

  47. Zhang, C., Zhang, Y., Zhang, W., Lin, X.: Inverted linear quadtree: efficient top k spatial keyword search. In: ICDE, pp. 901–912. IEEE (2013)

  48. Zhang, D., Chan, C.Y., Tan, K.L.: An efficient publish/subscribe index for e-commerce databases. Proc. VLDB 7(8), 613–624 (2014)

  49. Zhang, J., Zhu, M., Papadias, D., Tao, Y., Lee, D.L.: Location-based spatial queries. In: SIGMOD, pp. 443–454 (2003)

  50. Zhou, Y., Xie, X., Wang, C., Gong, Y., Ma, W.Y.: Hybrid index structures for location-based web search. In: CIKM, pp. 155–162 (2005)

Download references

Acknowledgments

Ying Zhang is supported by ARC DE140100679 and DP130103245. Wenjie Zhang is supported by ARC DP150103071 and DP150102728. Xuemin Lin is supported by NSFC61232006, ARC DP150102728, DP140103578 and DP120104168. Wei Wang is supported by ARC DP130103401 and DP130103405.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Xiang Wang.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 812 KB)

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Wang, X., Zhang, Y., Zhang, W. et al. AP-Tree: efficiently support location-aware Publish/Subscribe. The VLDB Journal 24, 823–848 (2015). https://doi.org/10.1007/s00778-015-0403-4

Download citation

  • Received:

  • Revised:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s00778-015-0403-4

Keywords