Distributed and parallel processing for real-time and dynamic spatio-temporal graph

World Wide Web

Abstract

As a non-linear data structure consisting of nodes and edges, graph data spans many different domains. In the real world, applications built on such data are typically time-sensitive: the value of graph data tends to decrease with time. Applications based on spatio-temporal graphs are typical representatives of this time-sensitive class, since the time dimension is an inherent feature of spatio-temporal data. A Distributed Stream Processing Engine (DSPE), in which the workload is commonly partitioned and processed concurrently by a number of threads to maximize throughput, seems an excellent choice for this requirement. However, it is not feasible to handle such tasks directly with a traditional DSPE. In this paper, we propose a computational model suitable for handling spatio-temporal graphs in a DSPE by reconstructing the DSPE’s parallel processing slots. Specifically, our proposal includes a general processing framework for the data structure of the spatio-temporal graph, a state information compensation mechanism that ensures the correctness of such stateful operations in a DSPE, and a lightweight summary information calculation method that preserves system performance. Empirical studies on real-world stream applications validate the usefulness of our proposals and demonstrate a considerable advantage of our approach over state-of-the-art solutions in the literature.

Acknowledgments

This work is partially supported by NSFC (No. 61802273), the Postdoctoral Science Foundation of China under Grant No. 2017M621813, the Postdoctoral Science Foundation of Jiangsu Province of China under Grant No. 2018K029C, and the Natural Science Foundation for Colleges and Universities in Jiangsu Province of China under Grant No. 18KJB520044.

Author information

Corresponding author

Correspondence to Junhua Fang.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article belongs to the Topical Collection: Special Issue on Graph Data Management in Online Social Networks

Guest Editors: Kai Zheng, Guanfeng Liu, Mehmet A. Orgun, and Junping Du

About this article

Cite this article

Fang, J., Ding, J., Zhao, P. et al. Distributed and parallel processing for real-time and dynamic spatio-temporal graph. World Wide Web 23, 905–926 (2020). https://doi.org/10.1007/s11280-019-00741-6

  • DOI: https://doi.org/10.1007/s11280-019-00741-6
