Effective partitioning mechanisms for time-evolving graphs in the Flink system

The Journal of Supercomputing

Abstract

Graphs are well suited to expressing relationships among different types of data. As graph sizes continue to grow, dividing graphs appropriately and parallelizing the processing load becomes crucial. Balanced graph partitioning has been extensively studied for static and streaming graphs. However, partitioning methods are lacking for time-evolving graphs (TEGs), whose size and structure are periodically updated. A straightforward approach is to capture snapshots of a TEG and apply partitioning methods designed for static or streaming graphs. Although this approach can be expected to yield acceptable partitioning quality, its time overhead is high because of frequent repartitioning. This paper proposes two TEG partitioning methods, named seed and similarity, to decrease the partitioning time. In the experiments, seed and similarity require on average 29–39% of the partitioning time required by the snapshot approach, while maintaining reasonable partitioning quality.
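
To make the snapshot baseline concrete, the following minimal sketch repartitions every snapshot of a TEG from scratch and reports the edge cut. It is an illustration under stated assumptions, not the seed or similarity method proposed in the paper: the greedy placement rule is a generic streaming heuristic chosen here for brevity, and the names `greedy_partition`, `edge_cut`, and `snapshot_baseline` are hypothetical.

```python
from collections import defaultdict


def greedy_partition(edges, k):
    """Assign each vertex to one of k balanced partitions (illustrative heuristic).

    Assumption: vertices arrive in sorted order and each one is placed in the
    partition that already holds most of its neighbors, weighted by the
    remaining capacity of that partition.
    """
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    vertices = sorted(adj)
    capacity = -(-len(vertices) // k)  # ceil(n / k) keeps partitions balanced
    assignment, sizes = {}, [0] * k
    for v in vertices:
        best, best_key = None, None
        for p in range(k):
            if sizes[p] >= capacity:
                continue  # partition is full
            placed = sum(1 for w in adj[v] if assignment.get(w) == p)
            key = (placed * (1 - sizes[p] / capacity), -sizes[p])
            if best_key is None or key > best_key:
                best, best_key = p, key
        assignment[v] = best
        sizes[best] += 1
    return assignment


def edge_cut(edges, assignment):
    """Count edges whose endpoints lie in different partitions."""
    return sum(1 for u, v in edges if assignment[u] != assignment[v])


def snapshot_baseline(snapshots, k):
    """Repartition every snapshot from scratch (the costly baseline)."""
    return [greedy_partition(edges, k) for edges in snapshots]


# Two snapshots of a small, growing path graph (made-up data).
snapshots = [
    [(1, 2), (2, 3), (3, 4)],
    [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6)],
]
for edges, assignment in zip(snapshots, snapshot_baseline(snapshots, k=2)):
    print(len(assignment), "vertices, edge cut =", edge_cut(edges, assignment))
```

Because each call to `greedy_partition` ignores the previous assignment, the work grows with the full graph size at every update. This is the repartitioning overhead that the abstract says the proposed methods aim to reduce.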

Acknowledgements

This study was sponsored by the Ministry of Science and Technology, Taiwan, R.O.C., under contract numbers MOST 106-2221-E-142-005 and MOST 107-2221-E-142-006.

Author information

Corresponding author

Correspondence to Yi-Hsuan Lee.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Lee, YH., Jian, SJ. Effective partitioning mechanisms for time-evolving graphs in the Flink system. J Supercomput 77, 12336–12354 (2021). https://doi.org/10.1007/s11227-021-03769-6

  • DOI: https://doi.org/10.1007/s11227-021-03769-6
