Skip to main content
Log in

An investigation of big graph partitioning methods for distribution of graphs in vertex-centric systems

  • Published:
Distributed and Parallel Databases Aims and scope Submit manuscript

Abstract

Relations among data entities in most big data sets can be modeled by a big graph. Implementation and execution of algorithms related to the structure of big graphs is very important in different fields. Because of the inherently high volume of big graphs, their calculations should be performed in a distributed manner. Some distributed systems based on vertex-centric model have been introduced for big graph calculations in recent years. The performance of these systems in terms of run time depends on the partitioning and distribution of the graph. Therefore, the graph partitioning is a major concern in this field. This paper concentrates on big graph partitioning approaches for distribution of graphs in vertex-centric systems. This briefly discusses vertex-centric systems and formulates different models of graph partitioning problem. Then, a review of recent methods of big graph partitioning for these systems is shown. Most recent methods of big graph partitioning for vertex centric systems can be categorized into three classes: (i) stream-based methods that see vertices or edges of the graph in a stream and partition them, (ii) distributed methods that partition vertices or edges in a distributed manner, and (iii) dynamic methods that change partitions during the execution of algorithms to obtain better performance. This study compares the properties of different approaches in each class and briefly reviews methods that are not in these categories. This comparison indicates that The streaming methods are good choices for initial load of the graph in Vertex-centric systems. The distributed and dynamic methods are appropriate for long-running applications.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1

Similar content being viewed by others

Notes

  1. \(n=|V|\) & \(m=|E|\) & \(r= \textit{number of iterations}\).

References

  1. Abbas, Z., Kalavri, V., Carbone, P., Vlassov, V.: Streaming graph partitioning: an experimental study. Proc. VLDB Endow. 11(11), 1590–1603 (2018). https://doi.org/10.14778/3236187.3236208

    Article  Google Scholar 

  2. Abou-Rjeili, A., Karypis, G.: Multilevel algorithms for partitioning power-law graphs. In: Proceedings 20th IEEE International Parallel Distributed Processing Symposium (2006). https://doi.org/10.1109/IPDPS.2006.1639360

  3. Akhremtsev, Y., Sanders, P., Schulz, C.: High-quality shared-memory graph partitioning (2017). arXiv:1710.08231

  4. Aslam, S.: Twitter by the numbers: stats, demographics and fun facts (2018). https://www.omnicoreagency.com/twitter-statistics/

  5. Aydin, K., Bateni, M., Mirrokni, V.: Distributed balanced partitioning via linear embedding. In: Proceedings of the Ninth ACM International Conference on Web Search and Data Mining, pp. 387–396. ACM, New York (2016)

  6. Balakrishnan, H., Kaashoek, M.F., Karger, D., Morris, R., Stoica, I.: Looking up data in p2p systems. Commun. ACM 46(2), 43–48 (2003). https://doi.org/10.1145/606272.606299

    Article  Google Scholar 

  7. Bao, N.T., Suzumura, T.: Towards highly scalable pregel-based graph processing platform with x10. In: Proceedings of the 22Nd International Conference on World Wide Web (WWW ’13) Companion, pp. 501–508. ACM, New York (2013). https://doi.org/10.1145/2487788.2487984

  8. Bollobs, B.: Random Graphs. Springer, New York (1998)

    Google Scholar 

  9. Borthakur, D.: HDFS Architecture Guide. Apache Hadoop Project (2008)

  10. Bourse, F., Lelarge, M., Vojnovic, M.: Balanced graph edge partition. In: Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD ’14), pp. 1456–1465. ACM, New York (2014). https://doi.org/10.1145/2623330.2623660

  11. Bu, Y., Borkar, V., Jia, J., Carey, M.J., Condie, T.: Pregelix: big(ger) graph analytics on a dataflow engine. Proc. VLDB Endow. 8(2), 161–172 (2014). https://doi.org/10.14778/2735471.2735477

    Article  Google Scholar 

  12. Bui, T.N., Jones, C.: Finding good approximate vertex and edge partitions is np-hard. Inf. Process. Lett. 42(3), 153–159 (1992). https://doi.org/10.1016/0020-0190(92)90140-Q

    Article  MathSciNet  MATH  Google Scholar 

  13. Buluç, A., Meyerhenke, H., Safro, I., Sanders, P., Schulz, C.: Recent advances in graph partitioning. In: Kliemann, L., Sanders, P. (eds.) Algorithm Engineering. LNCS, pp. 117–158. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-49487-6-4

    Chapter  Google Scholar 

  14. Cao, Y., Rao, R.: A streaming graph partitioning approach on imbalance cluster. In: 2016 18th International Conference on Advanced Communication Technology (ICACT), pp. 360–364 (2016). https://doi.org/10.1109/ICACT.2016.7423392

  15. Chen, R., Shi, J., Chen, Y., Chen, H.: Powerlyra: differentiated graph computation and partitioning on skewed graphs. In: Proceedings of the Tenth European Conference on Computer Systems (EuroSys ’15), pp. 1:1–1:15. ACM, New York (2015). https://doi.org/10.1145/2741948.2741970

  16. Chen, T., Li, B.: A distributed graph partitioning algorithm for processing large graphs. In: 2016 IEEE Symposium on Service-Oriented System Engineering (SOSE), pp. 53–59 (2016)

  17. Ching, A.: Giraph: production-grade graph processing infrastructure for trillion edge graphs. In: ATPESC, vol. 14 (2014)

  18. Ching, A., Edunov, S., Kabiljo, M., Logothetis, D., Muthukrishnan, S.: One trillion edges: graph processing at Facebook-scale. Proc. VLDB Endow. 8(12), 1804–1815 (2015). https://doi.org/10.14778/2824032.2824077

    Article  Google Scholar 

  19. Condon, A., Karp, R.M.: Algorithms for graph partitioning on the planted partition model. Random Struct. Algorithms 18(2), 116–140 (2001)

    Article  MathSciNet  Google Scholar 

  20. Cormen, T.H.: Introduction to algorithms. MIT, Cambridge (2009)

    MATH  Google Scholar 

  21. Cui, H., Cipar, J., Ho, Q., Kim, J.K., Lee, S., Kumar, A., Wei, J., Dai, W., Ganger, G.R., Gibbons, P.B., et al.: Exploiting bounded staleness to speed up big data analytics. In: USENIX Annual Technical Conference, pp. 37–48 (2014)

  22. Devine, K., Boman, E., Heaphy, R., Hendrickson, B., Vaughan, C.: Zoltan data management services for parallel dynamic applications. Comput. Sci. Eng. 4(2), 90–97 (2002)

    Article  Google Scholar 

  23. Dindokar, R., Simmhan, Y.: Elastic partition placement for non-stationary graph algorithms. In: 2016 16th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid), pp. 90–93 (2016). https://doi.org/10.1109/CCGrid.2016.97

  24. Dong, F., Zhang, J., Luo, J., Shen, D., Jin, J.: Enabling application-aware flexible graph partition mechanism for parallel graph processing systems. Concurr. Comput. Pract. Exp. 29(6), e3849 (2017). https://doi.org/10.1002/cpe.3849, e3849 cpe.3849

    Article  Google Scholar 

  25. Donnelly, G.: 75 super-useful Facebook statistics for 2018 (2018). https://www.wordstream.com/blog/ws/2017/11/07/facebook-statistics

  26. Echbarthi, G., Kheddouci, H.: Fractional greedy and partial restreaming partitioning: New methods for massive graph partitioning. In: 2014 IEEE International Conference on Big Data (Big Data), pp. 25–32 (2014). https://doi.org/10.1109/BigData.2014.7004368

  27. Echbarthi, G., Kheddouci, H.: Streaming metis partitioning. In: 2016 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), pp. 17–24 (2016). https://doi.org/10.1109/ASONAM.2016.7752208

  28. Elsner, U.: Graph partitioning—a survey. Technical Report SFB393/97-27 (1997)

  29. Fjllstrm, P.O.: Algorithms for Graph Partitioning: A Survey, vol. 3. Linkping University Electronic Press, Linkping (1998)

    Google Scholar 

  30. Gonzalez, J.E., Low, Y., Gu, H., Bickson, D., Guestrin, C.: Powergraph: distributed graph-parallel computation on natural graphs. In: Proceedings of 10th USENIX Symposium on Operating Systems Design and Implementation (OSDI), vol. 12, pp. 17–30 (2012)

  31. Gregory, S.: Finding overlapping communities in networks by label propagation. New J. Phys. 12(10), 103,018 (2010)

    Article  Google Scholar 

  32. Guerrieri, A., Montresor, A.: Distributed edge partitioning for graph processing (2014). arXiv:1403.6270

  33. Guerrieri, A., Montresor, A.: DFEP: Distributed Funding-Based Edge Partitioning, Springer, Berlin, pp. 346–358 (2015)

    Google Scholar 

  34. Guo, Y., Hong, S., Chafi, H., Iosup, A., Epema, D.: Modeling, analysis, and experimental comparison of streaming graph-partitioning policies. J. Parallel Distrib. Comput. 108, 106–121. https://doi.org/10.1016/j.jpdc.2016.02.003. Special Issue on Scalable Computing Systems for Big Data Applications (2017)

    Article  Google Scholar 

  35. Han, M., Daudjee, K.: Giraph unchained: barrierless asynchronous parallel execution in pregel-like graph processing systems. Proc. VLDB Endow. 8(9), 950–961 (2015). https://doi.org/10.14778/2777598.2777604

    Article  Google Scholar 

  36. Han, W., Miao, Y., Li, K., Wu, M., Yang, F., Zhou, L., Prabhakaran, V., Chen, W., Chen, E.: Chronos: A graph engine for temporal graph analysis. In: Proceedings of the Ninth European Conference on Computer Systems (EuroSys ’14), pp. 1:1–1:14. ACM, New York (2014). https://doi.org/10.1145/2592798.2592799

  37. Hendawi, A.M., Bao, J., Mokbel, M.F.: iRoad: a framework for scalable predictive query processing on road networks. Proc. VLDB Endow. 6(12), 1262–1265 (2013). https://doi.org/10.14778/2536274.2536291

    Article  Google Scholar 

  38. Hoque, I., Gupta, I.: Lfgraph: Simple and fast distributed graph analytics. In: Proceedings of the First ACM SIGOPS Conference on Timely Results in Operating Systems (TRIOS ’13), pp. 9:1–9:17. ACM, New York (2013). https://doi.org/10.1145/2524211.2524218

  39. Hu, K., Zeng, H.J.W.W.G.: Partitioning big graph with respect to arbitrary proportions in a streaming manner. Future Gen. Comput. Syst. 80, 1–11 (2018). https://doi.org/10.1016/j.future.2017.06.027

    Article  Google Scholar 

  40. Hwang, C.R.: Simulated annealing: theory and applications. Acta Appl. Math. 12(1), 108–111 (1988). https://doi.org/10.1007/BF00047572

    Article  MathSciNet  Google Scholar 

  41. Jain, N., Liao, G., Willke, T.L.: Graphbuilder: Scalable graph etl framework. In: First International Workshop on Graph Data Management Experiences and Systems (GRADES ’13), pp. 4:1–4:6. ACM, New York (2013). https://doi.org/10.1145/2484425.2484429

  42. Junghanns, M., Kiessling, M., Teichmann, N., Gomez, K., Petermann, A., Rahm, E.: Declarative and distributed graph analytics with gradoop. Proc. VLDB Endow. 11(12), 2006–2009 (2018). https://doi.org/10.14778/3229863.3236246

    Article  Google Scholar 

  43. Khayyat, Z., Awara, K., Alonazi, A., Jamjoom, H., Williams, D., Kalnis, P.: Mizan: A system for dynamic load balancing in large-scale graph processing. In: Proceedings of the 8th ACM European Conference on Computer Systems (EuroSys ’13), pp. 169–182. ACM, New York (2013). https://doi.org/10.1145/2465351.2465369

  44. Kim, G.H., Trimi, S., Chung, J.H.: Big-data applications in the government sector. Commun. ACM 57(3), 78–85 (2014). https://doi.org/10.1145/2500873

    Article  Google Scholar 

  45. Lang, K.: Finding good nearly balanced cuts in power law graphs. Technical Report YRL-2004-036, Yahoo! Research Labs (2004)

  46. Lee, K.H., Lee, Y.J., Choi, H., Chung, Y.D., Moon, B.: Parallel data processing with MapReduce: a survey. SIGMOD Rec. 40(4), 11–20 (2012). https://doi.org/10.1145/2094114.2094118

    Article  Google Scholar 

  47. Leskovec, J., Kleinberg, J., Faloutsos, C.: Graph evolution: densification and shrinking diameters. ACM Trans. Knowl. Discov. Data 1(1), 2 (2007). https://doi.org/10.1145/1217299.1217301

    Article  Google Scholar 

  48. Leskovec, J., Lang, K.J., Dasgupta, A., Mahoney, M.W.: Community structure in large networks: natural cluster sizes and the absence of large well-defined clusters. Internet Math. 6(1), 29–123 (2009)

    Article  MathSciNet  Google Scholar 

  49. Lim, Y., Lee, W.J., Choi, H.J., Kang, U.: MTP: discovering high quality partitions in real world graphs. World Wide Web, pp. 1–24 (2016)

  50. Liu, X., Murata, T.: Advanced modularity-specialized label propagation algorithm for detecting communities in networks. Physica A 389(7), 1493–1500 (2010). https://doi.org/10.1016/j.physa.2009.12.019

    Article  Google Scholar 

  51. Low, Y., Bickson, D., Gonzalez, J., Guestrin, C., Kyrola, A., Hellerstein, J.M.: Distributed GraphLab: a framework for machine learning and data mining in the cloud. Proc. VLDB Endow. 5(8), 716–727 (2012). https://doi.org/10.14778/2212351.2212354

    Article  Google Scholar 

  52. Malewicz, G., Austern, M.H., Bik, A.J., Dehnert, J.C., Horn, I., Leiser, N., Czajkowski, G.: Pregel: A system for large-scale graph processing. In: Proceedings of the 2010 ACM SIGMOD International Conference on Management of Data (SIGMOD ’10), pp. 135–146. ACM, New York (2010). https://doi.org/10.1145/1807167.1807184

  53. Manyika, J., Chui, M., Brown, B., Bughin, J., Dobbs, R., Roxburgh, C., Byers, A.H.: Big data: the next frontier for innovation, competition, and productivity 2011, vol. 5(33), p. 222 (2015). http://www.mckinsey.com/Insights/MGI/Research/Technology_and_Innovation/Big_data_The_next_frontier_for_innovation

  54. Margo, D., Seltzer, M.: A scalable distributed graph partitioner. Proc. VLDB Endow. 8(12), 1478–1489 (2015). https://doi.org/10.14778/2824032.2824046

    Article  Google Scholar 

  55. Martella, C., Logothetis, D., Loukas, A., Siganos, G.: Spinner: Scalable graph partitioning in the cloud (2014). arXiv:1404.3861

  56. Mayer, C., Tariq, M.A., Li, C., Rothermel, K.: Graph: heterogeneity-aware graph computation with adaptive partitioning. In: 2016 IEEE 36th International Conference on Distributed Computing Systems (ICDCS), pp. 118–128 (2016). https://doi.org/10.1109/ICDCS.2016.92

  57. Mayer, C., Mayer, R., Tariq, M.A., Geppert, H., Laich, L., Rieger, L., Rothermel, K.: Adwise: adaptive window-based streaming edge partitioning for high-speed graph processing (2017). arXiv preprint. arXiv:1712.08367

  58. McAfee, A., Brynjolfsson, E., Davenport, T.H., Patil, D., Barton, D.: Big data’s: the management revolution. Harv. Bus. Rev. 90(10), 61–67 (2012)

    Google Scholar 

  59. McSherry, F.: Spectral partitioning of random graphs. In: Proceedings of 42nd IEEE Symposium on Foundations of Computer Science, pp. 529–537 (2001)

  60. Meyerhenke, H., Sanders, P., Schulz, C.: Parallel graph partitioning for complex networks. IEEE Trans. Parallel Distrib. Syst. PP(99), 1–1 (2017)

    Google Scholar 

  61. Mofrad, M.H., Melhem, R., Hammoud, M.: Revolver: vertex-centric graph partitioning using reinforcement learning. In: 2018 IEEE 11th International Conference on Cloud Computing (CLOUD), pp. 818–821 (2018). https://doi.org/10.1109/CLOUD.2018.00111

  62. Moreira, O., Popp, M., Schulz, C.: Graph partitioning with acyclicity constraints (2017). arXiv:1704.00705

  63. Nirmala, G., Dinakaran, K.: Analysis of protein database for semantic similarity using map reduce; a survey. In: Proceedings of IEEE International Conference on Computer Communication and Systems (ICCCS14), pp. 046–050 (2014). https://doi.org/10.1109/ICCCS.2014.7068166

  64. Nishimura, J., Ugander, J.: Restreaming graph partitioning: simple versatile algorithms for advanced balancing. In: Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD ’13), pp. 1106–1114. ACM, New York (2013). https://doi.org/10.1145/2487575.2487696

  65. Onizuka, M., Fujimori, T., Shiokawa, H.: Graph partitioning for distributed graph processing. Data Sci. Eng. 2(1), 94–105 (2017). https://doi.org/10.1007/s41019-017-0034-4

    Article  Google Scholar 

  66. Petroni, F., Querzoni, L., Daudjee, K., Kamali, S., Iacoboni, G.: HDRF: Stream-based partitioning for power-law graphs. In: Proceedings of the 24th ACM International on Conference on Information and Knowledge Management (CIKM ’15), pp. 243–252. ACM, New York (2015). https://doi.org/10.1145/2806416.2806424

  67. Rahimian, F., Payberah, A.H., Girdzijauskas, S., Jelasity, M., Haridi, S.: JA-BE-JA: A distributed algorithm for balanced graph partitioning. In: 2013 IEEE 7th International Conference on Self-Adaptive and Self-Organizing Systems, pp. 51–60 (2013)

  68. Rahimian, F., Payberah, A.H., Girdzijauskas, S., Haridi S.: Distributed vertex-cut partitioning. In: Distributed Applications and Interoperable Systems. Springer, Heidelberg, pp. 186–200 (2014)

    Chapter  Google Scholar 

  69. Roy, A., Bindschaedler, L., Malicevic, J., Zwaenepoel, W.: Chaos: Scale-out graph processing from secondary storage. In: Proceedings of the 25th Symposium on Operating Systems Principles (SOSP ’15), pp. 410–424. ACM, New York (2015). https://doi.org/10.1145/2815400.2815408

  70. Sajjad, H.P., Payberah, A.H., Rahimian, F., Vlassov, V., Haridi, S.: Boosting vertex-cut partitioning for streaming graphs. In: 2016 IEEE International Congress on Big Data (BigData Congress), pp. 1–8 (2016). https://doi.org/10.1109/BigDataCongress.2016.10

  71. Sala, A., Cao, L., Wilson, C., Zablit, R., Zheng, H., Zhao, B.Y.: Measurement-calibrated graph models for social network experiments. In: Proceedings of the 19th International Conference on World Wide Web (WWW ’10), pp. 861–870. ACM, New York (2010). https://doi.org/10.1145/1772690.1772778

  72. Sala, A., Zhao, X., Wilson, C., Zheng, H., Zhao, B.Y.: Sharing graphs using differentially private graph models. In: Proceedings of the 2011 ACM SIGCOMM Conference on Internet Measurement Conference (IMC ’11), pp. 81–98. ACM, New York (2011).https://doi.org/10.1145/2068816.2068825

  73. Salihoglu, S., Widom, J.: GPS: a graph processing system. In: Proceedings of the 25th International Conference on Scientific and Statistical Database Management (SSDBM), pp. 22:1–22:12. ACM, New York (2013). https://doi.org/10.1145/2484838.2484843

  74. Salihoglu, S., Widom, J.: Optimizing graph algorithms on pregel-like systems. Proc. VLDB Endow. 7(7), 577–588 (2014). https://doi.org/10.14778/2732286.2732294

    Article  Google Scholar 

  75. Sanders, P., Schulz, C.: Engineering multilevel graph partitioning algorithms. In: Algorithms—ESA 2011. Springer, Berlin, pp. 469–480 (2011)

    Chapter  Google Scholar 

  76. Shang, Z., Yu, J.X.: Catch the wind: Graph workload balancing on cloud. In: 2013 IEEE 29th International Conference on Data Engineering (ICDE), pp. 553–564 (2013). https://doi.org/10.1109/ICDE.2013.6544855

  77. Shi, Z., Li, J., Guo, P., Li, S., Feng, D., Su, Y.: Partitioning dynamic graph asynchronously with distributed fennel. Future Gen. Comput. Syst. 71, 32–42 (2017). https://doi.org/10.1016/j.future.2017.01.014

    Article  Google Scholar 

  78. Slota, G.M., Madduri, K., Rajamanickam, S.: PuLP: scalable multi-objective multi-constraint partitioning for small-world networks. In: 2014 IEEE International Conference on Big Data (Big Data), pp. 481–490 (2014). https://doi.org/10.1109/BigData.2014.7004265

  79. Stanton, I.: Streaming balanced graph partitioning algorithms for random graphs. In: Proceedings of the Twenty-Fifth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA ’14), pp. 1287–1301 (2014). https://doi.org/10.1137/1.9781611973402.95

  80. Stanton, I., Kliot, G.: Streaming graph partitioning for large distributed graphs. In: Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD ’12), pp. 1222–1230. ACM, New York (2012). https://doi.org/10.1145/2339530.2339722

  81. Sundaram, N., Satish, N., Patwary, M.M.A., Dulloor, S.R., Anderson, M.J., Vadlamudi, S.G., Das, D., Dubey, P.: GraphMat: high performance graph analytics made productive. Proc. VLDB Endow. 8(11), 1214–1225 (2015). https://doi.org/10.14778/2809974.2809983

    Article  Google Scholar 

  82. Suri, S., Vassilvitskii, S.: Counting triangles and the curse of the last reducer. In: Proceedings of the 20th International Conference on World Wide Web (WWW ’11), pp. 607–614. ACM, New York (2011). https://doi.org/10.1145/1963405.1963491

  83. Tatarowicz, A.L., Curino, C., Jones, E.P.C., Madden, S.: Lookup tables: fine-grained partitioning for distributed databases. In: 2012 IEEE 28th International Conference on Data Engineering, pp. 102–113. (2012). https://doi.org/10.1109/ICDE.2012.26

  84. Tsourakakis, C.: Streaming graph partitioning in the planted partition model. In: Proceedings of the 2015 ACM on Conference on Online Social Networks (COSN ’15), pp. 27–35. ACM, New York (2015). https://doi.org/10.1145/2817946.2817950

  85. Tsourakakis, C., Gkantsidis, C., Radunovic, B., Vojnovic, M.: Fennel: streaming graph partitioning for massive scale graphs. In: Proceedings of the 7th ACM International Conference on Web Search and Data Mining (WSDM ’14), pp. 333–342. ACM, New York (2014). https://doi.org/10.1145/2556195.2556213

  86. Ugander, J., Backstrom, L.: Balanced label propagation for partitioning massive graphs. In: Proceedings of the Sixth ACM International Conference on Web Search and Data Mining (WSDM ’13), pp. 507–516. ACM, New York (2013). https://doi.org/10.1145/2433396.2433461

  87. Valiant, L.G.: A bridging model for parallel computation. Commun. ACM 33(8), 103–111 (1990). https://doi.org/10.1145/79173.79181

    Article  Google Scholar 

  88. Vaquero, L.M., Cuadrado, F., Logothetis, D., Martella, C.: xDGP: a dynamic graph processing system with adaptive partitioning (2013). arXiv:1309.1049

  89. Verma, S., Leslie, L.M., Shin, Y., Gupta, I.: An experimental comparison of partitioning strategies in distributed graph processing. Proc. VLDB Endow. 10(5), 493–504 (2017). https://doi.org/10.14778/3055540.3055543

    Article  Google Scholar 

  90. Wang, L., Xiao, Y., Shao, B., Wang, H.: How to partition a billion-node graph. In: 2014 IEEE 30th International Conference on Data Engineering, pp. 568–579 (2014). https://doi.org/10.1109/ICDE.2014.6816682

  91. Watts, D.J., Strogatz, S.H.: Collective dynamics of ‘small-world’ networks. Nature 393(6684), 440–442 (1998). https://doi.org/10.1038/30918

    Article  MATH  Google Scholar 

  92. White, T.: Hadoop: The Definitive Guide. O’Reilly Media, Boston (2012)

    Google Scholar 

  93. Wilson, C., Boe, B., Sala, A., Puttaswamy, K.P., Zhao, B.Y.: User interactions in social networks and their implications. In: Proceedings of the 4th ACM European Conference on Computer Systems (EuroSys ’09), pp. 205–218. ACM, New York (2009). https://doi.org/10.1145/1519065.1519089

  94. Wu, M., Yang, F., Xue, J., Xiao, W., Miao, Y., Wei, L., Lin, H., Dai, Y., Zhou, L.: GraM: Scaling graph computation to the trillions. In: Proceedings of the Sixth ACM Symposium on Cloud Computing (SoCC ’15), pp. 408–421. ACM, New York (2015). https://doi.org/10.1145/2806777.2806849

  95. Xiao, W., Xue, J., Miao, Y., Li, Z., Chen, C., Wu, M., Li, W., Zhou, L.: Tux2: Distributed graph computation for machine learning. In: Proceedings of the 14th USENIX Symposium on Networked Systems Design and Implementation (NSDI 17), pp. 669–682 (2017)

  96. Xiao-Shu, W., Yao, X., Huan, L.: Cloud computing oriented retrieval technology based on big data. In: 2015 IEEE Seventh International Conference on Measuring Technology and Mechatronics Automation, pp. 275–278 (2015). https://doi.org/10.1109/ICMTMA.2015.73

  97. Xie, C., Li, W.J., Zhang, Z.: S-PowerGraph: streaming graph partitioning for natural graphs by vertex-cut (2015). arXiv:1511.02586

  98. Xin, R.S., Gonzalez, J.E., Franklin, M.J., Stoica, I.: GraphX: a resilient distributed graph system on spark. In: First International Workshop on Graph Data Management Experiences and Systems (GRADES ’13), pp. 2:1–2:6. ACM, New York (2013). https://doi.org/10.1145/2484425.2484427

  99. Xu, N., Chen, L., Cui, B.: LogGP: a log-based dynamic graph partitioning method. Proc. VLDB Endow. 7(14), 1917–1928 (2014)

    Article  Google Scholar 

  100. Xu, N., Cui, B., Chen, L., Huang, Z., Shao, Y.: Heterogeneous environment aware streaming graph partitioning. IEEE Trans. Knowl. Data Eng. 27(6), 1560–1572 (2015)

    Article  Google Scholar 

  101. Yan, D., Cheng, J., Lu, Y., Ng, W.: Effective techniques for message reduction and load balancing in distributed graph computation. In: Proceedings of the 24th International Conference on World Wide Web (WWW ’15), pp. 1307–1317. International World Wide Web Conferences Steering Committee, Republic and Canton of Geneva, Switzerland (2015). https://doi.org/10.1145/2736277.2741096

  102. Yang, S., Yan, X., Zong, B., Khan, A.: Towards effective partition management for large graphs. In: Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data (SIGMOD ’12), pp. 517–528. ACM, New York (2012). https://doi.org/10.1145/2213836.2213895

  103. Zaharia, M., Chowdhury, M., Das, T., Dave, A., Ma, J., McCauley, M., Franklin, M.J., Shenker, S., Stoica, I.: Resilient distributed datasets: a fault-tolerant abstraction for in-memory cluster computing. In: Proceedings of the 9th USENIX Conference on Networked Systems Design and Implementation (NSDI’12), pp. 2–2. USENIX Association, Berkeley (2012)

  104. Zeng, Z., Wu, B., Wang, H.: A parallel graph partitioning algorithm to speed up the large-scale distributed graph mining. In: Proceedings of the 1st International Workshop on Big Data, Streams and Heterogeneous Source Mining: Algorithms, Systems, Programming Models and Applications (BigMine ’12), pp. 61–68. ACM, New York (2012). https://doi.org/10.1145/2351316.2351325

  105. Zhao, Y., Yoshigoe, K., Xie, M., Zhou, S., Seker, R., Bian, J.: LightGraph: lighten communication in distributed graph-parallel processing. In: 2014 IEEE International Congress on Big Data, pp. 717–724 (2014). https://doi.org/10.1109/BigData.Congress.2014.106

  106. Zheng, A., Labrinidis, A., Chrysanthis, P.K.: Architecture-aware graph repartitioning for data-intensive scientific computing. In: 2014 IEEE International Conference on Big Data (Big Data), pp. 78–85 (2014). https://doi.org/10.1109/BigData.2014.7004375

  107. Zheng, A., Labrinidis, A., Chrysanthis, P.K.: Planar: Parallel lightweight architecture-aware adaptive graph repartitioning. In: 2016 IEEE 32nd International Conference on Data Engineering (ICDE), pp. 121–132 (2016a). https://doi.org/10.1109/ICDE.2016.7498234

  108. Zheng, A., Labrinidis, A., Pisciuneri, P.H., Chrysanthis, P.K., Givi, P.: Paragon: parallel architecture-aware graph partition refinement algorithm. In: EDBT, pp. 365–376 (2016b)

  109. Zheng, A., Labrinidis, A., Faloutsos, C.: Skew-resistant graph partitioning. In: 2017 IEEE 33rd International Conference on Data Engineering (ICDE), pp. 151–154 (2017). https://doi.org/10.1109/ICDE.2017.62

  110. Zhu, X., Ghahramani, Z.: Learning from labeled and unlabeled data with label propagation. Tech. Rep., Citeseer (2002)

  111. Zhu, X., Chen, W., Zheng, W., Ma, X.: Gemini: A computation-centric distributed graph processing system. In: Proceedings of the 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16), pp. 301–316 (2016)

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Afsaneh Fatemi.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Mazaheri Soudani, N., Fatemi, A. & Nematbakhsh, M. An investigation of big graph partitioning methods for distribution of graphs in vertex-centric systems. Distrib Parallel Databases 38, 1–29 (2020). https://doi.org/10.1007/s10619-019-07256-z

Download citation

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s10619-019-07256-z

Keywords

Navigation