Abstract
Graph partitioning is an effective way to improve the performance of parallel computing, and research on it is driven by the demands of practical applications. In this paper, we propose a heterogeneity-aware streaming graph partitioning algorithm for distributed heterogeneous computing environments. It considers not only the differences in network bandwidth and in the computing ability of compute nodes, but also the competition for shared resources between cores in a high-speed network. Taking breadth-first search, single-source shortest path, and PageRank as examples, the algorithm improves the efficiency of graph computing by 38%, 45.7%, and 61.8% on average, respectively, compared with streaming algorithms that do not consider the heterogeneous environment. In addition, because streaming graph partitioning based on the adjacent-vertex structure is inefficient, we design a cache management mechanism based on the adjacent-edge structure, which effectively improves partitioning efficiency. The experimental results show that our method is suitable for graph vertex assignment in distributed heterogeneous cluster environments.
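To make the idea of streaming vertex assignment under unequal node capabilities concrete, the following is a minimal sketch and not the algorithm proposed in this paper. It uses the well-known linear deterministic greedy (LDG) heuristic and simply gives each partition its own capacity. The function name `stream_partition` and the use of a per-partition vertex capacity as a stand-in for compute ability are assumptions made for illustration; network bandwidth, shared-resource contention, and the cache management mechanism are not modelled.

```python
from collections import defaultdict


def stream_partition(vertex_stream, capacities):
    """Assign each streamed vertex to one partition, LDG style.

    vertex_stream: iterable of (vertex, neighbors) pairs in arrival order.
    capacities: per-partition vertex capacity. This is a hypothetical
        proxy for compute ability: a larger value means a stronger node.
    """
    assignment = {}                    # vertex -> partition index
    sizes = [0] * len(capacities)      # vertices placed per partition

    for vertex, neighbors in vertex_stream:
        # Count the already-placed neighbors in each partition.
        placed = defaultdict(int)
        for n in neighbors:
            if n in assignment:
                placed[assignment[n]] += 1

        best, best_key = None, None
        for p, cap in enumerate(capacities):
            if sizes[p] >= cap:
                continue               # partition is full
            load = sizes[p] / cap      # relative fullness in [0, 1)
            # LDG score: neighbor count damped by relative fullness.
            score = placed[p] * (1.0 - load)
            # Break ties in favor of the relatively emptier partition.
            key = (score, -load)
            if best_key is None or key > best_key:
                best, best_key = p, key

        if best is None:
            raise ValueError("total capacity exceeded")
        assignment[vertex] = best
        sizes[best] += 1

    return assignment
```

Because the damping term uses relative rather than absolute fullness, a partition with a larger capacity keeps attracting vertices for longer, which is one simple way to bias placement toward more capable nodes.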
Acknowledgements
This work was supported in part by the Natural Science Foundation of Zhejiang Province (No. LY18F020002).
About this article
Cite this article
Zhong, Y., Huang, C. & Zhou, Q. HaSGP: an effective graph partition method for heterogeneous-aware. Computing 105, 455–481 (2023). https://doi.org/10.1007/s00607-022-01132-y
DOI: https://doi.org/10.1007/s00607-022-01132-y