DOI: 10.1145/3318464.3389745

Application Driven Graph Partitioning

Published: 31 May 2020

Abstract

Graph partitioning is crucial to parallel computations on large graphs. The choice of partitioning strategy strongly affects not only the performance of graph algorithms but also their design. For a given algorithm, which partitioning strategy fits it best and improves its parallel execution? Is it possible to develop graph algorithms with partition transparency, such that the algorithms work under different partitions without changes? This paper aims to answer these questions. We propose an application-driven hybrid partitioning strategy that, given a graph algorithm A, learns a cost model for A as a polynomial regression. We develop partitioners that, given the learned cost model, refine an edge-cut or vertex-cut partition into a hybrid partition and reduce the parallel cost of A. Moreover, we identify a general condition under which graph-centric algorithms are partition transparent, and we show that a number of graph algorithms can be made partition transparent. Using real-life and synthetic graphs, we experimentally verify that our partitioning strategy improves the performance of a variety of graph computations by up to 22.5 times.
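To make the idea of learning a cost model "as polynomial regression" concrete, the following is a minimal sketch and not the paper's implementation. It assumes three hypothetical per-fragment features (local vertex count, local edge count, replicated border vertices) and a synthetic running-time signal; the feature names, the degree-2 choice, and the helper functions `poly_features`, `fit_cost_model`, and `predict_cost` are all illustrative assumptions. It fits the model with ordinary least squares in NumPy.

```python
import itertools
import numpy as np

def poly_features(X, degree=2):
    """Expand each row of X into all monomials of total degree <= degree."""
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    cols = [np.ones(n)]  # bias term
    for deg in range(1, degree + 1):
        for idx in itertools.combinations_with_replacement(range(d), deg):
            cols.append(np.prod(X[:, list(idx)], axis=1))
    return np.column_stack(cols)

def fit_cost_model(X, y, degree=2):
    """Least-squares fit of polynomial coefficients to observed costs."""
    Phi = poly_features(X, degree)
    coef, *_ = np.linalg.lstsq(Phi, np.asarray(y, dtype=float), rcond=None)
    return coef

def predict_cost(coef, X, degree=2):
    """Predicted cost of each fragment under the fitted polynomial."""
    return poly_features(X, degree) @ coef

# Hypothetical training data: one row per fragment observed in past runs.
# Columns (assumed): local vertices, local edges, replicated border vertices.
rng = np.random.default_rng(0)
X_train = rng.uniform(0.0, 10.0, size=(200, 3))
# Synthetic "measured" cost, used only to exercise the code.
y_train = (0.5 * X_train[:, 1]
           + 0.2 * X_train[:, 1] * X_train[:, 2]
           + 3.0 * X_train[:, 2])

coef = fit_cost_model(X_train, y_train)

# Compare two candidate placements of the same workload: the parallel cost
# is taken here as the cost of the slowest fragment (an assumption).
candidate_a = np.array([[5.0, 8.0, 1.0], [5.0, 2.0, 1.0]])
candidate_b = np.array([[5.0, 5.0, 1.5], [5.0, 5.0, 1.5]])
cost_a = predict_cost(coef, candidate_a).max()
cost_b = predict_cost(coef, candidate_b).max()
print("prefer candidate", "a" if cost_a < cost_b else "b")
```

A partitioner in this spirit would query such a model when deciding whether moving or replicating a vertex lowers the predicted cost of the most expensive fragment; how the paper's partitioners actually use the learned model is described in the full text, not in this abstract.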

Supplementary Material

MP4 File (3318464.3389745.mp4)
Presentation Video

Published In

SIGMOD '20: Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data
June 2020
2925 pages
ISBN: 9781450367356
DOI: 10.1145/3318464
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. graph partition
  2. machine learning
  3. partition transparency

Qualifiers

  • Research-article

Conference

SIGMOD/PODS '20
Acceptance Rates

Overall Acceptance Rate 785 of 4,003 submissions, 20%

Article Metrics

  • Downloads (Last 12 months): 157
  • Downloads (Last 6 weeks): 9

Reflects downloads up to 20 Feb 2025

Cited By

  • (2025) Efficient Partitioning Algorithms for Optimizing Big Graph Computation. Computing and Combinatorics (202-213). DOI: 10.1007/978-981-96-1093-8_17. Online publication date: 20-Feb-2025
  • (2024) Algorithm Based on Morphological Operators for Shortness Path Planning. Algorithms 17:5 (184). DOI: 10.3390/a17050184. Online publication date: 29-Apr-2024
  • (2024) Improving Graph Compression for Efficient Resource-Constrained Graph Analytics. Proceedings of the VLDB Endowment 17:9 (2212-2226). DOI: 10.14778/3665844.3665852. Online publication date: 1-May-2024
  • (2024) A Unified Graph Framework for Storage-Compute Coupled Cluster and High-Density Computing Cluster. Proceedings of the International Workshop on Big Data in Emergent Distributed Environments (1-6). DOI: 10.1145/3663741.3664790. Online publication date: 9-Jun-2024
  • (2024) Distributed Graph Neural Network Training: A Survey. ACM Computing Surveys 56:8 (1-39). DOI: 10.1145/3648358. Online publication date: 10-Apr-2024
  • (2024) HRCM: A Hierarchical Regularizing Mechanism for Sparse and Imbalanced Communication in Whole Human Brain Simulations. IEEE Transactions on Parallel and Distributed Systems 35:6 (1056-1073). DOI: 10.1109/TPDS.2024.3387720. Online publication date: Jun-2024
  • (2024) DKWS: A Distributed System for Keyword Search on Massive Graphs. IEEE Transactions on Knowledge and Data Engineering (1-16). DOI: 10.1109/TKDE.2023.3313726. Online publication date: 2024
  • (2024) Robust Regularized Locality Preserving Indexing for Fiedler Vector Estimation. IEEE Open Journal of Signal Processing 5 (867-885). DOI: 10.1109/OJSP.2024.3400683. Online publication date: 2024
  • (2024) Minimum motif-cut: a workload-aware RDF graph partitioning strategy. The VLDB Journal 33:5 (1517-1542). DOI: 10.1007/s00778-024-00860-1. Online publication date: 8-Jul-2024
  • (2024) Locality Sensitive Hashing for Data Placement to Optimize Parallel Subgraph Query Evaluation. Web and Big Data (32-47). DOI: 10.1007/978-981-97-2303-4_3. Online publication date: 29-May-2024