Abstract
We propose Neighbor Expansion on Power-law Graphs (NEPG), a distributed graph partitioning method tailored to power-law graphs that offers both good scalability and high partitioning quality. NEPG builds on a heuristic method, Neighbor Expansion, which constructs the partitions by greedily expanding from randomly selected vertices. NEPG improves the partitioning quality by selecting these vertices according to the properties of the power-law graph instead. We give a theoretical proof that NEPG attains a higher upper bound on partitioning quality. The empirical evaluation shows that, compared with state-of-the-art distributed graph partitioning algorithms, NEPG significantly improves partitioning quality while reducing graph construction time. The performance evaluation shows that the proposed method is also more time-efficient than the existing algorithms.
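To make the expansion idea concrete, the following is a minimal single-machine sketch of greedy neighbor expansion for edge partitioning, not the authors' distributed implementation. The abstract does not state NEPG's vertex selection rule, so `seed_by_lowest_degree` is a hypothetical stand-in used only to show where such a rule would plug in. All function names are assumptions, as is the use of the replication factor (a common metric for edge partitioning) as the quality measure. Vertices are assumed to be integers in a simple undirected graph.

```python
import random
from collections import defaultdict


def seed_at_random(candidates, adj, rng):
    """Baseline rule: start expanding from a randomly chosen vertex."""
    return rng.choice(sorted(candidates))


def seed_by_lowest_degree(candidates, adj, rng):
    """Hypothetical degree-aware rule (an assumption, not NEPG's rule)."""
    return min(sorted(candidates), key=lambda v: len(adj[v]))


def neighbor_expansion(edges, num_parts, pick_seed=seed_at_random, seed=0):
    """Assign every edge to exactly one of num_parts partitions."""
    rng = random.Random(seed)
    edge_set = {tuple(sorted(e)) for e in edges}
    adj = defaultdict(set)  # adjacency over still-unassigned edges
    for u, v in edge_set:
        adj[u].add(v)
        adj[v].add(u)
    total = len(edge_set)
    capacity = -(-total // num_parts)  # ceil(total / num_parts)
    parts = []
    for p in range(num_parts):
        part, core, boundary = [], set(), set()
        # The last partition absorbs whatever is left.
        limit = capacity if p < num_parts - 1 else total
        while len(part) < limit and any(adj.values()):
            frontier = boundary - core
            if frontier:
                # Expand from the boundary vertex that would pull in the
                # fewest new vertices.
                x = min(sorted(frontier), key=lambda v: len(adj[v] - boundary))
            else:
                candidates = {v for v, nbrs in adj.items() if nbrs}
                x = pick_seed(candidates, adj, rng)
            core.add(x)
            boundary.add(x)
            for y in sorted(adj[x]):
                boundary.add(y)
                # Claim every unassigned edge between y and the boundary.
                for z in sorted(adj[y] & boundary):
                    if len(part) >= limit:
                        break
                    part.append(tuple(sorted((y, z))))
                    adj[y].discard(z)
                    adj[z].discard(y)
        parts.append(part)
    return parts


def replication_factor(parts):
    """Average number of partitions in which each vertex appears."""
    vertices = {v for part in parts for e in part for v in e}
    replicas = sum(len({v for e in part for v in e}) for part in parts)
    return replicas / len(vertices)


# A small skewed graph: one hub (vertex 0) plus a few peripheral edges.
edges = [(0, i) for i in range(1, 7)] + [(1, 2), (3, 4), (5, 6)]
random_parts = neighbor_expansion(edges, 3, seed_at_random)
degree_parts = neighbor_expansion(edges, 3, seed_by_lowest_degree)
print(replication_factor(random_parts), replication_factor(degree_parts))
```

A lower replication factor means fewer vertex copies across partitions. Swapping the `pick_seed` argument is the only change needed to compare selection rules on the same graph.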
Acknowledgement
This work is supported in part by the National Key R&D Program of China (Grant No. 2018YFB0204300), the Excellent Youth Foundation of Hunan Province (Dezun Dong), and the National Postdoctoral Program for Innovative Talents (Grant No. BX20190091).
Copyright information
© 2022 Springer Nature Switzerland AG
About this paper
Cite this paper
Si, J., Gan, X., Bai, H., Dong, D., Pang, Z. (2022). NEPG: Partitioning Large-Scale Power-Law Graphs. In: Lai, Y., Wang, T., Jiang, M., Xu, G., Liang, W., Castiglione, A. (eds) Algorithms and Architectures for Parallel Processing. ICA3PP 2021. Lecture Notes in Computer Science, vol 13157. Springer, Cham. https://doi.org/10.1007/978-3-030-95391-1_42
DOI: https://doi.org/10.1007/978-3-030-95391-1_42
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-95390-4
Online ISBN: 978-3-030-95391-1
eBook Packages: Computer Science, Computer Science (R0)