ABSTRACT
Distributed graph computing for processing large-scale graph data is widely used in social networks, communication networks, and similar domains. Its foundation is a reasonable partitioning of the large-scale graph across the distributed system. Most proposed graph partitioning algorithms cannot achieve load balance and a minimal number of edge-cuts at the same time. This paper constructs a cost function that measures the efficiency of partitioning a large-scale graph in a distributed system where the graph is dynamically updated in real time. Based on this cost function, an update algorithm is proposed for handling the addition of vertices and edges. Experimental results show that the proposed algorithm significantly reduces both load imbalance and the number of edge-cuts.
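The abstract does not give the form of the cost function or the update rule, so the following is only a minimal sketch of the general idea under stated assumptions: the cost is taken to be the number of edge-cuts plus a weighted load-imbalance term, and a newly added vertex is placed greedily in the partition that yields the lowest cost. The names `edge_cuts`, `load_imbalance`, `cost`, `add_vertex`, and the weight `alpha` are hypothetical and are not taken from the paper.

```python
def edge_cuts(edges, part):
    """Count edges whose two endpoints lie in different partitions."""
    return sum(1 for u, v in edges if part[u] != part[v])


def load_imbalance(part, k):
    """Largest partition size divided by the ideal size (1.0 means balanced)."""
    sizes = [0] * k
    for p in part.values():
        sizes[p] += 1
    ideal = len(part) / k
    return max(sizes) / ideal


def cost(edges, part, k, alpha=1.0):
    """Assumed cost: edge-cuts plus alpha times load imbalance (illustrative only)."""
    return edge_cuts(edges, part) + alpha * load_imbalance(part, k)


def add_vertex(v, neighbors, edges, part, k, alpha=1.0):
    """Greedily place new vertex v in the partition with the lowest resulting cost.

    `edges` and `part` are updated in place; the chosen partition is returned.
    Neighbors that are not yet in the graph are ignored.
    """
    new_edges = [(v, n) for n in neighbors if n in part]
    candidate_edges = edges + new_edges
    best_p, best_c = None, float("inf")
    for p in range(k):
        trial = dict(part)
        trial[v] = p
        c = cost(candidate_edges, trial, k, alpha)
        if c < best_c:
            best_p, best_c = p, c
    part[v] = best_p
    edges.extend(new_edges)
    return best_p
```

Re-evaluating the full cost for every candidate partition is deliberately simple; a practical system would compute the change in cost incrementally from the new vertex's neighbors and the current partition sizes.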
Index Terms
- Large-Scale Dynamic Graph Updating Algorithm in Distributed Computing System