Parallel Graph Processing

Encyclopedia of Big Data Technologies

Definition

The term parallel graph processing refers to the use of multiple cores to process a graph in order to (1) speed up processing and (2) scale to larger graphs. The environment can be either (1) a stand-alone machine running multiple threads or (2) a distributed cluster of machines (i.e., a shared-nothing architecture).
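
As a rough illustration of the stand-alone, multi-threaded setting, the sketch below splits an edge list into partitions, counts out-degrees per partition in separate threads, and merges the partial results. The function names, the round-robin partitioning, and the toy edge list are all assumptions made for this example; they are not taken from any system discussed in this entry.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def count_degrees(edges):
    """Count the out-degree of each source vertex in one edge partition."""
    return Counter(src for src, _dst in edges)


def parallel_out_degrees(edges, num_workers=4):
    """Partition the edge list, process partitions concurrently, then merge."""
    # Round-robin partitioning: an illustrative choice, not a recommendation.
    partitions = [edges[i::num_workers] for i in range(num_workers)]
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partials = list(pool.map(count_degrees, partitions))
    # Merge the per-partition counts into a single table.
    total = Counter()
    for partial in partials:
        total.update(partial)
    return dict(total)


# Hypothetical toy graph given as (source, destination) pairs.
edges = [(0, 1), (0, 2), (1, 2), (2, 0), (2, 3)]
degrees = parallel_out_degrees(edges, num_workers=2)
```

The sketch shows only the partition, process, and merge structure. In standard CPython, threads typically do not speed up CPU-bound work of this kind because of the global interpreter lock, so a practical implementation would likely use processes or a lower-level language.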

Overview

Modern big graph processing systems place emphasis on two aspects:

  1. user-friendliness: the programming interface should be designed based on an intuitive computation model, to allow algorithm developers to focus on the graph analytics logic without touching low-level execution details (e.g., network communication);

  2. efficiency: the underlying execution engine should guarantee high-throughput execution and automatically support fault tolerance and horizontal/vertical scaling.
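
To make the first aspect concrete, here is a minimal sketch of one such computation model, the vertex-centric style popularized by Pregel (Malewicz et al. 2010). The developer writes only a small per-vertex function, and the engine handles message delivery and iteration. The names `min_label_compute` and `run_supersteps` are hypothetical, and the engine is a single-process toy that ignores distribution, fault tolerance, and scaling.

```python
def min_label_compute(value, messages):
    """User-written logic: adopt the smallest label seen so far."""
    return min([value, *messages])


def run_supersteps(adjacency, compute):
    """Toy engine: delivers messages in synchronous rounds until no value changes."""
    # Every vertex starts with its own ID as its label.
    values = {v: v for v in adjacency}
    active = set(adjacency)
    while active:
        # Message-passing phase: active vertices send their label to neighbors.
        outbox = {v: [] for v in adjacency}
        for v in active:
            for nbr in adjacency[v]:
                outbox[nbr].append(values[v])
        # Compute phase: vertices with incoming messages run the user function.
        active = set()
        for v, msgs in outbox.items():
            if msgs:
                new_value = compute(values[v], msgs)
                if new_value != values[v]:
                    values[v] = new_value
                    active.add(v)
    return values


# Hypothetical undirected graph with two connected components.
adjacency = {0: [1], 1: [0, 2], 2: [1], 3: [4], 4: [3]}
labels = run_supersteps(adjacency, min_label_compute)
```

With this input, every vertex ends up labeled with the smallest vertex ID in its connected component. The point of the design is the split of responsibilities: the analytics logic lives entirely in `min_label_compute`, while everything about how messages move stays inside the engine.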

Since comprehensive surveys of this area already exist (Yan et al. 2017a,d), this chapter aims at a succinct and more up-to-date review of the key programming...

References

  • Carbone P, Katsifodimos A, Ewen S, Markl V, Haridi S, Tzoumas K (2015) Apache Flink™: stream and batch processing in a single engine. IEEE Data Eng Bull 38(4):28–38

  • Ching A, Edunov S, Kabiljo M, Logothetis D, Muthukrishnan S (2015) One trillion edges: graph processing at Facebook-scale. PVLDB 8(12):1804–1815

  • Fan W, Xu J, Wu Y, Yu W, Jiang J, Zheng Z, Zhang B, Cao Y, Tian C (2017) Parallelizing sequential graph computations. In: SIGMOD, pp 495–510

  • Gonzalez JE, Low Y, Gu H, Bickson D, Guestrin C (2012) PowerGraph: distributed graph-parallel computation on natural graphs. In: OSDI, pp 17–30

  • Gonzalez JE, Xin RS, Dave A, Crankshaw D, Franklin MJ, Stoica I (2014) GraphX: graph processing in a distributed dataflow framework. In: OSDI, pp 599–613

  • Karypis G, Kumar V (1998) A fast and high quality multilevel scheme for partitioning irregular graphs. SIAM J Sci Comput 20(1):359–392

  • Kyrola A, Blelloch GE, Guestrin C (2012) GraphChi: large-scale graph computation on just a PC. In: OSDI, pp 31–46

  • Liu H, Huang HH (2017) Graphene: fine-grained IO management for graph computing. In: FAST, pp 285–300

  • Lu Y, Cheng J, Yan D, Wu H (2014) Large-scale distributed graph computing systems: an experimental evaluation. PVLDB 8(3):281–292

  • Malewicz G, Austern MH, Bik AJC, Dehnert JC, Horn I, Leiser N, Czajkowski G (2010) Pregel: a system for large-scale graph processing. In: SIGMOD, pp 135–146

  • Quamar A, Deshpande A, Lin JJ (2016) NScale: neighborhood-centric large-scale graph analytics in the cloud. VLDB J 25(2):125–150

  • Quick L, Wilkinson P, Hardcastle D (2012) Using Pregel-like large scale graph processing frameworks for social network analysis. In: International conference on advances in social networks analysis and mining, ASONAM 2012, Istanbul, pp 457–463

  • Roy A, Mihailovic I, Zwaenepoel W (2013) X-Stream: edge-centric graph processing using streaming partitions. In: SOSP, pp 472–488

  • Salihoglu S, Widom J (2013) GPS: a graph processing system. In: SSDBM, pp 22:1–22:12

  • Tian Y, Balmin A, Corsten SA, Tatikonda S, McPherson J (2013) From “think like a vertex” to “think like a graph”. PVLDB 7(3):193–204

  • Yan D, Cheng J, Lu Y, Ng W (2014a) Blogel: a block-centric framework for distributed computation on real-world graphs. PVLDB 7(14):1981–1992

  • Yan D, Cheng J, Xing K, Lu Y, Ng W, Bu Y (2014b) Pregel algorithms for graph connectivity problems with performance guarantees. PVLDB 7(14):1821–1832

  • Yan D, Cheng J, Lu Y, Ng W (2015) Effective techniques for message reduction and load balancing in distributed graph computation. In: WWW, pp 1307–1317

  • Yan D, Bu Y, Tian Y, Deshpande A (2017a) Big graph analytics platforms. Found Trends Databases 7(1–2):1–195. https://doi.org/10.1561/1900000056

  • Yan D, Chen H, Cheng J, Özsu MT, Zhang Q, Lui JCS (2017b) G-thinker: big graph mining made easier and faster. CoRR abs/1709.03110

  • Yan D, Huang Y, Liu M, Chen H, Cheng J, Wu H, Zhang C (2017c) GraphD: distributed vertex-centric graph processing beyond the memory limit. IEEE Trans Parallel Distrib Syst 29(1):99–114

  • Yan D, Tian Y, Cheng J (2017d) Systems for big graph analytics. Springer briefs in computer science. Springer, Cham. https://doi.org/10.1007/978-3-319-58217-7

  • Yan D, Chen H, Cheng J, Cai Z, Shao B (2018) Scalable de novo genome assembly using Pregel. In: ICDE

  • Zhang Y, Gao Q, Gao L, Wang C (2014) Maiter: an asynchronous graph processing framework for delta-based accumulative iterative computation. IEEE Trans Parallel Distrib Syst 25(8):2091–2100

  • Zhou C, Gao J, Sun B, Yu JX (2014) MOCgraph: scalable distributed graph processing using message online computing. PVLDB 8(4):377–388

Author information

Correspondence to Da Yan.

Copyright information

© 2018 Springer International Publishing AG

About this entry

Cite this entry

Yan, D., Liu, H. (2018). Parallel Graph Processing. In: Sakr, S., Zomaya, A. (eds) Encyclopedia of Big Data Technologies. Springer, Cham. https://doi.org/10.1007/978-3-319-63962-8_272-1

  • DOI: https://doi.org/10.1007/978-3-319-63962-8_272-1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-63962-8

  • Online ISBN: 978-3-319-63962-8

  • eBook Packages: Springer Reference Mathematics; Reference Module Computer Science and Engineering

Chapter history

  1. Latest

    Parallel Graph Processing
    Published: 17 March 2022
    DOI: https://doi.org/10.1007/978-3-319-63962-8_272-2

  2. Original

    Parallel Graph Processing
    Published: 01 February 2018
    DOI: https://doi.org/10.1007/978-3-319-63962-8_272-1