IncPregel: an incremental graph parallel computation model

Liu, Qiang; Dong, Xiaoshe; Chen, Heng; Wang, Yinfeng

doi:10.1007/s11704-016-6109-y


Research Article
Published: 19 December 2018

Volume 12, pages 1076–1089, (2018)

Frontiers of Computer Science



Abstract

Large-scale graph computation is often required in a variety of emerging applications such as social network computation and Web services. Such graphs are typically large and frequently updated with minor changes. However, re-computing an entire graph when a few vertices or edges are updated is often prohibitively expensive. To reduce the cost of such updates, this study proposes an incremental graph computation model called IncPregel, which leverages the non-after-effect property of the first-order Markov chain and provides incremental programming abstractions to avoid redundant computation and message communication. This is accomplished by employing an efficient and fine-grained reuse mechanism. We implemented this model on Hama, a popular open source framework based on Pregel, to construct an incremental graph processing system called IncHama. IncHama automatically detects changes in input in order to recognize “changed vertices” and to exchange reusable data by means of shuffling. The evaluation results on large-scale graphs show that, compared with Hama, IncHama is 1.1–2.7 times faster and can reduce communication messages by more than 50% when the incremental edges increase in number from 0.1k to 100k.
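The general idea of reusing earlier results and activating only the "changed vertices" can be illustrated with a small sketch. The code below is not IncPregel or IncHama; it is a toy, single-process imitation of Pregel-style supersteps that uses minimum-label propagation for connected components as a stand-in algorithm. All function names are hypothetical, and the sketch assumes that updates only add edges (deletions would need extra handling).

```python
from collections import defaultdict


def build_adj(vertices, edges):
    """Build an undirected adjacency map from an edge list."""
    adj = {v: set() for v in vertices}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return adj


def run_supersteps(adj, labels, active):
    """Toy Pregel-style loop: min-label propagation until no vertex is active.

    Returns the final labels and the number of messages sent.
    """
    labels = dict(labels)
    active = set(active)
    messages_sent = 0
    while active:
        inbox = defaultdict(list)
        # Each active vertex sends its current label to all neighbours.
        for v in active:
            for u in adj[v]:
                inbox[u].append(labels[v])
                messages_sent += 1
        # Barrier: a vertex stays active only if its label decreased.
        active = set()
        for u, msgs in inbox.items():
            best = min(msgs)
            if best < labels[u]:
                labels[u] = best
                active.add(u)
    return labels, messages_sent


def full_compute(vertices, edges):
    """Recompute from scratch: every vertex starts active with its own id."""
    adj = build_adj(vertices, edges)
    return run_supersteps(adj, {v: v for v in vertices}, set(vertices))


def incremental_compute(vertices, old_edges, old_labels, new_edges):
    """Reuse old labels; activate only the endpoints of newly added edges."""
    adj = build_adj(vertices, list(old_edges) + list(new_edges))
    changed = {v for edge in new_edges for v in edge}
    return run_supersteps(adj, old_labels, changed)


if __name__ == "__main__":
    vertices = list(range(8))
    old_edges = [(0, 1), (1, 2), (2, 3), (4, 5), (5, 6), (6, 7)]
    new_edges = [(3, 4)]

    old_labels, _ = full_compute(vertices, old_edges)
    full_labels, full_msgs = full_compute(vertices, old_edges + new_edges)
    inc_labels, inc_msgs = incremental_compute(
        vertices, old_edges, old_labels, new_edges
    )
    print("same result:", inc_labels == full_labels)
    print("messages (full vs incremental):", full_msgs, inc_msgs)
```

In this toy setting the incremental run reaches the same labels as a full recomputation while sending fewer messages, because vertices unaffected by the new edge never become active. How IncPregel actually detects changes and shuffles reusable data is described in the full article, not here.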



Acknowledgements

This work was supported by the National Natural Science Foundation of China (Grant No. 61572394), National Key Research and Development Program of China (2016YFB1000303), and Shenzhen Scientific Plan (JSGG20140519141854753).

Author information

Authors and Affiliations

School of Electronics and Information Engineering, Xi’an Jiaotong University, Xi’an, 710049, China
Qiang Liu, Xiaoshe Dong & Heng Chen
Shenzhen Institute of Information Technology, Shenzhen, 518172, China
Yinfeng Wang


Corresponding author

Correspondence to Heng Chen.

Additional information

Qiang Liu received his MS degree in software engineering from University of Science and Technology of China, China in 2011. He is currently pursuing his PhD degree in the Department of Computer Science and Technology at Xi’an Jiaotong University, China. His major research interests include graph computation and cloud computing.

Xiaoshe Dong is a professor in the Department of Computer Science and Technology at Xi’an Jiaotong University, China. He received his PhD degree in computer science from Keio University, Japan in 1999. His current research interests are high-performance computing, architectures, parallel programming models, and cloud computing.

Heng Chen received his PhD degree in computer science and technology from Xi’an Jiaotong University (XJTU), China in 2012. He is a lecturer in the Department of Computer Science and Technology of XJTU. His major research interests include energy efficient routing and distributed location service in wireless sensor networks, and cloud computing.

Yinfeng Wang received his PhD degree in computer science from Xi’an Jiaotong University, China in 2007. He is with the Department of Software Engineering, Shenzhen Institute of Information Technology, China. His major research fields are in-memory databases and cloud computing.

Electronic supplementary material

Supplementary material, approximately 282 KB.


About this article

Cite this article

Liu, Q., Dong, X., Chen, H. et al. IncPregel: an incremental graph parallel computation model. Front. Comput. Sci. 12, 1076–1089 (2018). https://doi.org/10.1007/s11704-016-6109-y


Received: 23 February 2016
Accepted: 14 October 2016
Published: 19 December 2018
Issue Date: December 2018
DOI: https://doi.org/10.1007/s11704-016-6109-y
