Magas: matrix-based asynchronous graph analytics on shared memory systems

Luo, Le; Liu, Yi; Yang, Hailong; Qian, Depei

doi:10.1007/s11227-021-04091-x

Published: 01 October 2021

Volume 78, pages 5650–5680, (2022)

The Journal of Supercomputing

Le Luo¹,
Yi Liu²,
Hailong Yang ORCID: orcid.org/0000-0003-1101-7927² &
…
Depei Qian²

Abstract

Graph analytics plays an important role in many areas, such as big data and artificial intelligence. The vertex-centric programming model provides friendly interfaces to programmers and is used extensively in graph processing frameworks. However, because programs are executed and scheduled vertex by vertex in the backend, it tends to generate many irregular memory accesses and to incur scheduling overhead. The matrix-based model takes a different approach: it uses high-performance matrix operations in the backend to improve the efficiency of graph processing. Unfortunately, current matrix-based frameworks support only the synchronous parallel model, which limits the range of graph algorithms they can serve. To address these problems, this paper proposes a graph processing framework that combines matrix operations with the asynchronous model while providing friendly programming interfaces similar to those of the vertex-centric programming model. First, we propose an approach to map vertex-based graph processing to matrix operations in the asynchronous model. Then, we propose two asynchronous scheduling policies, the Gauss–Seidel policy and the relaxed Gauss–Seidel policy, for different graph algorithms. Finally, our framework applies batch scheduling and an optimized in-memory data structure to reduce the scheduling overhead introduced by the asynchronous model. Experimental results show that our framework achieves better performance and speedup than popular vertex programming frameworks such as GraphLab and GRACE, and achieves performance similar to that of BSP-based matrix frameworks such as GraphMat.
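To make the difference between the synchronous and the asynchronous model concrete, the following minimal sketch runs PageRank on a tiny graph in two ways. It is a generic textbook illustration, not the framework's implementation or API: the function name `pagerank`, its parameters, and the example graph are all assumptions made for this sketch. In the synchronous (BSP-style) sweep, every vertex reads values from a frozen snapshot of the previous sweep, which is equivalent to one sparse matrix-vector product. In the Gauss-Seidel sweep, every vertex reads the newest values available, including those written earlier in the same sweep.

```python
def pagerank(edges, n, damping=0.85, tol=1e-10, gauss_seidel=False, max_sweeps=1000):
    """Return (ranks, sweeps) for a directed graph given as (src, dst) pairs.

    Hypothetical helper written for this illustration only.
    """
    out_deg = [0] * n
    in_nbrs = [[] for _ in range(n)]
    for s, d in edges:
        out_deg[s] += 1
        in_nbrs[d].append(s)

    rank = [1.0 / n] * n
    for sweep in range(1, max_sweeps + 1):
        # Synchronous: read from a snapshot taken before the sweep.
        # Gauss-Seidel: read from the vector that is being updated in place.
        src = rank if gauss_seidel else list(rank)
        delta = 0.0
        for v in range(n):
            new = (1.0 - damping) / n + damping * sum(
                src[u] / out_deg[u] for u in in_nbrs[v]
            )
            delta = max(delta, abs(new - rank[v]))
            rank[v] = new
        if delta < tol:
            return rank, sweep
    return rank, max_sweeps


# Small example graph in which every vertex has at least one out-edge.
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
sync_rank, sync_sweeps = pagerank(edges, 4)
gs_rank, gs_sweeps = pagerank(edges, 4, gauss_seidel=True)
print("synchronous :", sync_sweeps, "sweeps")
print("Gauss-Seidel:", gs_sweeps, "sweeps")
```

Both variants converge to the same ranks. The Gauss-Seidel variant often needs fewer sweeps because updates propagate within a sweep, but its result at any intermediate point depends on the order in which vertices are visited, which is why scheduling matters in the asynchronous model.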

Acknowledgements

The authors would like to thank all anonymous reviewers for their insightful comments and suggestions. This work has been supported by the National Key R&D Program of China (Grant No. 2020YFB1506703), the National Natural Science Foundation of China (Grant Nos. 62072018 and 61732002), and the Natural Science Foundation of Anhui Province, China (Grant No. 2108085QF265). Hailong Yang is the corresponding author.

Author information

Authors and Affiliations

School of Computer and Information, Anhui Normal University, Wuhu, 241002, Anhui, China
Le Luo
School of Computer Science and Engineering, Beihang University, Beijing, 100191, China
Yi Liu, Hailong Yang & Depei Qian

Corresponding author

Correspondence to Hailong Yang.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Luo, L., Liu, Y., Yang, H. et al. Magas: matrix-based asynchronous graph analytics on shared memory systems. J Supercomput 78, 5650–5680 (2022). https://doi.org/10.1007/s11227-021-04091-x

Accepted: 15 September 2021
Published: 01 October 2021
Issue Date: March 2022
DOI: https://doi.org/10.1007/s11227-021-04091-x
