HUS-Graph: I/O-Efficient Out-of-Core Graph Processing with Hybrid Update Strategy

Authors:

Yongxuan Zhang

ICPP '18: Proceedings of the 47th International Conference on Parallel Processing

Article No.: 3, Pages 1 - 10

https://doi.org/10.1145/3225058.3225108

Published: 13 August 2018

Abstract

In recent years, a number of out-of-core graph processing systems have been proposed to process graphs with billions of edges on a single commodity computer, owing to their high cost efficiency. To obtain better performance, these systems adopt a full I/O model that accesses all edges during the computation, avoiding the inefficiency of random I/O. Although this model ensures good I/O access locality, it loads a large number of useless edges when running graph algorithms that require only a small portion of the edges in each iteration. A natural solution is the on-demand I/O model, which accesses only the active edges. However, this model works well only for graph algorithms with very few active edges, because the I/O cost grows rapidly as the number of active edges increases, owing to the larger volume of random I/O.
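The trade-off between the two I/O models can be illustrated with a toy cost model. The sketch below is not taken from HUS-Graph: the function names, the 8-byte edge size, and the sequential and random bandwidth figures are hypothetical values, chosen only to show why on-demand access wins when few edges are active and loses as the active set grows.

```python
# Toy I/O cost model (illustrative assumptions only, not the paper's model).
# Assumption: sequential reads are much faster per byte than random reads.


def full_io_cost(total_edges, edge_bytes, seq_bandwidth):
    """Seconds needed to stream every edge sequentially."""
    return total_edges * edge_bytes / seq_bandwidth


def on_demand_io_cost(active_edges, edge_bytes, rand_bandwidth):
    """Seconds needed to fetch only the active edges with random reads."""
    return active_edges * edge_bytes / rand_bandwidth


def cheaper_model(total_edges, active_edges, edge_bytes=8,
                  seq_bandwidth=500e6, rand_bandwidth=20e6):
    """Return the name of the model with the lower estimated I/O time.

    The default bandwidths (bytes per second) are hypothetical.
    """
    full = full_io_cost(total_edges, edge_bytes, seq_bandwidth)
    on_demand = on_demand_io_cost(active_edges, edge_bytes, rand_bandwidth)
    return "on-demand" if on_demand < full else "full"


if __name__ == "__main__":
    total = 1_000_000_000
    for active in (1_000_000, 10_000_000, 100_000_000, 500_000_000):
        print(active, cheaper_model(total, active))
```

Under these assumed bandwidths the break-even point is the ratio of random to sequential bandwidth, here 4% of the edges; a real system would need measured device characteristics rather than constants.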

In this paper, we present HUS-Graph, an efficient out-of-core graph processing system that addresses the above I/O issues and achieves a good balance between I/O amount and I/O access locality. HUS-Graph adopts a hybrid update strategy comprising two update models, Row-oriented Push (ROP) and Column-oriented Pull (COP). Using an I/O-based performance prediction method, it adaptively selects the better update model for graph algorithms with different computation and I/O features. Furthermore, HUS-Graph proposes a dual-block representation to organize graph data, which ensures good access locality. Extensive experimental results show that HUS-Graph outperforms existing out-of-core systems by 1.4x-23.1x.
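To make the push and pull terminology concrete, the following minimal sketch runs breadth-first search over an in-memory edge list in either direction. It assumes the common meaning of the terms (push: active vertices write to their out-neighbours; pull: every unvisited vertex reads from its in-neighbours). It does not reproduce HUS-Graph's ROP and COP implementations, its dual-block representation, or its prediction method, and all function names are hypothetical.

```python
# Push versus pull updates on a tiny graph (generic sketch, not HUS-Graph code).


def build_adjacency(edges):
    """Build out-edge and in-edge lists from (src, dst) pairs."""
    out_edges, in_edges = {}, {}
    for src, dst in edges:
        out_edges.setdefault(src, []).append(dst)
        in_edges.setdefault(dst, []).append(src)
    return out_edges, in_edges


def push_step(out_edges, frontier, visited):
    """Active vertices push along their out-edges; only active edges are read."""
    next_frontier = set()
    for src in frontier:
        for dst in out_edges.get(src, ()):
            if dst not in visited:
                next_frontier.add(dst)
    return next_frontier


def pull_step(in_edges, vertices, frontier, visited):
    """Every unvisited vertex scans its in-edges for an active source."""
    next_frontier = set()
    for dst in vertices:
        if dst in visited:
            continue
        if any(src in frontier for src in in_edges.get(dst, ())):
            next_frontier.add(dst)
    return next_frontier


def bfs_levels(edges, source, direction="push"):
    """Return {vertex: BFS depth} using the requested update direction."""
    out_edges, in_edges = build_adjacency(edges)
    vertices = {v for edge in edges for v in edge} | {source}
    levels = {source: 0}
    frontier = {source}
    depth = 0
    while frontier:
        depth += 1
        visited = set(levels)
        if direction == "push":
            frontier = push_step(out_edges, frontier, visited)
        else:
            frontier = pull_step(in_edges, vertices, frontier, visited)
        for vertex in frontier:
            levels[vertex] = depth
    return levels


if __name__ == "__main__":
    sample = [(0, 1), (0, 2), (1, 3), (2, 3), (3, 4)]
    print(bfs_levels(sample, 0, "push"))
    print(bfs_levels(sample, 0, "pull"))
```

Both directions compute the same result; they differ in which edges they touch per iteration, which is the property a hybrid strategy can exploit when choosing between them.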

Index Terms

HUS-Graph: I/O-Efficient Out-of-Core Graph Processing with Hybrid Update Strategy
1. Hardware
  1. Communication hardware, interfaces and storage
    1. External storage
2. Theory of computation
  1. Design and analysis of algorithms
    1. Graph algorithms analysis

Published In

ICPP '18: Proceedings of the 47th International Conference on Parallel Processing

August 2018

945 pages

ISBN:9781450365109

DOI:10.1145/3225058

Copyright © 2018 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

In-Cooperation

University of Oregon

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

ICPP 2018: 47th International Conference on Parallel Processing

August 13 - 16, 2018

Eugene, OR, USA

Acceptance Rates

ICPP '18 Paper Acceptance Rate: 91 of 313 submissions, 29%

Overall Acceptance Rate: 91 of 313 submissions, 29%

Bibliometrics

Total citations: 6
Total downloads: 198
Downloads (last 12 months): 10
Downloads (last 6 weeks): 0

Reflects downloads up to 14 Feb 2025.

Cited By

Yang T, Yang M, Chang Y. (2024). LeapGraph: A Fully External Graph Processing System on High-Speed SSD. 2024 13th Non-Volatile Memory Systems and Applications Symposium (NVMSA), 1-2. Online publication date: 21-Aug-2024.
https://doi.org/10.1109/NVMSA63038.2024.10693656
Wang J, Li C, Liu Y, Wang T, Mei J, Zhang L, Wang P, Guo M. (2023). Fargraph+: Excavating the Parallelism of Graph Processing Workload on RDMA-based Far Memory System. Journal of Parallel and Distributed Computing. Online publication date: Mar-2023.
https://doi.org/10.1016/j.jpdc.2023.02.015
Liao X, Zhao J, Zhang Y, He B, He L, Jin H, Gu L. (2022). A Structure-Aware Storage Optimization for Out-of-Core Concurrent Graph Processing. IEEE Transactions on Computers 71:7, 1612-1625. Online publication date: 1-Jul-2022.
https://doi.org/10.1109/TC.2021.3098976
Si B, Liang Y, Zhao J, Zhang Y, Liao X, Jin H, Liu H, Gu L. (2022). GGraph: An Efficient Structure-Aware Approach for Iterative Graph Processing. IEEE Transactions on Big Data 8:5, 1182-1194. Online publication date: 1-Oct-2022.
https://doi.org/10.1109/TBDATA.2020.3019641
Wang J, Li C, Wang T, Zhang L, Wang P, Mei J, Guo M. (2022). Excavating the Potential of Graph Workload on RDMA-based Far Memory Architecture. 2022 IEEE International Parallel and Distributed Processing Symposium (IPDPS), 1029-1039. Online publication date: May-2022.
https://doi.org/10.1109/IPDPS53621.2022.00104
Xu X, Wang F, Jiang H, Cheng Y, Feng D, Zhang Y. (2020). A Hybrid Update Strategy for I/O-Efficient Out-of-Core Graph Processing. IEEE Transactions on Parallel and Distributed Systems 31:8, 1767-1782. Online publication date: 1-Aug-2020.
https://doi.org/10.1109/TPDS.2020.2973143
Xu X, Wang F, Jiang H, Cheng Y, Hua Y, Feng D, Zhang Y. (2019). LOSC. Proceedings of the International Symposium on Quality of Service, 1-10. Online publication date: 24-Jun-2019.
https://dl.acm.org/doi/10.1145/3326285.3329069
