Research article · DOI: 10.1145/3225058.3225108

HUS-Graph: I/O-Efficient Out-of-Core Graph Processing with Hybrid Update Strategy

Published: 13 August 2018

Abstract

In recent years, a number of out-of-core graph processing systems have been proposed to process graphs with billions of edges on a single commodity computer, owing to their high cost efficiency. To obtain better performance, these systems adopt a full I/O model that accesses all edges during computation to avoid the inefficiency of random I/Os. Although this model ensures good I/O access locality, it loads a large number of useless edges when running graph algorithms that require only a small portion of the edges in each iteration. A natural solution is the on-demand I/O model, which accesses only the active edges. However, this method works well only for graph algorithms with very few active edges, because the I/O cost grows rapidly as the number of active edges increases, owing to the larger amount of random I/O.
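The trade-off between the two I/O models can be made concrete with a back-of-the-envelope cost estimate. The sketch below is illustrative only: the bandwidth figures, the edge size, and the function names are assumptions chosen for the example, not values or interfaces taken from the paper.

```python
# Toy I/O cost model. All constants are illustrative assumptions,
# not measurements and not the paper's cost model.
SEQ_BANDWIDTH = 500e6   # bytes/s for large sequential reads (assumed)
RAND_BANDWIDTH = 50e6   # effective bytes/s for small random reads (assumed)
EDGE_BYTES = 8          # bytes per stored edge (assumed)


def full_io_seconds(total_edges: int) -> float:
    """Full I/O model: stream every edge sequentially, active or not."""
    return total_edges * EDGE_BYTES / SEQ_BANDWIDTH


def on_demand_io_seconds(active_edges: int) -> float:
    """On-demand I/O model: read only active edges, via random access."""
    return active_edges * EDGE_BYTES / RAND_BANDWIDTH


def break_even_fraction() -> float:
    """Fraction of active edges at which both models cost the same."""
    return RAND_BANDWIDTH / SEQ_BANDWIDTH


if __name__ == "__main__":
    total = 1_000_000_000
    for fraction in (0.01, 0.1, 0.5):
        active = int(total * fraction)
        print(fraction, full_io_seconds(total), on_demand_io_seconds(active))
```

Under these assumed numbers, on-demand access wins only while fewer than about one tenth of the edges are active; beyond that point, sequentially streaming everything is cheaper.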
In this paper, we present HUS-Graph, an efficient out-of-core graph processing system that addresses the above I/O issues and achieves a good balance between I/O amount and I/O access locality. HUS-Graph adopts a hybrid update strategy comprising two update models, Row-oriented Push (ROP) and Column-oriented Pull (COP). Using an I/O-based performance prediction method, it adaptively selects the optimal update model for graph algorithms with different computation and I/O features. Furthermore, HUS-Graph uses a dual-block representation to organize graph data, which ensures good access locality. Extensive experimental results show that HUS-Graph outperforms existing out-of-core systems by 1.4x-23.1x.
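As a rough illustration of what a hybrid push/pull update strategy looks like, the sketch below runs an in-memory breadth-first search that chooses a direction in each iteration. It is not HUS-Graph's ROP/COP implementation: the function names are hypothetical, the graph lives entirely in memory, and a fixed `threshold` on the fraction of active edges is an assumed stand-in for the paper's I/O-based performance prediction. Keeping both out-edges and in-edges is likewise only loosely analogous to the dual-block representation.

```python
def build_adjacency(edges):
    """Index edges by source (for push) and by destination (for pull)."""
    out_adj, in_adj = {}, {}
    for src, dst in edges:
        out_adj.setdefault(src, []).append(dst)
        in_adj.setdefault(dst, []).append(src)
    return out_adj, in_adj


def push_step(out_adj, frontier, level, depth):
    """Push: each active vertex propagates along its out-edges."""
    nxt = set()
    for u in frontier:
        for v in out_adj.get(u, ()):
            if v not in level:
                level[v] = depth
                nxt.add(v)
    return nxt


def pull_step(in_adj, vertices, frontier, level, depth):
    """Pull: each unvisited vertex scans its in-edges for an active source."""
    nxt = set()
    for v in vertices:
        if v not in level and any(u in frontier for u in in_adj.get(v, ())):
            level[v] = depth
            nxt.add(v)
    return nxt


def hybrid_bfs(edges, source, threshold=0.1):
    """BFS that pushes when few edges are active and pulls otherwise.

    `threshold` is an assumed heuristic, not the paper's predictor.
    Returns (level per reached vertex, direction chosen per iteration).
    """
    out_adj, in_adj = build_adjacency(edges)
    vertices = {v for edge in edges for v in edge} | {source}
    level, frontier, depth, choices = {source: 0}, {source}, 0, []
    while frontier:
        depth += 1
        active_edges = sum(len(out_adj.get(u, ())) for u in frontier)
        if edges and active_edges / len(edges) > threshold:
            choices.append("pull")
            frontier = pull_step(in_adj, vertices, frontier, level, depth)
        else:
            choices.append("push")
            frontier = push_step(out_adj, frontier, level, depth)
    return level, choices
```

Both directions compute the same BFS levels; only the set of edges touched per iteration differs, which is what makes the choice an I/O decision in an out-of-core setting.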

Published In

ICPP '18: Proceedings of the 47th International Conference on Parallel Processing
August 2018
945 pages
ISBN: 9781450365109
DOI: 10.1145/3225058
In-Cooperation

  • University of Oregon

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. graph computing
  2. hybrid update strategy
  3. out-of-core

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

ICPP 2018

Acceptance Rates

ICPP '18 Paper Acceptance Rate: 91 of 313 submissions, 29%
Overall Acceptance Rate: 91 of 313 submissions, 29%

Article Metrics

  • Downloads (last 12 months): 10
  • Downloads (last 6 weeks): 0

Reflects downloads up to 14 Feb 2025.

Cited By

  • (2024) LeapGraph: A Fully External Graph Processing System on High-Speed SSD. 2024 13th Non-Volatile Memory Systems and Applications Symposium (NVMSA), 1-2. DOI: 10.1109/NVMSA63038.2024.10693656. Online publication date: 21-Aug-2024
  • (2023) Fargraph+: Excavating the Parallelism of Graph Processing Workload on RDMA-based Far Memory System. Journal of Parallel and Distributed Computing. DOI: 10.1016/j.jpdc.2023.02.015. Online publication date: Mar-2023
  • (2022) A Structure-Aware Storage Optimization for Out-of-Core Concurrent Graph Processing. IEEE Transactions on Computers 71:7, 1612-1625. DOI: 10.1109/TC.2021.3098976. Online publication date: 1-Jul-2022
  • (2022) GGraph: An Efficient Structure-Aware Approach for Iterative Graph Processing. IEEE Transactions on Big Data 8:5, 1182-1194. DOI: 10.1109/TBDATA.2020.3019641. Online publication date: 1-Oct-2022
  • (2022) Excavating the Potential of Graph Workload on RDMA-based Far Memory Architecture. 2022 IEEE International Parallel and Distributed Processing Symposium (IPDPS), 1029-1039. DOI: 10.1109/IPDPS53621.2022.00104. Online publication date: May-2022
  • (2020) A Hybrid Update Strategy for I/O-Efficient Out-of-Core Graph Processing. IEEE Transactions on Parallel and Distributed Systems 31:8, 1767-1782. DOI: 10.1109/TPDS.2020.2973143. Online publication date: 1-Aug-2020
  • (2019) LOSC. Proceedings of the International Symposium on Quality of Service, 1-10. DOI: 10.1145/3326285.3329069. Online publication date: 24-Jun-2019
