DOI: 10.1145/3545008.3545039
research-article

GraphSD: A State and Dependency aware Out-of-Core Graph Processing System

Published: 13 January 2023

Abstract

In recent years, systems researchers have proposed many out-of-core graph processing systems to efficiently handle graphs that exceed the memory capacity of a single machine. Through disk-friendly graph data organizations and well-designed execution engines, existing out-of-core graph processing systems maintain sequential locality of disk access and greatly reduce disk I/O during processing. However, they do not fully exploit the characteristics of graph data and algorithm execution to reduce disk I/O further, leaving significant room for performance improvement. In this paper, we present a novel out-of-core graph processing system called GraphSD, which reduces I/O traffic by simultaneously capturing the state and dependency of graph data during computation. At the heart of GraphSD is a state- and dependency-aware update strategy that includes two adaptive update models: selective cross-iteration update (SCIU) and full cross-iteration update (FCIU). These two update models are dynamically triggered at runtime to enable active-vertex-aware processing and cross-iteration vertex value computation, which avoid loading inactive edges and reduce disk I/O in future iterations. Moreover, an efficient sub-block-based buffering scheme is proposed to further minimize I/O overhead. Our evaluation results show that GraphSD outperforms two state-of-the-art out-of-core graph processing systems, HUS-Graph and Lumos, by up to 2.7× and 3.9×, respectively.
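To make the idea of active-vertex-aware processing concrete, the following is a minimal Python sketch, not GraphSD's implementation. It assumes a simple layout in which edges are grouped into blocks by source-vertex interval, and it counts how many blocks a BFS would need to read if it only touched blocks containing an active source vertex. The names `partition_edges`, `bfs_selective`, and `interval` are hypothetical and chosen for illustration only; the abstract does not describe the actual data layout or the SCIU/FCIU mechanics.

```python
from collections import defaultdict


def partition_edges(edges, interval):
    """Group edges into blocks keyed by the source-vertex interval index.

    Assumed layout for illustration; each block stands in for one
    sequentially readable region on disk.
    """
    blocks = defaultdict(list)
    for src, dst in edges:
        blocks[src // interval].append((src, dst))
    return blocks


def bfs_selective(edges, num_vertices, root, interval=2):
    """Compute BFS levels while 'loading' only blocks with an active source.

    Returns (levels, blocks_loaded, blocks_full_scan), where the last value
    is what an engine that scans every block in every iteration would read.
    """
    blocks = partition_edges(edges, interval)
    num_blocks = (num_vertices + interval - 1) // interval
    level = [-1] * num_vertices
    level[root] = 0
    active = {root}
    loaded = 0
    full = 0
    depth = 0
    while active:
        depth += 1
        full += num_blocks  # state-oblivious baseline: scan every block
        needed = {v // interval for v in active}
        next_active = set()
        for b in sorted(needed):
            loaded += 1  # stands in for one block read from disk
            for src, dst in blocks.get(b, []):
                # Only edges whose source is active can change a value.
                if src in active and level[dst] == -1:
                    level[dst] = depth
                    next_active.add(dst)
        active = next_active
    return level, loaded, full


if __name__ == "__main__":
    chain = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]
    print(bfs_selective(chain, num_vertices=6, root=0, interval=2))
```

On this six-vertex chain the sketch reads 6 blocks where a full scan per iteration would read 18. It also reads each block in two consecutive iterations, because a vertex activated by a block is only processed in the next pass. Reducing that kind of repeated load appears to be the motivation for the cross-iteration update models named in the abstract, although the abstract does not say how they achieve it.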


Published In

ICPP '22: Proceedings of the 51st International Conference on Parallel Processing
August 2022
976 pages
ISBN:9781450397339
DOI:10.1145/3545008

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. I/O-efficient
  2. graph processing
  3. state and dependency of graph data

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

ICPP '22: 51st International Conference on Parallel Processing
August 29 - September 1, 2022
Bordeaux, France

Acceptance Rates

Overall Acceptance Rate 91 of 313 submissions, 29%
