DOI: 10.1145/3326285.3329069
research-article

LOSC: efficient out-of-core graph processing with locality-optimized subgraph construction

Published: 24 June 2019

Abstract

Big data applications increasingly rely on the analysis of large graphs. In recent years, a number of out-of-core graph processing systems have been proposed to process graphs with billions of edges on a single commodity computer by making efficient use of secondary storage (e.g., hard disk, SSD). Meanwhile, the vertex-centric computing model is widely used in graph processing thanks to its broad applicability and expressiveness. Unfortunately, when the vertex-centric model is implemented for out-of-core graph processing, the large number of random memory accesses required to construct subgraphs becomes a serious performance bottleneck: it substantially weakens cache access locality and makes users wait a long time for results. In this paper, we propose LOSC, an efficient out-of-core graph processing system that substantially reduces the overhead of subgraph construction without sacrificing the underlying vertex-centric computing model. LOSC introduces a locality-optimized subgraph construction scheme that significantly improves in-memory data access locality during the subgraph construction phase. Furthermore, LOSC adopts a compact edge storage format and lightweight vertex replication to reduce I/O traffic and improve computation efficiency. Extensive evaluation results show that LOSC is 6.9x and 3.5x faster than GraphChi and GridGraph, respectively, two state-of-the-art out-of-core systems.
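The abstract does not describe LOSC's data layout or algorithms in detail, so the following is only an illustrative sketch of the general idea it refers to, not the authors' implementation. It assumes a CSR-style (compressed sparse row) layout as one common example of a compact edge format: edges are grouped by destination so that each vertex's in-neighbours occupy one contiguous slice, and a vertex-centric update (PageRank here) can read them sequentially instead of chasing scattered edges. The function names `build_csr` and `pagerank` and all parameters are hypothetical.

```python
from typing import List, Sequence, Tuple

Edge = Tuple[int, int]  # (source, destination)


def build_csr(num_vertices: int, edges: Sequence[Edge]) -> Tuple[List[int], List[int]]:
    """Group edges by destination (assumed CSR layout, not LOSC's actual format).

    Returns (offsets, sources) where the in-neighbours of vertex v are
    sources[offsets[v]:offsets[v + 1]], stored contiguously.
    """
    offsets = [0] * (num_vertices + 1)
    for _, dst in edges:
        offsets[dst + 1] += 1          # count in-edges per destination
    for v in range(num_vertices):
        offsets[v + 1] += offsets[v]   # prefix sum gives slice boundaries

    sources = [0] * len(edges)
    cursor = offsets[:-1]              # slicing copies; next free slot per vertex
    for src, dst in edges:
        sources[cursor[dst]] = src
        cursor[dst] += 1
    return offsets, sources


def pagerank(num_vertices: int, edges: Sequence[Edge],
             iterations: int = 20, damping: float = 0.85) -> List[float]:
    """Vertex-centric PageRank: each vertex pulls rank from its in-neighbours."""
    offsets, sources = build_csr(num_vertices, edges)
    out_degree = [0] * num_vertices
    for src, _ in edges:
        out_degree[src] += 1

    rank = [1.0 / num_vertices] * num_vertices
    for _ in range(iterations):
        new_rank = [0.0] * num_vertices
        for v in range(num_vertices):
            # The "subgraph" of v is one contiguous slice of the edge array.
            incoming = sources[offsets[v]:offsets[v + 1]]
            total = sum(rank[u] / out_degree[u] for u in incoming)
            new_rank[v] = (1.0 - damping) / num_vertices + damping * total
        rank = new_rank
    return rank


if __name__ == "__main__":
    cycle = [(0, 1), (1, 2), (2, 0)]
    print(build_csr(3, cycle))
    print(pagerank(3, cycle))
```

In this sketch the one-time grouping pass pays the cost of reordering edges up front, so that every later iteration scans each vertex's edges in order. Whether LOSC takes this particular approach is not stated in the abstract.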


Cited By

  • (2022) LOSC: A Locality-optimized Subgraph Construction Scheme for Out-of-Core Graph Processing. Journal of Parallel and Distributed Computing. DOI: 10.1016/j.jpdc.2022.10.005. Online publication date: Oct-2022


      Published In

      IWQoS '19: Proceedings of the International Symposium on Quality of Service
      June 2019
      420 pages
      ISBN:9781450367783
      DOI:10.1145/3326285

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Author Tags

      1. graph computing
      2. out-of-core
      3. subgraph construction

      Qualifiers

      • Research-article

      Conference

      IWQoS '19
