DOI: 10.1145/3437801.3441622
poster

DFOGraph: an I/O- and communication-efficient system for distributed fully-out-of-core graph processing

Published: 17 February 2021

ABSTRACT

With the magnitude of graph-structured data continually increasing, graph processing systems that can scale out and scale up are needed to handle extreme-scale datasets. While existing distributed out-of-core solutions have made this possible, they suffer from limited performance due to excessive I/O and communication costs.

We present DFOGraph, a distributed fully-out-of-core graph processing system that applies and assembles multiple techniques to enable I/O- and communication-efficient processing. DFOGraph builds upon two-level partitions with adaptive compressed representations to allow fine-grained selective computation and communication. In our evaluation, DFOGraph outperforms Chaos and HybridGraph by more than 12.94× and 10.82×, respectively, when scaling out to eight nodes.
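
The abstract does not spell out how two-level partitioning enables selective computation, so the following is only a rough illustration of the general idea, not DFOGraph's actual design. It assumes edges are grouped by source vertex, first across nodes and then into chunks within each node, and that a chunk can be skipped entirely when none of its source vertices is active. The names `Chunk`, `build_partitions`, and `selective_pass` are hypothetical, and the adaptive compressed representations mentioned above are left out.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    """A second-level partition: edges whose source lies in [src_lo, src_hi)."""
    src_lo: int
    src_hi: int
    edges: list = field(default_factory=list)  # (src, dst) pairs


def build_partitions(edges, num_vertices, num_nodes, chunks_per_node):
    """Split the source-vertex range across nodes, then into chunks per node.

    Illustrative only: a plain range partition, assumed for this sketch.
    """
    total_chunks = num_nodes * chunks_per_node
    width = -(-num_vertices // total_chunks)  # ceiling division
    chunks = [
        Chunk(min(i * width, num_vertices), min((i + 1) * width, num_vertices))
        for i in range(total_chunks)
    ]
    for src, dst in edges:
        chunks[src // width].edges.append((src, dst))
    # Level 1: one list of chunks per node. Level 2: the chunks themselves.
    return [
        chunks[n * chunks_per_node:(n + 1) * chunks_per_node]
        for n in range(num_nodes)
    ]


def selective_pass(partitions, active):
    """Read only chunks containing an active source vertex.

    Returns (number of edges read, destination vertices to notify).
    """
    edges_read = 0
    messages = []
    for node_chunks in partitions:
        for chunk in node_chunks:
            if not any(chunk.src_lo <= v < chunk.src_hi for v in active):
                continue  # no active source here: skip the chunk's I/O entirely
            edges_read += len(chunk.edges)
            messages.extend(dst for src, dst in chunk.edges if src in active)
    return edges_read, messages


if __name__ == "__main__":
    demo_edges = [(0, 1), (0, 2), (1, 3), (2, 4), (4, 5), (6, 7), (7, 0)]
    demo_parts = build_partitions(demo_edges, num_vertices=8, num_nodes=2, chunks_per_node=2)
    print(selective_pass(demo_parts, active={0}))
```

With a single active vertex, the pass reads only the one chunk that covers it instead of scanning all seven edges. Finer chunks skip more, at the cost of more per-chunk metadata.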


Published in

PPoPP '21: Proceedings of the 26th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
February 2021
507 pages
ISBN: 9781450382946
DOI: 10.1145/3437801

Copyright © 2021 Owner/Author

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery
New York, NY, United States


Acceptance Rates

PPoPP '21 paper acceptance rate: 31 of 150 submissions, 21%
Overall acceptance rate: 230 of 1,014 submissions, 23%
