DOI: 10.1145/3543873.3587365
poster

RealGraph+: A High-Performance Single-Machine-Based Graph Engine that Utilizes IO Bandwidth Effectively

Published: 30 April 2023

Abstract

This paper proposes RealGraph+, an improved version of RealGraph that processes large-scale real-world graphs efficiently on a single machine. Through a preliminary analysis, we observe that the original RealGraph does not fully utilize the IO bandwidth provided by NVMe SSDs, a state-of-the-art storage device. To increase the IO bandwidth it achieves, we equip RealGraph+ with three optimization strategies that issue IO requests more frequently: (1) user-space IO, (2) asynchronous IO, and (3) SIMD processing. Through extensive experiments with four graph algorithms and six real-world datasets, we show that (1) each strategy is effective in increasing the IO bandwidth, thereby reducing the execution time; (2) RealGraph+ with all three strategies significantly outperforms the original RealGraph; (3) RealGraph+ dramatically outperforms state-of-the-art single-machine-based graph engines; and (4) its performance is comparable to or even better than that of distributed-system-based graph engines.
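
The abstract does not say how the asynchronous IO strategy is implemented, so the following is only a rough illustration of the general idea: keeping several read requests in flight at once instead of waiting for each one to finish. It is a minimal Python sketch, not the authors' implementation. The block size, the function names, and the use of a thread pool over ordinary kernel reads (`os.pread`) as a stand-in for a real asynchronous or user-space IO interface are all assumptions made for this example.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

BLOCK_SIZE = 4096  # hypothetical block size, chosen only for illustration


def read_blocks_sync(fd, block_ids):
    """Read blocks one at a time: each request waits for the previous one."""
    return [os.pread(fd, BLOCK_SIZE, b * BLOCK_SIZE) for b in block_ids]


def read_blocks_overlapped(fd, block_ids, depth=8):
    """Keep up to `depth` read requests in flight at once.

    A thread pool stands in for a real asynchronous IO interface here.
    """
    def read_one(b):
        return os.pread(fd, BLOCK_SIZE, b * BLOCK_SIZE)

    with ThreadPoolExecutor(max_workers=depth) as pool:
        # map() returns results in the same order as block_ids.
        return list(pool.map(read_one, block_ids))


def make_demo_file(num_blocks):
    """Create a temporary file whose i-th block is filled with byte i % 256."""
    tmp = tempfile.NamedTemporaryFile(delete=False)
    for i in range(num_blocks):
        tmp.write(bytes([i % 256]) * BLOCK_SIZE)
    tmp.close()
    return tmp.name


if __name__ == "__main__":
    path = make_demo_file(64)
    fd = os.open(path, os.O_RDONLY)
    try:
        ids = list(range(64))
        # Both strategies must return identical data; only the timing differs.
        assert read_blocks_sync(fd, ids) == read_blocks_overlapped(fd, ids)
    finally:
        os.close(fd)
        os.unlink(path)
```

On a small cached file the two functions will take about the same time. The difference the abstract refers to would only be expected to show on a fast device such as an NVMe SSD, where a single outstanding request cannot keep the device busy.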

    Published In

    WWW '23 Companion: Companion Proceedings of the ACM Web Conference 2023
    April 2023
    1567 pages
    ISBN: 9781450394192
    DOI: 10.1145/3543873
    Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. graph engine
    2. large-scale graphs processing
    3. single machine

    Qualifiers

    • Poster
    • Research
    • Refereed limited

    Funding Sources

    • Samsung Electronics Co., Ltd
    • The Korea government (Ministry of Science and ICT)

    Conference

    WWW '23: The ACM Web Conference 2023
    April 30 - May 4, 2023
    Austin, TX, USA

    Acceptance Rates

    Overall acceptance rate: 1,899 of 8,196 submissions (23%)

    Cited By

    • (2024) Efficient large graph processing with chunk-based graph representation model. Proceedings of the 2024 USENIX Conference on Usenix Annual Technical Conference, 1239-1255. DOI: 10.5555/3691992.3692067. Online publication date: 10-Jul-2024
    • (2024) GNNDrive: Reducing Memory Contention and I/O Congestion for Disk-based GNN Training. Proceedings of the 53rd International Conference on Parallel Processing, 650-659. DOI: 10.1145/3673038.3673063. Online publication date: 12-Aug-2024
    • (2024) RealGraphGPU++: A High-Performance GPU-Based Graph Engine with Direct Storage-to-DM IO. Companion Proceedings of the ACM on Web Conference 2024, 654-657. DOI: 10.1145/3589335.3651549. Online publication date: 13-May-2024
    • (2024) RealGraphGPUWeb: A Convenient and Efficient GPU-Based Graph Analysis Platform on the Web. Companion Proceedings of the ACM on Web Conference 2024, 1011-1014. DOI: 10.1145/3589335.3651237. Online publication date: 13-May-2024
