DOI:10.1145/2807591.2807626
research-article

GraphBIG: understanding graph computing in the context of industrial solutions

Published: 15 November 2015

Abstract

With the emergence of data science, graph computing is becoming a crucial tool for processing big connected data. Although efficient implementations of specific graph applications exist, the behavior of full-spectrum graph computing remains unknown. To understand graph computing, we must consider multiple graph computation types, graph frameworks, data representations, and various data sources in a holistic way.
In this paper, we present GraphBIG, a benchmark suite inspired by the IBM System G project. To cover major graph computation types and data sources, GraphBIG selects representative data structures, workloads, and data sets from 21 real-world use cases across multiple application domains. We characterized GraphBIG on real machines and observed extremely irregular memory patterns and significantly diverse behavior across different computations. GraphBIG helps users understand the impact of modern graph computing on the hardware architecture and enables future architecture and system research.

Published In

SC '15: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis
November 2015
985 pages
ISBN:9781450337236
DOI:10.1145/2807591
General Chair: Jackie Kern
Program Chair: Jeffrey S. Vetter
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Research-article

Conference

SC15
Acceptance Rates

SC '15 Paper Acceptance Rate 79 of 358 submissions, 22%;
Overall Acceptance Rate 1,516 of 6,373 submissions, 24%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 32
  • Downloads (Last 6 weeks): 4
Reflects downloads up to 01 Mar 2025

Citations

Cited By

  • (2025) PreSIT: Predict Cryptography Computations in SGX-Style Integrity Trees. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 44:3 (882-896). DOI:10.1109/TCAD.2024.3448264. Online publication date: Mar-2025
  • (2025) Enhancing IOMMU Efficiency in Heterogeneous SaCs: A Study on Cache Policy Impacts. 2025 International Conference on Electronics, Information, and Communication (ICEIC) (1-4). DOI:10.1109/ICEIC64972.2025.10879735. Online publication date: 19-Jan-2025
  • (2025) C4ECC: Data Compression for Bandwidth Efficiency Under ECC Protection in GPUs. 2025 International Conference on Electronics, Information, and Communication (ICEIC) (1-4). DOI:10.1109/ICEIC64972.2025.10879667. Online publication date: 19-Jan-2025
  • (2024) An analysis of cache configuration’s impacts on the miss rate of big data applications using gem5. Serbian Journal of Electrical Engineering 21:2 (217-234). DOI:10.2298/SJEE2402217D. Online publication date: 2024
  • (2024) Improving Graph Compression for Efficient Resource-Constrained Graph Analytics. Proceedings of the VLDB Endowment 17:9 (2212-2226). DOI:10.14778/3665844.3665852. Online publication date: 1-May-2024
  • (2024) Leveraging On-demand Processing to Co-optimize Scalability and Efficiency for Fully-external Graph Computation. ACM Transactions on Storage 21:2 (1-31). DOI:10.1145/3701037. Online publication date: 23-Nov-2024
  • (2024) PIM-Potential: Broadening the Acceleration Reach of PIM Architectures. Proceedings of the International Symposium on Memory Systems (1-12). DOI:10.1145/3695794.3695795. Online publication date: 30-Sep-2024
  • (2024) Rethinking Page Table Structure for Fast Address Translation in GPUs: A Fixed-Size Hashed Page Table. Proceedings of the 2024 International Conference on Parallel Architectures and Compilation Techniques (325-337). DOI:10.1145/3656019.3676900. Online publication date: 14-Oct-2024
  • (2024) BLQ: Light-Weight Locality-Aware Runtime for Blocking-Less Queuing. Proceedings of the 33rd ACM SIGPLAN International Conference on Compiler Construction (100-112). DOI:10.1145/3640537.3641568. Online publication date: 17-Feb-2024
  • (2024) Memory Allocation Under Hardware Compression. 2024 57th IEEE/ACM International Symposium on Microarchitecture (MICRO) (966-982). DOI:10.1109/MICRO61859.2024.00075. Online publication date: 2-Nov-2024