research-article

V-Combiner: speeding-up iterative graph processing on a shared-memory platform with vertex merging

Authors: Azin Heidarshenas, Dimitrios Skarlatos, Sasa Misailovic, Josep Torrellas

ICS '20: Proceedings of the 34th ACM International Conference on Supercomputing

Article No.: 9, Pages 1 - 13

https://doi.org/10.1145/3392717.3392739

Published: 29 June 2020

Abstract

An iterative graph algorithm applies a vertex update operation to all vertices in a graph in every iteration. For large graphs, this computation is costly. However, in practice, not all the updates contribute equally to the end result and, in fact, an exact result may not be needed. In this work, we leverage these insights to speed up iterative graph algorithms. We propose a mechanism to identify the less important vertices and omit computations for them.
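To make "a vertex update operation applied to all vertices in every iteration" concrete, the following is a minimal sketch that uses PageRank as the example algorithm. The function name `pagerank`, the dictionary-based graph layout, and the parameter defaults are assumptions chosen for illustration; they are not taken from the paper.

```python
def pagerank(out_edges, num_iters=20, damping=0.85):
    """Run a fixed number of synchronous PageRank iterations.

    out_edges maps each vertex to the list of vertices it points to.
    Every vertex must appear as a key (possibly with an empty list).
    """
    n = len(out_edges)
    rank = {v: 1.0 / n for v in out_edges}
    for _ in range(num_iters):
        # Rank held by vertices with no out-edges is spread uniformly.
        dangling = sum(rank[v] for v, outs in out_edges.items() if not outs)
        nxt = {v: (1.0 - damping) / n + damping * dangling / n for v in out_edges}
        # The vertex update: every vertex pushes its rank to its out-neighbors.
        for v, outs in out_edges.items():
            if outs:
                share = damping * rank[v] / len(outs)
                for u in outs:
                    nxt[u] += share
        rank = nxt
    return rank


graph = {"a": ["b"], "b": ["c"], "c": ["a"], "d": ["a"]}
ranks = pagerank(graph)
```

Each iteration touches every vertex and every edge, which is why the cost grows with graph size and why skipping the updates of low-impact vertices can save time.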

Our scheme, called V-Combiner, is a deterministic, fast, and application-transparent technique to construct an approximate graph to enable faster execution. The main idea behind V-Combiner is to merge certain vertices into hubs, which are vertices that have many connections and contribute heavily to the end result of the algorithm. We also propose an inexpensive correction step to recover the contribution of the merged vertices to get higher accuracy.
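The abstract does not say how vertices are chosen for merging or how the correction step works, so the sketch below only illustrates the general shape of the idea under placeholder assumptions: a hub is any vertex whose degree meets a threshold, a vertex is merged when its only neighbor is a hub, and the correction is a single PageRank-style update for each merged vertex. The names `merge_into_hubs`, `recover_merged`, and `hub_degree` are hypothetical and this is not the authors' algorithm.

```python
def merge_into_hubs(adj, hub_degree):
    """Build a smaller graph by folding leaf vertices into adjacent hubs.

    adj maps each vertex to the set of its neighbors (undirected graph).
    Assumed rule (hypothetical): a hub has degree >= hub_degree, and a
    vertex is merged when its only neighbor is a hub.
    """
    hubs = {v for v, nbrs in adj.items() if len(nbrs) >= hub_degree}
    merged_into = {}
    for v, nbrs in adj.items():
        if v not in hubs and len(nbrs) == 1:
            (only,) = nbrs
            if only in hubs:
                merged_into[v] = only
    # Drop merged vertices and every edge that pointed at them.
    reduced = {
        v: {u for u in nbrs if u not in merged_into}
        for v, nbrs in adj.items()
        if v not in merged_into
    }
    return reduced, merged_into


def recover_merged(scores, merged_into, adj, damping=0.85):
    """Estimate scores for merged vertices with one PageRank-style update.

    Each merged vertex has exactly one neighbor (its hub), so a single
    update on the original graph only needs the hub's score and degree.
    This correction rule is an assumption made for illustration.
    """
    n = len(adj)
    full = dict(scores)
    for v, hub in merged_into.items():
        full[v] = (1.0 - damping) / n + damping * scores[hub] / len(adj[hub])
    return full


adj = {
    "h": {"a", "b", "c"},
    "a": {"h"},
    "b": {"h"},
    "c": {"h", "d"},
    "d": {"c"},
}
reduced, merged_into = merge_into_hubs(adj, hub_degree=3)
# Scores for the reduced graph would come from the iterative algorithm;
# fixed numbers are used here only to exercise the correction step.
full = recover_merged({"h": 0.5, "c": 0.3, "d": 0.2}, merged_into, adj)
```

In this toy graph, `a` and `b` are folded into the hub `h`, so the iterative algorithm would run on three vertices instead of five, and the correction pass visits only the two merged vertices.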

We evaluate V-Combiner on 4 different applications and 5 datasets. For 44-threaded runs, V-Combiner achieves an average end-to-end speedup of 1.25X over the conventional system, with an accuracy of 91.8%. It also shows a better performance-accuracy trade-off than the existing sparsification and k-core techniques.

Cited By

Vandierendonck H, Rauchwerger L, Cameron K, Nikolopoulos D, Pnevmatikatos D (2022). Software-defined floating-point number formats and their application to graph processing. Proceedings of the 36th ACM International Conference on Supercomputing, 1-17. 10.1145/3524059.3532360. Online publication date: 28-Jun-2022.
https://dl.acm.org/doi/10.1145/3524059.3532360
Ramezani M, Kandemir M, Sivasubramaniam A (2022). GraphGuess: Approximate Graph Processing System with Adaptive Correction. Euro-Par 2022: Parallel Processing, 285-300. 10.1007/978-3-031-12597-3_18. Online publication date: 22-Aug-2022.
https://dl.acm.org/doi/10.1007/978-3-031-12597-3_18

Index Terms

V-Combiner: speeding-up iterative graph processing on a shared-memory platform with vertex merging
1. Computing methodologies
  1. Parallel computing methodologies

Published In

ICS '20: Proceedings of the 34th ACM International Conference on Supercomputing

June 2020

499 pages

ISBN:9781450379830

DOI:10.1145/3392717

General Chairs: Eduard Ayguadé (Universitat Politècnica de Catalunya and Barcelona Supercomputing Center), Wen-mei Hwu (University of Illinois at Urbana-Champaign)

Program Chairs: Rosa M. Badia (Barcelona Supercomputing Center and Universitat Politècnica de Catalunya), H. Peter Hofstee (IBM Austin)

Copyright © 2020 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGARCH: ACM Special Interest Group on Computer Architecture

Publisher

Association for Computing Machinery

New York, NY, United States

Funding Sources

National Science Foundation

Conference

ICS '20: 2020 International Conference on Supercomputing

June 29 - July 2, 2020

Barcelona, Spain

Acceptance Rates

Overall Acceptance Rate 629 of 2,180 submissions, 29%
