DOI: 10.1145/3392717.3392739
research-article

V-Combiner: speeding-up iterative graph processing on a shared-memory platform with vertex merging

Published: 29 June 2020

Abstract

An iterative graph algorithm applies a vertex update operation to all vertices in a graph in every iteration. For large graphs, this computation is costly. However, in practice, not all updates contribute equally to the end result and, in fact, an exact result may not be needed. In this work, we leverage these insights to speed up iterative graph algorithms. We propose a mechanism to identify the less important vertices and omit computations for them.
Our scheme, called V-Combiner, is a deterministic, fast, and application-transparent technique to construct an approximate graph to enable faster execution. The main idea behind V-Combiner is to merge certain vertices into hubs, which are vertices that have many connections and contribute heavily to the end result of the algorithm. We also propose an inexpensive correction step to recover the contribution of the merged vertices to get higher accuracy.
We evaluate V-Combiner on four applications and five datasets. For 44-threaded runs, V-Combiner achieves an average end-to-end speedup of 1.25× over the conventional system, with an accuracy of 91.8%. It also shows a better performance-accuracy trade-off than the existing sparsification and k-core techniques.
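
The idea of merging vertices into hubs can be illustrated with a toy sketch. The abstract does not state how V-Combiner selects hubs or mergeable vertices, so the degree thresholds (`hub_degree`, `leaf_degree`), the function name `merge_into_hubs`, and the tie-breaking rule below are all assumptions made for illustration. The correction step is not modeled. This is not the authors' implementation.

```python
from collections import defaultdict


def merge_into_hubs(edges, hub_degree, leaf_degree):
    """Toy vertex merging: fold low-degree vertices into an adjacent hub.

    All selection rules here are illustrative assumptions, not the
    criteria used by V-Combiner.
    """
    # Build an undirected adjacency map, ignoring self-loops.
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)

    # Assumption: a hub is any vertex with degree >= hub_degree.
    hubs = {v for v, nbrs in adj.items() if len(nbrs) >= hub_degree}

    # Assumption: a non-hub vertex with degree <= leaf_degree is merged
    # into its smallest-numbered hub neighbor (a deterministic choice).
    merged_into = {}
    for v in sorted(adj):
        if v in hubs or len(adj[v]) > leaf_degree:
            continue
        hub_neighbors = sorted(adj[v] & hubs)
        if hub_neighbors:
            merged_into[v] = hub_neighbors[0]

    def representative(x):
        return merged_into.get(x, x)

    # Rewrite every edge onto representatives; drop self-loops and duplicates.
    reduced = set()
    for u, v in edges:
        a, b = representative(u), representative(v)
        if a != b:
            reduced.add((min(a, b), max(a, b)))
    return sorted(reduced), merged_into


if __name__ == "__main__":
    # Vertex 0 is a hub with three degree-1 neighbors (1, 2, 3).
    edges = [(0, 1), (0, 2), (0, 3), (0, 4), (4, 5), (5, 6)]
    reduced, merged = merge_into_hubs(edges, hub_degree=3, leaf_degree=1)
    print(reduced)
    print(merged)
```

In this sketch the reduced edge list is what an iterative algorithm would run on, and `merged_into` records which hub stands in for each removed vertex, which is the information a later correction pass would need.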

Cited By

  • (2022) Software-defined floating-point number formats and their application to graph processing. Proceedings of the 36th ACM International Conference on Supercomputing, pp. 1-17. DOI: 10.1145/3524059.3532360. Online publication date: 28-Jun-2022.
  • (2022) GraphGuess: Approximate Graph Processing System with Adaptive Correction. Euro-Par 2022: Parallel Processing, pp. 285-300. DOI: 10.1007/978-3-031-12597-3_18. Online publication date: 22-Aug-2022.

    Published In

    ICS '20: Proceedings of the 34th ACM International Conference on Supercomputing
    June 2020
    499 pages
    ISBN:9781450379830
    DOI:10.1145/3392717
    General Chairs: Eduard Ayguadé, Wen-mei Hwu
    Program Chairs: Rosa M. Badia, H. Peter Hofstee
    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. approximations
    2. graph processing
    3. shared-memory platforms

    Conference

    ICS '20: 2020 International Conference on Supercomputing
    June 29 - July 2, 2020
    Barcelona, Spain

    Acceptance Rates

    Overall Acceptance Rate 629 of 2,180 submissions, 29%
