DOI: 10.1145/3087556.3087580

Julienne: A Framework for Parallel Graph Algorithms using Work-efficient Bucketing

Published: 24 July 2017

ABSTRACT

Existing graph-processing frameworks let users develop efficient implementations for many graph problems, but none of them support efficiently bucketing vertices, which is needed for bucketing-based graph algorithms such as Δ-stepping and approximate set-cover. Motivated by the lack of simple, scalable, and efficient implementations of bucketing-based algorithms, we develop the Julienne framework, which extends a recent shared-memory graph processing framework called Ligra with an interface for maintaining a collection of buckets under vertex insertions and bucket deletions.
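To make the idea of "a collection of buckets under vertex insertions and bucket deletions" concrete, here is a minimal sequential sketch in Python. It is not Julienne's interface: the class name `SimpleBuckets` and the methods `update` and `next_bucket` are hypothetical names chosen for illustration, and the linear scan for the smallest non-empty bucket is neither parallel nor work-efficient.

```python
from collections import defaultdict


class SimpleBuckets:
    """Toy bucket structure (hypothetical API, sequential only).

    Maps non-negative integer bucket ids to sets of vertex ids.
    """

    def __init__(self, priorities):
        # priorities: dict mapping vertex -> initial bucket id
        self.bucket_of = dict(priorities)
        self.buckets = defaultdict(set)
        for v, b in priorities.items():
            self.buckets[b].add(v)

    def update(self, v, new_bucket):
        """Insert v into new_bucket, removing it from its old bucket if any."""
        old = self.bucket_of.get(v)
        if old is not None:
            self.buckets[old].discard(v)
        self.bucket_of[v] = new_bucket
        self.buckets[new_bucket].add(v)

    def next_bucket(self):
        """Delete the smallest non-empty bucket and return (id, vertices).

        Returns None when every bucket is empty.
        """
        nonempty = [b for b, vs in self.buckets.items() if vs]
        if not nonempty:
            return None
        b = min(nonempty)
        vs = self.buckets.pop(b)
        for v in vs:
            del self.bucket_of[v]
        return b, sorted(vs)
```

An algorithm built on such a structure typically loops on `next_bucket()`, processes the returned vertices, and calls `update` on any neighbors whose priority changed.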

We provide a theoretically efficient parallel implementation of our bucketing interface and study several bucketing-based algorithms that make use of it (either bucketing by remaining degree or by distance) to improve performance: the peeling algorithm for k-core (coreness), Δ-stepping, weighted breadth-first search, and approximate set cover. The implementations are all simple and concise (under 100 lines of code). Using our interface, we develop the first work-efficient parallel algorithm for k-core in the literature with nontrivial parallelism.
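As an illustration of bucketing by remaining degree, the sketch below computes coreness with a sequential peeling loop in Python. It is a simplified stand-in, not the paper's parallel algorithm: the function name `coreness` and the adjacency-dict input format are assumptions, and the graph is assumed to be undirected with no duplicate edges. Each pass removes the whole lowest non-empty bucket at once, which is the step a parallel implementation would process as one round.

```python
def coreness(adj):
    """Return a dict vertex -> coreness via bucket-based peeling.

    adj: dict mapping each vertex to a list of its neighbors
    (assumed undirected, no duplicate edges).
    """
    deg = {v: len(ns) for v, ns in adj.items()}
    max_deg = max(deg.values(), default=0)

    # buckets[d] holds the unpeeled vertices whose remaining degree is d
    buckets = [set() for _ in range(max_deg + 1)]
    for v, d in deg.items():
        buckets[d].add(v)

    core = {}
    k = 0
    while k <= max_deg:
        if not buckets[k]:
            k += 1
            continue
        # Peel every vertex currently in bucket k as one round.
        frontier = list(buckets[k])
        buckets[k].clear()
        for v in frontier:
            core[v] = k
        for v in frontier:
            for u in adj[v]:
                if u in core:
                    continue  # already peeled
                if deg[u] > k:
                    # Move u down one bucket, but never below k.
                    buckets[deg[u]].discard(u)
                    deg[u] -= 1
                    buckets[deg[u]].add(u)
    return core
```

For a triangle with one pendant vertex attached, the pendant vertex is peeled at k = 1 and the three triangle vertices at k = 2.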

We experimentally show that our bucketing implementation scales well and achieves high throughput on both synthetic and real-world workloads. Furthermore, the bucketing-based algorithms written in Julienne achieve up to 43x speedup on 72 cores with hyper-threading over well-tuned sequential baselines, significantly outperform existing work-inefficient implementations in Ligra, and either outperform or are competitive with existing special-purpose parallel codes for the same problem. We experimentally study our implementations on the largest publicly available graphs and show that they scale well in practice, processing real-world graphs with billions of edges in seconds, and hundreds of billions of edges in a few minutes. As far as we know, this is the first time that graphs at this scale have been analyzed in the main memory of a single multicore machine.

        Published in

        SPAA '17: Proceedings of the 29th ACM Symposium on Parallelism in Algorithms and Architectures
        July 2017
        392 pages
        ISBN: 9781450345934
        DOI: 10.1145/3087556

        Copyright © 2017 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Qualifiers

        • research-article

        Acceptance Rates

        SPAA '17 Paper Acceptance Rate: 31 of 127 submissions, 24%
        Overall Acceptance Rate: 447 of 1,461 submissions, 31%
