Abstract
GraphChi [16] is a recent high-performance system for external-memory (disk-based) graph computations. It uses the Parallel Sliding Windows (PSW) algorithm, which is based on the so-called Gauss-Seidel type of iterative computation, in which updates to values are immediately visible within the iteration. In contrast, previous external-memory graph algorithms are based on the synchronous model, where computation can only observe values from previous iterations. In this work, we study implementations of connected components and minimum spanning forest (MSF) on PSW and show that they have a competitive I/O bound of O(sort(E) log(V/M)) and also work well in practice. We also show that our MSF implementation is competitive with a specialized algorithm proposed by Dementiev et al. [10] while being much simpler.
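To make the distinction between the two update models concrete, the following is a minimal in-memory sketch of minimum-label propagation for connected components. It is not the PSW algorithm and not the implementation studied in the paper; the function name `label_propagation`, the `gauss_seidel` flag, and the edge-list representation are assumptions made purely for illustration. In Gauss-Seidel mode each edge reads the labels already written during the current pass, whereas in synchronous mode each edge reads a snapshot taken at the start of the pass.

```python
def label_propagation(num_vertices, edges, gauss_seidel):
    """Illustrative connected-components sketch (hypothetical helper).

    Every vertex starts with its own id as label; each pass pushes the
    smaller label across every edge until no label changes.
    Returns (labels, number_of_passes_including_the_final_check).
    """
    labels = list(range(num_vertices))
    passes = 0
    while True:
        passes += 1
        # Gauss-Seidel: read the live array, so updates made earlier in
        # this pass are visible immediately.
        # Synchronous: read a frozen copy from the end of the last pass.
        source = labels if gauss_seidel else list(labels)
        changed = False
        for u, v in edges:
            low = min(source[u], source[v])
            if labels[u] > low:
                labels[u] = low
                changed = True
            if labels[v] > low:
                labels[v] = low
                changed = True
        if not changed:
            return labels, passes


if __name__ == "__main__":
    # A path 0-1-2-3-4 with edges listed in order along the path.
    path = [(0, 1), (1, 2), (2, 3), (3, 4)]
    print(label_propagation(5, path, gauss_seidel=True))
    print(label_propagation(5, path, gauss_seidel=False))
```

On a path whose edges are listed in order, the Gauss-Seidel variant carries the smallest label along the whole path in a single pass, while the synchronous variant advances it by only one vertex per pass. The benefit depends on the edge ordering, so this toy case should be read as an intuition aid rather than as evidence for the I/O bound stated above.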
References
Abello, J., Buchsbaum, A.L., Westbrook, J.R.: A functional approach to external graph algorithms. Algorithmica 32(3), 437–458 (2002)
Aggarwal, A., Vitter, J., et al.: The input/output complexity of sorting and related problems. Commun. ACM 31(9), 1116–1127 (1988)
Arge, L., Brodal, G.S., Toma, L.: On external-memory MST, SSSP and multi-way planar graph separation. J. Algorithms 53(2), 186–206 (2004)
Backstrom, L., Huttenlocher, D., Kleinberg, J., Lan, X.: Group formation in large social networks: Membership, growth, and evolution. In: 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 44–54. ACM, New York (2006)
Bertsekas, D.P., Tsitsiklis, J.N.: Parallel and Distributed Computation: Numerical Methods. Prentice-Hall, Upper Saddle River (1989)
Boldi, P., Santini, M., Vigna, S.: A large time-aware web graph. ACM SIGIR Forum 42(2), 33–38 (2008)
Borůvka, O.: O jistém problému minimálním (about a certain minimal problem). In: Práce, Moravské Přírodovědecké Společnosti, pp. 37–58 (1926)
Chiang, Y.J., Goodrich, M.T., Grove, E.F., Tamassia, R., Vengroff, D.E., Vitter, J.S.: External-memory graph algorithms. In: 6th Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 139–149. SIAM, Philadelphia (1995)
Dementiev, R., Kettner, L., Sanders, P.: STXXL: Standard template library for XXL data sets. In: Brodal, G.S., Leonardi, S. (eds.) ESA 2005. LNCS, vol. 3669, pp. 640–651. Springer, Heidelberg (2005)
Dementiev, R., Sanders, P., Schultes, D., Sibeyn, J.: Engineering an external memory minimum spanning tree algorithm. In: Levy, J.-J., Mayr, E.W., Mitchell, J.C. (eds.) Exploring New Frontiers of Theoretical Informatics. IFIP, vol. 155, pp. 195–208. Springer, Heidelberg (2004)
Gonzalez, J., Low, Y., Guestrin, C.: Residual splash for optimally parallelizing belief propagation. In: International Conference on Artificial Intelligence and Statistics, pp. 177–184. JMLR (2009)
Han, W.S., Lee, S., Park, K., Lee, J.H., Kim, M.S., Kim, J., Yu, H.: TurboGraph: A fast parallel graph engine handling billion-scale graphs in a single PC. In: 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 77–85. ACM, New York (2013)
Katriel, I., Meyer, U.: Elementary graph algorithms in external memory. In: Meyer, U., Sanders, P., Sibeyn, J. (eds.) Algorithms for Memory Hierarchies. LNCS, vol. 2625, pp. 62–84. Springer, Heidelberg (2003)
Kumar, V., Schwabe, E.J.: Improved algorithms and data structures for solving graph problems in external memory. In: 8th IEEE Symposium on Parallel and Distributed Processing, pp. 169–176. IEEE Press, New York (1996)
Kwak, H., Lee, C., Park, H., Moon, S.: What is Twitter, a social network or a news media? In: 19th International Conference on World Wide Web, pp. 591–600. ACM, New York (2010)
Kyrola, A., Blelloch, G., Guestrin, C.: GraphChi: Large-scale graph computation on just a PC. In: 10th USENIX Symposium on Operating Systems Design and Implementation, vol. 8, pp. 31–46. USENIX (2012)
Lambert, O., Sibeyn, J.F., Stadtwald, I.: Parallel and external list ranking and connected components. In: International Conference of Parallel and Distributed Computing and Systems, pp. 454–460. IASTED (1999)
Leskovec, J., Lang, K.J., Dasgupta, A., Mahoney, M.W.: Community structure in large networks: Natural cluster sizes and the absence of large well-defined clusters. Internet Mathematics 6(1), 29–123 (2009)
Reif, J.H.: Synthesis of Parallel Algorithms. Morgan Kaufmann, San Francisco (1993)
Roy, A., Mihailovic, I., Zwaenepoel, W.: X-Stream: edge-centric graph processing using streaming partitions. In: 24th ACM Symposium on Operating Systems Principles, pp. 472–488. ACM, New York (2013)
Sibeyn, J.F.: External connected components. In: Hagerup, T., Katajainen, J. (eds.) SWAT 2004. LNCS, vol. 3111, pp. 468–479. Springer, Heidelberg (2004)
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Kyrola, A., Shun, J., Blelloch, G. (2014). Beyond Synchronous: New Techniques for External-Memory Graph Connectivity and Minimum Spanning Forest. In: Gudmundsson, J., Katajainen, J. (eds) Experimental Algorithms. SEA 2014. Lecture Notes in Computer Science, vol 8504. Springer, Cham. https://doi.org/10.1007/978-3-319-07959-2_11
DOI: https://doi.org/10.1007/978-3-319-07959-2_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-07958-5
Online ISBN: 978-3-319-07959-2
eBook Packages: Computer Science, Computer Science (R0)