Abstract
In this paper we study the prevalent problem of graph partitioning by analyzing the diffusion-based partitioning heuristic Bubble-FOS/C, a key component of a graph partitioner that has proven successful in practice (Meyerhenke et al. in J. Parallel Distrib. Comput. 69(9):750–761, 2009).
We begin by studying the disturbed diffusion scheme FOS/C, which computes the similarity measure used in Bubble-FOS/C and is therefore its most crucial component. By relating FOS/C to random walks, we obtain precise characterizations of the behavior of FOS/C on tori and hypercubes. Besides yielding new knowledge about FOS/C (and therefore also about Bubble-FOS/C), these characterizations have recently been used in the analysis of load balancing algorithms (Berenbrink et al. in Proceedings of the 22nd Annual Symposium on Discrete Algorithms, pp. 429–439, 2011).
We then turn to Bubble-FOS/C, which has been shown in previous experiments to produce solutions with good partition shapes and other favorable properties. In this paper we prove that it computes a relaxed solution to an edge-cut-minimizing binary quadratic program (BQP). This result provides the first substantial theoretical insight into why Bubble-FOS/C yields good experimental results in terms of graph partitioning metrics. Moreover, we show that in bisections computed by Bubble-FOS/C, at least one of the two parts is connected. Using the aforementioned relation between FOS/C and random walks, we prove that in vertex-transitive graphs both parts must be connected components.





Notes
Here, the maximum degree of G is defined as \(\operatorname {deg}(G):=\max_{u\in V}\operatorname{deg}(u)\).
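As a small illustration of this definition, the following sketch computes \(\operatorname{deg}(G)\) for an undirected graph stored as an adjacency dictionary. The representation and the function name `max_degree` are assumptions made for this example and do not appear in the paper.

```python
def max_degree(adj):
    """Return deg(G) = max over all vertices u of deg(u).

    `adj` maps each vertex to the set of its neighbors
    (hypothetical representation chosen for this sketch).
    """
    if not adj:
        raise ValueError("graph has no vertices")
    return max(len(neighbors) for neighbors in adj.values())


# A star with center 0 and three leaves: the center has degree 3.
star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}

# A cycle on four vertices: every vertex has degree 2.
cycle = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}

print(max_degree(star), max_degree(cycle))
```

For regular graphs such as the tori and hypercubes mentioned in the abstract, every vertex attains the maximum, so the value equals the common degree.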
References
Alon, N., Spencer, J.H.: The Probabilistic Method, 2nd edn. Wiley, New York (2000)
Andersen, R., Chung, F.R.K., Lang, K.J.: Local graph partitioning using pagerank vectors. In: Proceedings of the 47th Annual Symposium on Foundations of Computer Science (FOCS’06), pp. 475–486 (2006)
Andersen, R., Peres, Y.: Finding sparse cuts locally using evolving sets. In: Proceedings of the 41st Annual ACM Symposium on Theory of Computing (STOC’09), pp. 235–244. ACM, New York (2009)
Andreev, K., Räcke, H.: Balanced graph partitioning. Theory Comput. Syst. 39(6), 929–939 (2006)
Bazaraa, M.S., Sherali, H.D., Shetty, C.M.: Nonlinear Programming. Theory and Algorithms, 2nd edn. Wiley, New York (1993)
Berenbrink, P., Cooper, C., Friedetzky, T., Friedrich, T., Sauerwald, T.: Randomized diffusion for indivisible loads. In: Proceedings of the 22nd Annual Symposium on Discrete Algorithms (SODA’11), pp. 429–439 (2011)
Biggs, N.: Algebraic Graph Theory. Cambridge University Press, Cambridge (1993)
Chevalier, C., Pellegrini, F.: PT-Scotch: A tool for efficient parallel graph ordering. Parallel Comput. 34(6–8), 318–331 (2008)
Coifman, R.R., Lafon, S., Lee, A.B., Maggioni, M., Nadler, B., Warner, F., Zucker, S.W.: Geometric diffusions as a tool for harmonic analysis and structure definition of data. Parts I and II. Proc. Natl. Acad. Sci. USA 102(21), 7426–7437 (2005)
Cybenko, G.: Dynamic load balancing for distributed memory multiprocessors. J. Parallel Distrib. Comput. 7, 279–301 (1989)
Dhillon, I.S., Guan, Y., Kulis, B.: Weighted graph cuts without eigenvectors: A multilevel approach. IEEE Trans. Pattern Anal. Mach. Intell. 29(11), 1944–1957 (2007)
Diaconis, P., Graham, R.L., Morrison, J.A.: Asymptotic analysis of a random walk on a hypercube with many dimensions. Random Struct. Algorithms 1(1), 51–72 (1990)
Diekmann, R., Frommer, A., Monien, B.: Efficient schemes for nearest neighbor load balancing. Parallel Comput. 25(7), 789–812 (1999)
Doyle, P.G., Snell, J.L.: Random Walks and Electric Networks. Math. Assoc. of America, Washington (1984)
Feldmann, A.E., Foschini, L.: Balanced partitions of trees and applications. In: Proceedings of the 29th International Symposium on Theoretical Aspects of Computer Science, STACS 2012, pp. 100–111 (2012)
Fiedler, M.: A property of eigenvectors of nonnegative symmetric matrices and its application to graph theory. Czechoslov. Math. J. 25, 619–633 (1975)
Garey, M.R., Johnson, D.S.: Computers and Intractability: A Guide to the Theory of NP-Completeness. Freeman, New York (1979)
Godsil, C., Royle, G.: Algebraic Graph Theory. Springer, Berlin (2001)
Golub, G.H., Van Loan, C.F.: Matrix Computations, 3rd edn. Johns Hopkins Univ. Press, Baltimore (1996)
Grady, L.: Space-variant computer vision: a graph-theoretic approach. PhD thesis, Boston University, Boston, MA (2004)
Grimmett, G.R., Stirzaker, D.R.: Probability and Random Processes, 3rd edn. Oxford University Press, Oxford (2001)
Hendrickson, B., Leland, R.: An improved spectral graph partitioning algorithm for mapping parallel computations. SIAM J. Sci. Comput. 16(2), 452–469 (1995)
Karypis, G., Kumar, V.: Multilevel k-way partitioning scheme for irregular graphs. J. Parallel Distrib. Comput. 48(1), 96–129 (1998)
Kaufmann, H., Pape, H.: Clusteranalyse. In: Fahrmeir, L., Hamerle, A., Tutz, G. (eds.) Multivariate statistische Verfahren, 2nd edn. de Gruyter, Berlin (1996)
Kemeny, J.G., Snell, J.L.: Finite Markov Chains. Springer, Berlin (1976)
Kernighan, B.W., Lin, S.: An efficient heuristic for partitioning graphs. Bell Syst. Tech. J. 49, 291–308 (1970)
Leighton, F.T.: Introduction to Parallel Algorithms and Architectures: Arrays, Trees, Hypercubes. Morgan Kaufmann, San Mateo (1992)
Lloyd, S.P.: Least squares quantization in PCM. IEEE Trans. Inf. Theory 28(2), 129–136 (1982)
Lovász, L.: Random walks on graphs: a survey. Combinatorics 2, 1–46 (1993)
Meila, M., Shi, J.: A random walks view of spectral segmentation. In: 8th International Workshop on Artificial Intelligence and Statistics (AISTATS) (2001)
Meyerhenke, H.: Disturbed diffusive processes for solving partitioning problems on graphs. PhD thesis, Universität Paderborn (2008)
Meyerhenke, H.: Beyond good shapes: Diffusion-based graph partitioning is relaxed cut optimization. In: Proceedings of the 21st International Symposium on Algorithms and Computation (ISAAC’10), Part II. Lecture Notes in Computer Science, vol. 6507, pp. 387–398. Springer, Berlin (2010)
Meyerhenke, H., Monien, B., Sauerwald, T.: A new diffusion-based multilevel algorithm for computing graph partitions. J. Parallel Distrib. Comput. 69(9), 750–761 (2009). Best Paper Awards and Panel Summary: IPDPS 2008
Meyerhenke, H., Monien, B., Schamberger, S.: Graph partitioning and disturbed diffusion. Parallel Comput. 35(10–11), 544–569 (2009)
Meyerhenke, H., Sauerwald, T.: Analyzing disturbed diffusion on networks. In: Proceedings of the 17th International Symposium on Algorithms and Computation (ISAAC’06), pp. 429–438. Springer, Berlin (2006)
Nadler, B., Lafon, S., Coifman, R.R., Kevrekidis, I.G.: Diffusion maps, spectral clustering and eigenfunctions of Fokker-Planck operators. In: Proceedings of Advances in Neural Information Processing Systems 18 (NIPS’05) (2005)
Pellegrini, F.: A parallelisable multi-level banded diffusion scheme for computing balanced partitions with smooth boundaries. In: Proceedings of the 13th International Euro-Par Conference (EURO-PAR’07). Lecture Notes in Computer Science, vol. 4641, pp. 195–204. Springer, Berlin (2007)
Rabani, Y., Sinclair, A., Wanka, R.: Local divergence of Markov chains and the analysis of iterative load balancing schemes. In: Proceedings of the 39th Annual Symposium on Foundations of Computer Science (FOCS’98), pp. 694–705 (1998)
Räcke, H.: Optimal hierarchical decompositions for congestion minimization in networks. In: Proc. 40th Annual ACM Symposium on Theory of Computing, Victoria, British Columbia, Canada, May 17–20, 2008, pp. 255–264 (2008)
Saerens, M., Fouss, F., Yen, L., Dupont, P.: The principal components analysis of a graph, and its relationship to spectral clustering. In: Proceedings of the 15th European Conference on Machine Learning (ECML’04), pp. 371–383 (2004)
Schaeffer, S.E.: Graph clustering. Comput. Sci. Rev. 1(1), 27–64 (2007)
Schloegel, K., Karypis, G., Kumar, V.: Graph partitioning for high performance scientific simulations. In: The Sourcebook of Parallel Computing, pp. 491–541. Morgan Kaufmann, San Mateo (2003)
Shi, J., Malik, J.: Normalized cuts and image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 22(8), 888–905 (2000)
The BlueGene/L Team: An overview of the BlueGene/L supercomputer. In: Proceedings of the 2002 ACM/IEEE Conference on Supercomputing, pp. 1–22. ACM, New York (2002)
Tishby, N., Slonim, N.: Data clustering by Markovian relaxation and the information bottleneck method. In: Proceedings of Advances in Neural Information Processing Systems 13 (NIPS), pp. 640–646 (2000)
Trefethen, L.N., Bau, D.: Numerical Linear Algebra. SIAM, Philadelphia (1997)
Trottenberg, U., Oosterlee, C.W., Schüller, A.: Multigrid. Academic Press, San Diego (2000)
van Dongen, S.: Graph clustering by flow simulation. PhD thesis, University of Utrecht (2000)
Walshaw, C.: The graph partitioning archive. http://staffweb.cms.gre.ac.uk/~c.walshaw/partition/ (2010). Last access: 31 May 2012
Xu, C., Lau, F.C.M.: Load Balancing in Parallel Computers. Kluwer, Dordrecht (1997)
Yen, L., Vanvyve, D., Wouters, F., Fouss, F., Verleysen, M., Saerens, M.: Clustering using a random-walk based distance measure. In: Proceedings of the 13th European Symposium on Artificial Neural Networks (ESANN’05), pp. 317–324 (2005)
Zha, H., He, X., Ding, C.H.Q., Gu, M., Simon, H.D.: Spectral relaxation for k-means clustering. In: Proceedings of Advances in Neural Information Processing Systems 14 (NIPS), pp. 1057–1064. MIT Press, Cambridge (2001)
Acknowledgements
The authors thank Christoph Buchheim, Burkhard Monien, Peter Sanders, and Christian Schulz for helpful discussions.
Additional information
Parts of this paper have been published in a preliminary form in the Proceedings of the 17th and 21st International Symposium on Algorithms and Computation (ISAAC 2006 and ISAAC 2010) [32, 35].
This work was partially supported by the German Research Foundation (DFG) Research Training Group GK-693 of the Paderborn Institute for Scientific Computation and by the DFG Priority Programme 1307 Algorithm Engineering. H. Meyerhenke was also partially supported by the CASS-MT Center led by Pacific Northwest National Laboratory and by NSF Grant CNS-0708307. Parts of this work were performed while the authors were affiliated with the Department of Computer Science, University of Paderborn, Germany, and while H.M. was affiliated with the Georgia Institute of Technology, Atlanta, Georgia, USA.
Cite this article
Meyerhenke, H., Sauerwald, T. Beyond Good Partition Shapes: An Analysis of Diffusive Graph Partitioning. Algorithmica 64, 329–361 (2012). https://doi.org/10.1007/s00453-012-9666-y
DOI: https://doi.org/10.1007/s00453-012-9666-y