Beyond Good Partition Shapes: An Analysis of Diffusive Graph Partitioning

Algorithmica

Abstract

In this paper we study the prevalent problem of graph partitioning by analyzing the diffusion-based partitioning heuristic Bubble-FOS/C, a key component of a practically successful graph partitioner (Meyerhenke et al. in J. Parallel Distrib. Comput. 69(9):750–761, 2009).

We begin by studying the disturbed diffusion scheme FOS/C, which computes the similarity measure used in Bubble-FOS/C and is therefore its most crucial component. By relating FOS/C to random walks, we obtain precise characterizations of the behavior of FOS/C on tori and hypercubes. Besides leading to new knowledge on FOS/C (and therefore also on Bubble-FOS/C), these characterizations have recently been used for the analysis of load balancing algorithms (Berenbrink et al. in Proceedings of the 22nd Annual Symposium on Discrete Algorithms, pp. 429–439, 2011).
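
As a rough illustration of the kind of scheme analyzed here, the following Python sketch iterates a diffusion process with a constant drain on a small cycle (a one-dimensional torus). The abstract does not spell out FOS/C, so the update rule w ← (I − αL)w + d, the zero-sum drain vector d, the choice α = 1/(deg(G) + 1), and the names `fosc` and `cycle` are all assumptions made for this example, not the authors' implementation.

```python
def fosc(adj, sources, alpha=None, delta=1.0, iters=2000):
    """Iterate w <- (I - alpha * L) w + d (assumed form of a drained diffusion)."""
    n = len(adj)
    max_deg = max(len(nbrs) for nbrs in adj)  # deg(G), the maximum degree
    if alpha is None:
        alpha = 1.0 / (max_deg + 1)           # assumed step size
    src = set(sources)
    # Assumed drain vector: sources gain load, all vertices lose delta; sums to 0.
    d = [delta * (n / len(src) - 1.0) if v in src else -delta for v in range(n)]
    w = [0.0] * n
    for _ in range(iters):
        # One diffusion step along every edge, followed by the drain.
        w = [w[v] - alpha * sum(w[v] - w[u] for u in adj[v]) + d[v]
             for v in range(n)]
    return w


def cycle(n):
    """Adjacency lists of the n-cycle (hypothetical helper)."""
    return [[(v - 1) % n, (v + 1) % n] for v in range(n)]


loads = fosc(cycle(8), sources=[0])
for v, load in enumerate(loads):
    print(v, round(load, 3))
```

In this sketch the converged load is largest at the source vertex and decreases with graph distance from it, which is the sense in which such loads can serve as a similarity measure between vertices and sources.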

We then turn to Bubble-FOS/C, which has been shown in previous experiments to produce solutions with good partition shapes and other favorable properties. In this paper we prove that it computes a relaxed solution to an edge-cut-minimizing binary quadratic program (BQP). This result provides the first substantial theoretical insight into why Bubble-FOS/C yields good experimental results in terms of graph partitioning metrics. Moreover, we show that in bisections computed by Bubble-FOS/C, at least one of the two parts is connected. Using the aforementioned relation between FOS/C and random walks, we prove that in vertex-transitive graphs both parts must be connected.
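
The connection between edge cut and a quadratic objective can be illustrated with a standard identity: if a bisection is encoded as a vector x with entries in {−1, +1}, the number of cut edges equals one quarter of the sum of (x_u − x_v)^2 over all edges, that is, (1/4) x^T L x for the graph Laplacian L. The Python sketch below checks this on a 6-cycle. It is a generic illustration with hypothetical function names, and it is not necessarily the exact BQP formulated in the paper.

```python
def edge_cut(edges, x):
    """Count edges whose endpoints lie in different parts."""
    return sum(1 for u, v in edges if x[u] != x[v])


def laplacian_form(edges, x):
    """x^T L x for an unweighted graph, written as a sum over edges."""
    return sum((x[u] - x[v]) ** 2 for u, v in edges)


edges = [(v, (v + 1) % 6) for v in range(6)]  # 6-cycle
x = [1, 1, 1, -1, -1, -1]                     # bisection into two paths

cut = edge_cut(edges, x)
form = laplacian_form(edges, x)
print(cut, form / 4)
```

Dropping the requirement that the entries of x be ±1 and allowing real values turns the combinatorial problem into a continuous one, which is the general idea behind calling a solution "relaxed".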

Notes

  1. Here, the maximum degree of G is defined as \(\operatorname {deg}(G):=\max_{u\in V}\operatorname{deg}(u)\).

References

  1. Alon, N., Spencer, J.H.: The Probabilistic Method, 2nd edn. Wiley, New York (2000)

  2. Andersen, R., Chung, F.R.K., Lang, K.J.: Local graph partitioning using pagerank vectors. In: Proceedings of the 47th Annual Symposium on Foundations of Computer Science (FOCS’06), pp. 475–486 (2006)

  3. Andersen, R., Peres, Y.: Finding sparse cuts locally using evolving sets. In: Proceedings of the 41st Annual ACM Symposium on Theory of Computing (STOC’09), pp. 235–244. ACM, New York (2009)

  4. Andreev, K., Räcke, H.: Balanced graph partitioning. Theory Comput. Syst. 39(6), 929–939 (2006)

  5. Bazaraa, M.S., Sherali, H.D., Shetty, C.M.: Nonlinear Programming. Theory and Algorithms, 2nd edn. Wiley, New York (1993)

  6. Berenbrink, P., Cooper, C., Friedetzky, T., Friedrich, T., Sauerwald, T.: Randomized diffusion for indivisible loads. In: Proceedings of the 22nd Annual Symposium on Discrete Algorithms (SODA’11), pp. 429–439 (2011)

  7. Biggs, N.: Algebraic Graph Theory. Cambridge University Press, Cambridge (1993)

  8. Chevalier, C., Pellegrini, F.: PT-Scotch: A tool for efficient parallel graph ordering. Parallel Comput. 34(6–8), 318–331 (2008)

  9. Coifman, R.R., Lafon, S., Lee, A.B., Maggioni, M., Nadler, B., Warner, F., Zucker, S.W.: Geometric diffusions as a tool for harmonic analysis and structure definition of data. Parts I and II. Proc. Natl. Acad. Sci. USA 102(21), 7426–7437 (2005)

  10. Cybenko, G.: Dynamic load balancing for distributed memory multiprocessors. J. Parallel Distrib. Comput. 7, 279–301 (1989)

  11. Dhillon, I.S., Guan, Y., Kulis, B.: Weighted graph cuts without eigenvectors: A multilevel approach. IEEE Trans. Pattern Anal. Mach. Intell. 29(11), 1944–1957 (2007)

  12. Diaconis, P., Graham, R.L., Morrison, J.A.: Asymptotic analysis of a random walk on a hypercube with many dimensions. Random Struct. Algorithms 1(1), 51–72 (1990)

  13. Diekmann, R., Frommer, A., Monien, B.: Efficient schemes for nearest neighbor load balancing. Parallel Comput. 25(7), 789–812 (1999)

  14. Doyle, P.G., Snell, J.L.: Random Walks and Electric Networks. Math. Assoc. of America, Washington (1984)

  15. Feldmann, A.E., Foschini, L.: Balanced partitions of trees and applications. In: Proceedings of the 29th International Symposium on Theoretical Aspects of Computer Science, STACS 2012, pp. 100–111 (2012)

  16. Fiedler, M.: A property of eigenvectors of nonnegative symmetric matrices and its application to graph theory. Czechoslov. Math. J. 25, 619–633 (1975)

  17. Garey, M.R., Johnson, D.S.: Computers and Intractability: A Guide to the Theory of NP-Completeness. Freeman, New York (1979)

  18. Godsil, C., Royle, G.: Algebraic Graph Theory. Springer, Berlin (2001)

  19. Golub, G.H., Van Loan, C.F.: Matrix Computations, 3rd edn. Johns Hopkins Univ. Press, Baltimore (1996)

  20. Grady, L.: Space-variant computer vision: a graph-theoretic approach. PhD thesis, Boston University, Boston, MA (2004)

  21. Grimmett, G.R., Stirzaker, D.R.: Probability and Random Processes, 3rd edn. Oxford University Press, Oxford (2001)

  22. Hendrickson, B., Leland, R.: An improved spectral graph partitioning algorithm for mapping parallel computations. SIAM J. Sci. Comput. 16(2), 452–469 (1995)

  23. Karypis, G., Kumar, V.: Multilevel k-way partitioning scheme for irregular graphs. J. Parallel Distrib. Comput. 48(1), 96–129 (1998)

  24. Kaufmann, H., Pape, H.: Clusteranalyse. In: Fahrmeir, L., Hamerle, A., Tutz, G. (eds.) Multivariate statistische Verfahren, 2nd edn. de Gruyter, Berlin (1996)

  25. Kemeny, J.G., Snell, J.L.: Finite Markov Chains. Springer, Berlin (1976)

  26. Kernighan, B.W., Lin, S.: An efficient heuristic for partitioning graphs. Bell Syst. Tech. J. 49, 291–308 (1970)

  27. Leighton, F.T.: Introduction to Parallel Algorithms and Architectures: Arrays, Trees, Hypercubes. Morgan Kaufmann, San Mateo (1992)

  28. Lloyd, S.P.: Least squares quantization in PCM. IEEE Trans. Inf. Theory 28(2), 129–136 (1982)

  29. Lovász, L.: Random walks on graphs: a survey. Combinatorics 2, 1–46 (1993)

  30. Meila, M., Shi, J.: A random walks view of spectral segmentation. In: 8th International Workshop on Artificial Intelligence and Statistics (AISTATS) (2001)

  31. Meyerhenke, H.: Disturbed diffusive processes for solving partitioning problems on graphs. PhD thesis, Universität Paderborn (2008)

  32. Meyerhenke, H.: Beyond good shapes: Diffusion-based graph partitioning is relaxed cut optimization. In: Proceedings of the 21st International Symposium on Algorithms and Computation (ISAAC’10), Part II. Lecture Notes in Computer Science, vol. 6507, pp. 387–398. Springer, Berlin (2010)

  33. Meyerhenke, H., Monien, B., Sauerwald, T.: A new diffusion-based multilevel algorithm for computing graph partitions. J. Parallel Distrib. Comput. 69(9), 750–761 (2009) Best Paper Awards and Panel Summary: IPDPS 2008

  34. Meyerhenke, H., Monien, B., Schamberger, S.: Graph partitioning and disturbed diffusion. Parallel Comput. 35(10–11), 544–569 (2009)

  35. Meyerhenke, H., Sauerwald, T.: Analyzing disturbed diffusion on networks. In: Proceedings of the 17th International Symposium on Algorithms and Computation (ISAAC’06), pp. 429–438. Springer, Berlin (2006)

  36. Nadler, B., Lafon, S., Coifman, R.R., Kevrekidis, I.G.: Diffusion maps, spectral clustering and eigenfunctions of Fokker-Planck operators. In: Proceedings of Advances in Neural Information Processing Systems 18 (NIPS’05) (2005)

  37. Pellegrini, F.: A parallelisable multi-level banded diffusion scheme for computing balanced partitions with smooth boundaries. In: Proceedings of the 13th International Euro-Par Conference (EURO-PAR’07). Lecture Notes in Computer Science, vol. 4641, pp. 195–204. Springer, Berlin (2007)

  38. Rabani, Y., Sinclair, A., Wanka, R.: Local divergence of Markov chains and the analysis of iterative load balancing schemes. In: Proceedings of the 39th Annual Symposium on Foundations of Computer Science (FOCS’98), pp. 694–705 (1998)

  39. Räcke, H.: Optimal hierarchical decompositions for congestion minimization in networks. In: Proc. 40th Annual ACM Symposium on Theory of Computing, Victoria, British Columbia, Canada, May 17–20, 2008, pp. 255–264 (2008)

  40. Saerens, M., Fouss, F., Yen, L., Dupont, P.: The principal components analysis of a graph, and its relationship to spectral clustering. In: Proceedings of the 15th European Conference on Machine Learning (ECML’04), pp. 371–383 (2004)

  41. Schaeffer, S.E.: Graph clustering. Comput. Sci. Rev. 1(1), 27–64 (2007)

  42. Schloegel, K., Karypis, G., Kumar, V.: Graph partitioning for high performance scientific simulations. In: The Sourcebook of Parallel Computing, pp. 491–541. Morgan Kaufmann, San Mateo (2003)

  43. Shi, J., Malik, J.: Normalized cuts and image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 22(8), 888–905 (2000)

  44. The BlueGene/L Team: An overview of the BlueGene/L supercomputer. In: Proceedings of the 2002 ACM/IEEE Conference on Supercomputing, pp. 1–22. ACM, New York (2002)

  45. Tishby, N., Slonim, N.: Data clustering by Markovian relaxation and the information bottleneck method. In: Proceedings of Advances in Neural Information Processing Systems 13 (NIPS), pp. 640–646 (2000)

  46. Trefethen, L.N., Bau, D.: Numerical Linear Algebra. SIAM, Philadelphia (1997)

  47. Trottenberg, U., Oosterlee, C.W., Schüller, A.: Multigrid. Academic Press, San Diego (2000)

  48. van Dongen, S.: Graph clustering by flow simulation. PhD thesis, University of Utrecht (2000)

  49. Walshaw, C.: The graph partitioning archive. http://staffweb.cms.gre.ac.uk/~c.walshaw/partition/ (2010). Last access: 31 May 2012

  50. Xu, C., Lau, F.C.M.: Load Balancing in Parallel Computers. Kluwer, Dordrecht (1997)

  51. Yen, L., Vanvyve, D., Wouters, F., Fouss, F., Verleysen, M., Saerens, M.: Clustering using a random-walk based distance measure. In: Proceedings of the 13th European Symposium on Artificial Neural Networks (ESANN’05), pp. 317–324 (2005)

  52. Zha, H., He, X., Ding, C.H.Q., Gu, M., Simon, H.D.: Spectral relaxation for k-means clustering. In: Proceedings of Advances in Neural Information Processing Systems 14 (NIPS), pp. 1057–1064. MIT Press, Cambridge (2001)

Acknowledgements

The authors thank Christoph Buchheim, Burkhard Monien, Peter Sanders, and Christian Schulz for helpful discussions.

Author information

Corresponding author

Correspondence to Henning Meyerhenke.

Additional information

Parts of this paper have been published in a preliminary form in the Proceedings of the 17th and 21st International Symposium on Algorithms and Computation (ISAAC 2006 and ISAAC 2010) [32, 35].

This work was partially supported by the German Research Foundation (DFG) Research Training Group GK-693 of the Paderborn Institute for Scientific Computation and by DFG Priority Programme 1307 Algorithm Engineering. H. Meyerhenke was also partially supported by the CASS-MT Center led by Pacific Northwest National Laboratory and by NSF Grant CNS-0708307. Parts of this work were performed while the authors were affiliated with the Department of Computer Science, University of Paderborn, Germany, and while H.M. was affiliated with Georgia Institute of Technology, Atlanta, Georgia, USA.

About this article

Cite this article

Meyerhenke, H., Sauerwald, T. Beyond Good Partition Shapes: An Analysis of Diffusive Graph Partitioning. Algorithmica 64, 329–361 (2012). https://doi.org/10.1007/s00453-012-9666-y

  • DOI: https://doi.org/10.1007/s00453-012-9666-y
