Regularity lemmas for clustering graphs

https://doi.org/10.1016/j.aam.2019.101961

Abstract

For a graph $G$ with a positive clustering coefficient $C$, it is proved that for any positive constant ϵ, the vertex set of $G$ can be partitioned into finitely many parts, say $S_1, S_2, \ldots, S_m$, such that all but an ϵ fraction of the triangles in $G$ are contained in the projections of tripartite subgraphs induced by $(S_i, S_j, S_k)$ which are ϵ-Δ-regular, where the size $m$ of the partition depends only on ϵ and $C$. The notion of ϵ-Δ-regular, which is a variation of ϵ-regular for the original regularity lemma, concerns triangle density instead of edge density. Several generalizations and variations of the regularity lemma for clustering graphs are derived.

Introduction

One of the celebrated results of Szemerédi [19] is the so-called regularity lemma, which asserts that for any graph on $n$ vertices, the vertex set can be partitioned into finitely many parts so that all but at most $\epsilon n^2$ edges are contained in the union of bipartite subgraphs between pairs of parts that are random-like in the sense of being ϵ-regular. A bipartite graph is said to be ϵ-regular if the edge density of any induced sub-bipartite graph on at least $\epsilon n$ vertices differs from the edge density of the bipartite graph by at most ϵ. The regularity lemma has been a powerful tool in graph theory with numerous applications [11], [14], [17] because any graph (with more than $\epsilon n^2$ edges) can be approximated by a finite graph, in the sense that each vertex of the finite graph can be replaced by a subset of vertices and the bipartite subgraphs between any two subsets are quasirandom.
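For reference, the standard textbook formulation of ϵ-regularity for a pair of vertex sets can be written as below. The symbols $A$, $B$, $X$, $Y$, $e(\cdot,\cdot)$ and $d(\cdot,\cdot)$ are notation introduced here for illustration, and the size threshold is stated relative to the parts, which may differ slightly from the phrasing used in the paper.

```latex
% e(X, Y): number of edges with one endpoint in X and the other in Y.
d(X, Y) = \frac{e(X, Y)}{|X|\,|Y|},
\qquad
(A, B) \text{ is } \epsilon\text{-regular if }
\bigl| d(X, Y) - d(A, B) \bigr| \le \epsilon
\text{ for all } X \subseteq A,\ Y \subseteq B
\text{ with } |X| \ge \epsilon |A|,\ |Y| \ge \epsilon |B|.
```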

A major deficiency of the regularity lemma is that it is useful only for graphs with a positive edge density, since the error bound of the approximation is of order $\epsilon n^2$. There have been numerous attempts to extend the regularity lemma to sparse graphs, mostly with either additional assumptions [13] or weakened conditions [9], [18].

In this paper, we give a regularity lemma for clustering graphs without any restriction on edge density. We note that many information networks and social network graphs contain a large number of triangles and thus have nontrivial clustering coefficients [16], [20]. Such a clustering effect is one of the main characteristics of the so-called “small world phenomenon” that appears in a variety of real-world graphs [15]. There are many research papers on finding dense subgraphs [2], [3] or partitioning into dense clique-like subgraphs [12] for such small-world graphs.

In this paper, we focus on graphs with nontrivial clustering coefficients (or triangle density). Let $t_G$ denote the number of triangles in $G$ and $p_G$ denote the number of paths of two edges. The clustering coefficient $C_G$ is defined to be (see [16])

$$C_G = \frac{3 t_G}{p_G}. \tag{1}$$

If $p_G = 0$, we define $C_G = 0$. We say $G$ is a clustering graph if its clustering coefficient $C_G$ is a positive constant independent of the number of vertices of $G$.
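As a small illustration of this definition, the following sketch computes the clustering coefficient of a simple undirected graph from an edge list. The function name `clustering_coefficient` and the edge-list input format are assumptions made for this example, not something taken from the paper.

```python
from itertools import combinations


def clustering_coefficient(edges):
    """Return 3 * t_G / p_G for a simple undirected graph given as an edge list.

    Illustrative helper (hypothetical name); loops are ignored and repeated
    edges are merged, so the input is treated as a simple graph.
    """
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore loops
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    # p_G: paths of two edges, counted by their middle vertex.
    paths = sum(len(nbrs) * (len(nbrs) - 1) // 2 for nbrs in adj.values())
    if paths == 0:
        return 0.0  # convention from the text: C_G = 0 when p_G = 0

    # Each triangle closes one two-edge path at each of its three vertices,
    # so this sum equals 3 * t_G.
    closed = sum(
        1
        for nbrs in adj.values()
        for a, b in combinations(nbrs, 2)
        if b in adj[a]
    )
    return closed / paths
```

For a triangle with one pendant edge, `clustering_coefficient([(0, 1), (1, 2), (0, 2), (2, 3)])` has $t_G = 1$ and $p_G = 5$, so the value is $3/5$.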

Theorem 1

For any $\epsilon > 0$ and any graph $G$ with clustering coefficient $C$, the vertex set of $G$ can be partitioned into $S_1, S_2, \ldots, S_m$ for some $m$ depending only on ϵ and $C$, such that all but $\epsilon t_G$ triangles in $G$ are contained in the projections of tripartite subgraphs with vertex set $(S_i, S_j, S_k)$ that are ϵ-Δ-regular.

The detailed definitions of the terms above will be given in Section 2. The proof of the regularity lemma for clustering graphs is quite similar to the previous proofs of the original regularity lemma [4], [14], [19], except that it uses an index function involving clustering coefficients. In Section 3 we give a proof of the regularity lemma for tripartite graphs with nontrivial clustering coefficient. The proof is self-contained and relatively short. In Section 4 we then consider a strong version of ϵ-Δ-regular for tripartite graphs. In Section 5 we give a proof of Theorem 1 and a weighted version of the regularity lemma, both of which are straightforward applications of the regularity lemma for tripartite graphs with nontrivial clustering coefficients. In Section 6, we consider several generalizations of the regularity lemma. We give a regularity lemma for graphs which are dense in 4-cycles and, in general, for graphs which contain a relatively large number of copies of any specified graph (in comparison with its subgraphs). Some remarks and problems are mentioned in Section 7.

Preliminaries

We consider a tripartite graph $H$ whose vertex set is the disjoint union $V_1 \cup V_2 \cup V_3$. Any triangle in $H$ has one vertex in each $V_i$ for $i = 1, 2, 3$. Let $t_H$ denote the number of triangles in $H$. Let $p_H$ denote the number of triples $(v_1, v_2, v_3)$ with $v_i \in V_i$ such that $\{v_1, v_2\}$ and $\{v_2, v_3\}$ are edges in $H$. The clustering coefficient of a tripartite graph is defined to be

$$c_H = \frac{t_H}{p_H}.$$
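The tripartite quantities can be sketched in the same way. In the example below, the names `tripartite_counts` and `tripartite_clustering`, the three edge-list arguments, and the choice to return 0 when $p_H = 0$ (mirroring the convention for $C_G$) are all assumptions made for illustration.

```python
def tripartite_counts(e12, e23, e13):
    """Return (t_H, p_H) for a tripartite graph given by three edge lists.

    Hypothetical input format: e12 holds pairs (v1, v2), e23 holds pairs
    (v2, v3) and e13 holds pairs (v1, v3), with v_i taken from V_i.
    """
    e13 = set(e13)
    left = {}   # v2 -> its neighbours in V1
    right = {}  # v2 -> its neighbours in V3
    for v1, v2 in e12:
        left.setdefault(v2, set()).add(v1)
    for v2, v3 in e23:
        right.setdefault(v2, set()).add(v3)

    p = t = 0
    for v2, n1 in left.items():
        n3 = right.get(v2, set())
        p += len(n1) * len(n3)  # triples (v1, v2, v3) with both edges present
        t += sum(1 for v1 in n1 for v3 in n3 if (v1, v3) in e13)
    return t, p


def tripartite_clustering(e12, e23, e13):
    """Return t_H / p_H, using 0.0 when p_H = 0 (an assumed convention)."""
    t, p = tripartite_counts(e12, e23, e13)
    return t / p if p else 0.0
```

With $V_1 = \{a_1, a_2\}$, $V_2 = \{b\}$, $V_3 = \{c_1, c_2\}$, all four edges through $b$ present and the single edge $\{a_1, c_1\}$ between $V_1$ and $V_3$, there are four triples counted by $p_H$ and one triangle, so $c_H = 1/4$.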

For a graph $G = (V, E)$, it is helpful to consider the associated tripartite graph of $G$, whose vertex set is the disjoint union $V_1 \cup V_2 \cup V_3$, where $V_i$ is a

A regularity lemma for tripartite graphs

We first prove the following version of the regularity lemma for tripartite clustering graphs.

Theorem 2

For any $\epsilon > 0$ and any tripartite graph $H$ with clustering coefficient $c$, the vertex set $V_1 \cup V_2 \cup V_3$ of $H$ can be partitioned into $S_1, S_2, \ldots, S_m$ for some $m$ depending only on ϵ and $c$, such that all but $\epsilon t_H$ triangles in $H$ are contained in the ϵ-Δ-regular tripartite subgraphs with vertex set $S_i \cup S_j \cup S_k$.

Proof

For a partition $P$ consisting of partitions $P_i$ of $V_i$, for $i = 1, 2, 3$, we define the index function $I(P)$: $I(P) = I(P_1, P_2, P_3)$

A strong regularity lemma for tripartite graphs

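For a tripartite graph $H$ with vertex set $T_1 \cup T_2 \cup T_3$, we consider some variations of the clustering coefficient. Recall that

$$c_H = c_H^{(1)} = c(T_1, T_2, T_3) = \frac{t(T_1, T_2, T_3)}{p(T_1, T_2, T_3)}.$$

We define

$$p_H^{(2)} = p(T_2, T_3, T_1), \qquad c_H^{(2)} = c(T_2, T_3, T_1),$$

and

$$p_H^{(3)} = p(T_3, T_1, T_2), \qquad c_H^{(3)} = c(T_3, T_1, T_2).$$

For $j = 1, 2, 3$, we say a tripartite graph with vertex set $T_1 \cup T_2 \cup T_3$ is ϵ-$\Delta^{(j)}$-regular if for any $S_i \subseteq T_i$, $i = 1, 2, 3$, with $p^{(j)}(S_1, S_2, S_3) \ge \epsilon\, p^{(j)}(T_1, T_2, T_3)$, we have

$$\bigl| c^{(j)}(S_1, S_2, S_3) - c^{(j)}(T_1, T_2, T_3) \bigr| \le \epsilon.$$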
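The rotated quantities and the regularity condition can be illustrated with a short sketch. It assumes the graph is stored as a dictionary `adj` mapping every vertex to its set of neighbours, with vertex names distinct across the three parts; the helper names `p_count`, `t_count`, `c_rot` and `violates` are hypothetical. The last function only tests whether one given sub-triple witnesses a failure of the condition, reading the inequalities as "at least" and "at most"; it does not decide regularity, which would require examining all sub-triples.

```python
def p_count(adj, A, B, C):
    """p(A, B, C): triples (a, b, c) with {a, b} and {b, c} edges; b is the middle."""
    return sum(len(adj[b] & A) * len(adj[b] & C) for b in B)


def t_count(adj, A, B, C):
    """t(A, B, C): triangles with one vertex in each of A, B, C."""
    return sum(
        1
        for b in B
        for a in adj[b] & A
        for c in adj[b] & C
        if c in adj[a]
    )


def rotate(parts, j):
    """Cyclic rotation: j=1 -> (T1,T2,T3), j=2 -> (T2,T3,T1), j=3 -> (T3,T1,T2)."""
    return [parts[(j - 1 + k) % 3] for k in range(3)]


def c_rot(adj, parts, j):
    """c^(j) of the triple `parts`, taken as 0.0 when p^(j) = 0 (assumed convention)."""
    A, B, C = rotate(parts, j)
    p = p_count(adj, A, B, C)
    return t_count(adj, A, B, C) / p if p else 0.0


def violates(adj, T, S, j, eps):
    """True if the sub-triple S shows that T is not eps-Delta^(j)-regular."""
    large = p_count(adj, *rotate(S, j)) >= eps * p_count(adj, *rotate(T, j))
    return large and abs(c_rot(adj, S, j) - c_rot(adj, T, j)) > eps
```

The design keeps a single pair of counting routines and obtains the three variants purely by reordering the parts, which matches the way the definitions above are stated.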

We say a tripartite graph with vertex set $T_1 \cup T_2 \cup T_3$ is strongly ϵ

Regularity lemmas for triangle-dense graphs

In a graph $G = (V, E)$, we consider the associated tripartite graph $G(V_1, V_2, V_3)$, where the $V_i$'s are copies of $V$. For any three subsets $S_1, S_2, S_3 \subseteq V$, not necessarily distinct, we consider the associated induced subgraph of this tripartite graph, denoted by $G(T_1, T_2, T_3)$, where $T_i$ is the copy of $S_i$ in $V_i$. For a triple $(v_1, v_2, v_3)$ where $v_i \in S_i$, we note that $v_1, v_2, v_3$ form a triangle in $G$ if and only if $(v_1, v_2, v_3)$ forms a triangle in $G(T_1, T_2, T_3)$. In other words, the triangles in $G$ are in one-to-one correspondence with

Several regularity lemmas for general clustering graphs

Many information networks are bipartite and therefore do not have a nontrivial clustering coefficient as defined in (1). Nevertheless, some of these graphs contain a relatively large number of 4-cycles $C_4$. For a graph $G$, we define the $C_4$-clustering coefficient of $G$ by

$$C(G; C_4) = \frac{4 N(G; C_4)}{N(G; P_4)},$$

where $N(G; H)$ denotes the number of subgraphs of $G$ isomorphic to $H$. The usual clustering coefficient is just $C(G; C_3)$.
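A minimal sketch of this coefficient follows, under the reading that $P_4$ is the path on four vertices (three edges), which is the reading consistent with the factor 4 since every 4-cycle contains exactly four such paths. The function name `c4_clustering` and the edge-list input are assumptions made for this example, and the graph is assumed to be simple with no loops.

```python
from itertools import combinations


def c4_clustering(edges):
    """Return 4 * N(G; C_4) / N(G; P_4) for a simple loop-free graph (edge list).

    Illustrative helper (hypothetical name); P_4 is read as the path on four
    vertices, and the value is taken to be 0.0 when N(G; P_4) = 0.
    """
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    # N(G; C_4): a 4-cycle is a pair of common neighbours of a diagonal {u, v},
    # and every 4-cycle has exactly two diagonals.
    by_diagonal = 0
    for u, v in combinations(adj, 2):
        k = len(adj[u] & adj[v])
        by_diagonal += k * (k - 1) // 2
    cycles = by_diagonal // 2

    # N(G; P_4): pick the middle edge {u, v}, then end vertices x ~ u and y ~ v
    # with x != y; common neighbours of u and v are the excluded case x == y.
    paths = 0
    for u, v in {frozenset(e) for e in edges}:
        paths += (len(adj[u]) - 1) * (len(adj[v]) - 1) - len(adj[u] & adj[v])

    return 4 * cycles / paths if paths else 0.0
```

For the 4-cycle itself the function returns 1, and for a path on four vertices it returns 0, which parallels the behaviour of the usual clustering coefficient on a triangle and on a path of two edges.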

Before we define ϵ-$C_4$-regular, we consider the 4-partite graph with vertex set $V_1$

Problems and remarks

A natural question is to derive a reasonable upper bound for the size of the ϵ-Δ-regular partition for clustering graphs. A crude upper bound as mentioned in the proof of Theorem 1 is of tower type, namely, a tower of 2's of height proportional to $1/\epsilon^5$ where $C$ is the clustering coefficient and ϵ is the desired accuracy. For the original regularity lemma, Gowers [10] gave a lower bound for the size of the partition as a tower of 2's of height $1/\epsilon^{1/16}$. With a slightly different definition of

References (20)

[1] N. Alon et al., The algorithmic aspects of the regularity lemma, J. Algorithms (1994).
[2] R. Andersen, A local algorithm for finding dense subgraphs.
[3] M. Charikar, Greedy approximation algorithms for finding dense components in a graph.
[4] F. Chung, Regularity lemmas for hypergraphs and quasi-randomness, Random Structures Algorithms (1991).
[5] F. Chung et al., Quasi-random hypergraphs, Random Structures Algorithms (1990).
[6] F. Chung et al., Sparse quasi-random graphs, Combinatorica (2002).
[7] F. Chung et al., Quasi-random graphs, Combinatorica (1989).
[8] J. Fox et al., A tight lower bound for Szemerédi's regularity lemma, Combinatorica (2017).
[9] A. Frieze et al., A simple algorithm for constructing Szemerédi's regularity partition, Electron. J. Combin. (1999).
[10] W.T. Gowers, Lower bounds of tower type for Szemerédi's uniformity lemma, Geom. Funct. Anal. GAFA (1997).
There are more references available in the full text version of this article.
