1 Introduction

The broadcast congested clique model is a message-passing model of distributed computation where n nodes communicate with each other in synchronous rounds over a complete network [1, 2, 4, 6, 7, 8, 14, 16, 18, 21]. The joint input to the n nodes is an undirected graph G on the same set of nodes, with node u receiving the list of its neighbors in G. Nodes have pairwise distinct identities, which are numbers upper bounded by some polynomial in n. The identity of node u is denoted by \(\text {id}(u)\). All nodes know n, the size of the network.

Each node broadcasts, in each round, a single b-bit message along each of its \(n-1\) communication links. The size of the messages is known as the bandwidth of the system, and it is a parameter of the model (which could grow with n). Broadcasting is equivalent to writing the messages on a whiteboard, visible to every node. In each round every node produces its message using its input, the contents of the whiteboard, and a sequence of public random bits.

Typically, the goal of an algorithm is to decide whether the input graph G belongs to some graph class \({\mathcal C}\). An algorithm is correct if it terminates with every node knowing the correct answer (that is, whether \(G \in {\mathcal C}\)) with high probability. The round complexity of an algorithm is the maximum number of rounds over all possible input graphs (of size n).

Few fast algorithms are known in the broadcast congested clique model. If the bandwidth is \(b= {\mathcal O}(\log n)\), then there exist one-round algorithms for deciding whether the input graph G has bounded degeneracy [6], contains a fixed forest [14], or is a cograph [21]. Also, if \(b = {\mathcal O}(\text {polylog}n)\), then there is a one-round algorithm for deciding whether G is connected [1, 2].

One way to increase the computing power of the model is to lift the broadcast restriction and to allow the nodes the possibility of sending different messages through different links. This general model, known as unicast congested clique [14], gives the possibility to perform a load balancing procedure efficiently. Such enormous intrinsic power has allowed some authors to provide fast algorithms for solving natural problems: an \(\mathcal {O}(\log \log \log n)\)-round algorithm for finding a 3-ruling set [17], \(\mathcal {O}(n^{0.158})\)-round algorithms for counting triangles, for counting 4-cycles and for computing the girth [12], an \(\mathcal {O}(1)\)-round algorithm for detecting a 4-cycle [12], an \(\mathcal {O}(1)\)-round algorithm constructing a minimum spanning tree [20].

Another very natural, much more limited and less dramatic way to increase the computing power of the broadcast congested clique model, is to expand the local knowledge the nodes initially have about G. The idea of a constant-radius neighborhood independent of the size of the network is present in the research on local algorithms pioneered by Angluin [3], Linial [23] and Naor and Stockmeyer [25].

We therefore use the \(KT_r\) notion, introduced by Awerbuch et al. [5], which means Knowledge of Topology up to distance r, excluding edges with both endpoints at distance r. More precisely, we call \({\textsc {BClique}}[r]\) the extension of the broadcast congested clique model where each node u “sees” (receives as input) the set of all edges lying on a path of length at most r, starting in u. Hence, \({\textsc {BClique}}[1]\) corresponds to the classical broadcast congested clique model, and is simply denoted \({\textsc {BClique}}\).
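The input of a node in \({\textsc {BClique}}[r]\) can be illustrated with a short Python sketch. The function name `view_edges` and the adjacency-dictionary format are assumptions made for illustration only. The sketch relies on the observation that an edge lies on a path of length at most r starting in u exactly when one of its endpoints is at distance at most \(r-1\) from u.

```python
from collections import deque

def view_edges(adj, u, r):
    """Edges lying on a path of length at most r starting at u.

    adj: dict mapping each node to the set of its neighbors (assumed format).
    An edge is seen iff one endpoint is at distance at most r-1 from u, so
    edges joining two nodes at distance exactly r are excluded.
    """
    dist = {u: 0}
    queue = deque([u])
    while queue:
        v = queue.popleft()
        if dist[v] == r:            # nodes at distance r are not expanded
            continue
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    seen = set()
    for v, d in dist.items():
        if d <= r - 1:
            for w in adj[v]:
                seen.add(frozenset((v, w)))
    return seen

# 5-cycle 0-1-2-3-4-0: with r = 2, node 0 misses only the edge {2, 3}.
cycle5 = {0: {1, 4}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 0}}
view = view_edges(cycle5, 0, 2)
```

With \(r=1\) the function returns only the edges incident to u, which matches the classical \({\textsc {BClique}}\) input.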

One of the most studied problems in the \({\textsc {BClique}}\) model is related to the existence of cycles in the input graph G. The first natural question one can formulate, that is, deciding whether G contains a cycle, has been, until now, the only question amenable to a simple algorithm. In fact, Becker et al. [6] show that a simple set of logarithmic size messages is sufficient to recognize, deterministically and in one round, whether the input graph G is acyclic.

Any other natural question concerning cycles has given strong negative results. Drucker et al. [14] showed that, if \({\ell } \ge 4\), then any algorithm that decides whether the \(\ell \)-node cycle \(C_{\ell }\) is a subgraph (or an induced subgraph) of the input graph G needs \(\varOmega (ex(n,C_{\ell })/n b)\) rounds, where \(ex(n,H)\) is the Turán number of H, i.e., the maximal number of edges of an n-node graph which does not contain a subgraph isomorphic to H. Remark that \(ex(n,C_{\ell })\) is \(\varTheta (n^2)\) for odd values \(\ell \), and \(\varTheta (n^{1+1/\ell })\) for even values (assuming the Erdős Girth Conjecture [15]).

Moreover, even in the very powerful unicast congested clique model, the algorithms for cycle detection are rather slow. In fact, the best algorithm for detecting \(C_\ell \) uses \(\mathcal O(n^{\rho } \log n)\) rounds, for every \(\ell \ge 3\), where \(\rho < 0.15715\) [12]. The only exception is the detection of squares \(C_4\), for which an extremely elegant \(\mathcal O(1)\)-round algorithm has been devised [12].

In this paper, we mainly study two problems: \(\textsc {Cycle}_{\le k}\) and \(\textsc {Cycle}_{> k}\). The first one consists in deciding whether the graph contains an induced cycle of length at most k (i.e., deciding whether the girth of the graph is at most k). The second problem, complementary to the first one, consists in detecting the existence of an induced cycle of length at least \(k+1\). The difficulty of finding fast algorithms for problems related to the existence of cycles is what makes the positive results of this paper surprising.

Note that the existence of an induced cycle of length at most k is equivalent to the existence of a cycle (not necessarily induced) of length at most k. On the other hand, as we are going to explain later, finding an algorithm for detecting induced cycles of length at least \(k+1\) requires much more involved arguments than finding algorithms for detecting cycles (not necessarily induced) of length at least \(k+1\).

Our Results

In Sect. 3 we show that there is a deterministic, one-round \({\textsc {BClique}}\) algorithm for solving problem \(\textsc {Cycle}_{\le k}\) with bandwidth \(\mathcal {O}(n^{2/k} \log n)\) if k is even, and bandwidth \(\mathcal {O}(n^{2/(k-1)}\log n)\) if k is odd. The main ingredient for proving this is a deterministic, one-round algorithm given in [24] that reconstructs a graph of degeneracy at most d in the \({{\textsc {BClique}}}\) model using bandwidth \(\mathcal {O}(d \log n)\) (reconstruction means that every node knows all the edges of the input graph). Recall that the degeneracy of G is the minimum d such that, by iteratively removing vertices of degree at most d, we obtain the empty graph.

We also show that the previous upper bounds match the lower bounds up to logarithmic factors, even in the \({\textsc {BClique}}[\lfloor k/4 \rfloor ]\) model allowing randomization and multiple rounds. More precisely, if we allow the nodes to see up to distance \(\lfloor k/4 \rfloor \), and to use public coins and multiple rounds, then the number of rounds R and the bandwidth b needed to solve \(\textsc {Cycle}_{\le k}\) satisfy \(R\cdot b = \varOmega ( n^{2/k})\) if k is even, and \(R\cdot b = \varOmega ( n^{2/(k-1)})\) if k is odd (in both cases \(k \ge 4\)), for every \(\epsilon \)-error algorithm. (For these lower bounds we assume the Erdős Girth Conjecture.)

We start Sect. 4 by giving a useful, “local” characterization of graphs which do not have long induced cycles. Using this, together with a technique inspired by the linear sketches of [1, 19], we show that, if each node is allowed to see at distance \({\lfloor k/2 \rfloor + 1}\), then a polylogarithmic number of bits is sufficient for detecting in two rounds an induced cycle of length strictly larger than k. More precisely, we prove that for every \(k\ge 3\), there exists a two-round algorithm in the \({\textsc {BClique}}[\lfloor k/2 \rfloor + 1]\) model that solves \(\textsc {Cycle}_{>k}\) with high probability using bandwidth \(\mathcal {O}(\log ^4 n)\). The approach is based on the randomized algorithm of Ahn et al. [1] for computing a spanning forest in the \({\textsc {BClique}}\) model with bandwidth \(\mathcal {O}(\log ^3 n)\). With respect to lower bounds, we prove that any one-round, public-coin \({{\textsc {BClique}}}[\lfloor k/3 \rfloor ]\) algorithm that solves \(\textsc {Cycle}_{> k}\) needs the bandwidth to be at least \(\varOmega (n/ \log n)\). Note that the case \(k=3\) corresponds to deciding whether the input graph G is chordal, i.e., whether the only induced cycles in G are triangles.

The results of this article are summarized in Tables 1 and 2.

Table 1. Results concerning problem \(\textsc {Cycle}_{\le k}\). The lower bounds assume the Erdős Girth Conjecture.
Table 2. Results concerning problem \(\textsc {Cycle}_{> k}\)

2 Basic Definitions and Notations

Let \(G = (V,E)\) be an undirected graph, and let \(u \in V\). We call \(N_G(u) = \{v \in V | uv \in E \}\) and \(N_G[u]= N_G(u) \cup \{u\}\), the open and closed neighborhoods of u, respectively. Similarly, for \(U \subseteq V\), \(N_G(U) =\cup _{u\in U} N_G(u) - U\) and \(N_G[U]= N_G(U) \cup U\) are the open and closed neighborhoods of U, respectively. When no ambiguity is possible, we will omit the subindices. By extension, we denote by \(N^r[u]\) the set of vertices at distance at most r from u, and we call it the closed r-neighborhood of u. Analogously, \(N^r(u) = N^r[u] \setminus \{u\}\) is the open r-neighborhood of u.

Graph \(H = (V',E')\) is a subgraph of \(G=(V,E)\) if \(V' \subseteq V\) and \(E' \subseteq E\). If, for any edge \(uv \in E\) with \(u,v \in V'\) we also have \(uv \in E'\), we say that H is an induced subgraph of G, or that H is the subgraph of G induced by \(V'\). Given a vertex subset S, the subgraph induced by S is denoted G[S]. We simply write \(G-S\) for \(G[V \setminus S]\). Also, if F is a subset of edges, we denote by \(G - F\) the graph obtained from G by removing the edges of F. The degeneracy of a graph G is the minimum d such that, by iteratively removing vertices of degree at most d, we obtain the empty graph.
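The definition of degeneracy translates directly into a peeling procedure. The following Python sketch is only an illustration: the name `degeneracy` and the adjacency-dictionary input are assumptions, and the quadratic-time loop favors readability over efficiency. It repeatedly removes a vertex of minimum degree and records the largest degree seen at removal time.

```python
def degeneracy(adj):
    """Minimum d such that repeatedly removing a vertex of degree <= d empties the graph.

    adj: dict mapping each node to the set of its neighbors (assumed format).
    """
    remaining = {v: set(nb) for v, nb in adj.items()}   # work on a copy
    d = 0
    while remaining:
        # A vertex of minimum degree in what is left of the graph.
        v = min(remaining, key=lambda x: len(remaining[x]))
        d = max(d, len(remaining[v]))
        for w in remaining[v]:
            remaining[w].discard(v)
        del remaining[v]
    return d

path = {0: {1}, 1: {0, 2}, 2: {1}}
cycle4 = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {2, 0}}
clique4 = {v: {w for w in range(4) if w != v} for v in range(4)}
```

On these examples the procedure gives degeneracy 1 for the path, 2 for the 4-cycle and 3 for the complete graph on 4 vertices.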

If S is a vertex subset of \(G = (V,E)\), the contraction of S consists in replacing the whole subset S by a unique vertex \(v_S\), such that the neighborhood of \(v_S\) in the new graph is \(N_G(S)\) while \(G - S\) remains unchanged. A connected component of G is an inclusion-maximal set of vertices inducing a connected graph. An induced path (resp. cycle) of graph G is called a chordless path (cycle). A graph is called k-chordal if it does not contain any induced cycle of length greater than k. The 3-chordal graphs are known as chordal graphs.

The \({\textsc {BClique}}[r]\) model is formally defined as follows. There are n nodes which are given distinct identities (IDs), that we assume for simplicity to be numbers between 1 and n. In this paper we consider the situation where the joint input to the nodes is a graph G. More precisely, each node u receives as input the subgraph of radius r around itself (i.e., all edges lying on a path of length at most r, starting in u). Nodes execute an algorithm, broadcasting b-bit messages in synchronous rounds. Their goal is to compute some function f that depends on G. When an algorithm stops every node must know f(G). Function f defines the problem to be solved. A \(0-1\) function corresponds to a decision problem.

An algorithm may be deterministic or randomized. We distinguish two sub-cases of randomized algorithms: the private-coin setting, where each node flips its own coin; and the public-coin setting, where the coin is shared between all nodes. (In this work we are going to consider public-coin algorithms only). An \(\varepsilon \)-error algorithm \({\mathcal A}\) that computes a function f is a randomized algorithm such that, for every input graph G, \(\Pr \{ \mathcal {A} \text { outputs } f(G) \} \ge 1- \varepsilon \). In the case where \(\varepsilon \rightarrow 0\) as \(n \rightarrow \infty \), we say that \({\mathcal A}\) computes f with high probability (whp).

We consider several decision problems in this paper: \(\textsc {Cycle}_{= k}\), \(\textsc {Cycle}_{\le k}\) and \(\textsc {Cycle}_{> k}\). These problems consist in deciding, respectively, whether the input graph has an induced cycle of length exactly k, at most k, and strictly larger than k. Problems \(\textsc {Sub-Cycle}_{= k}\), \(\textsc {Sub-Cycle}_{\le k}\) and \(\textsc {Sub-Cycle}_{> k}\) are defined in a similar way, but in this case we ask whether the input graph has a cycle as a subgraph (induced or not) of length k, at most k, and strictly larger than k.

3 Detection of Short Cycles

Let us denote by \(ex(n,k)\) the maximum number of edges in an n-vertex graph not containing a cycle of length at most k. A very helpful result in the study of graphs without short cycles is the one that relates the nonexistence of short cycles in G with the degeneracy of G. More precisely, graphs with no cycles of length at most k (as subgraphs) have a relatively small degeneracy.

Proposition 1

([14]). Graphs with no cycles of length at most k are of degeneracy \(\mathcal {O}(ex(n,k)/n)\).

In [24] it is shown that graphs of degeneracy at most d can be recognized, and even reconstructed, by a one-round algorithm in the \({\textsc {BClique}}\) model using bandwidth \(\mathcal {O}(d \cdot \log n)\). Recall that reconstruction means that at the end of the algorithm, every node knows all the edges of the input graph.

Theorem 1

([24]). There is a one-round, deterministic algorithm in the model \({\textsc {BClique}}\), that reconstructs the input graph G if the graph is d-degenerate, and rejects otherwise, using bandwidth \(\mathcal {O}(d \cdot \log n)\).

By Proposition 1, the degeneracy of the NO-instances of \(\textsc {Cycle}_{\le k}\) is upper bounded by \(\mathcal {O}(ex(n,k)/n)\). Therefore, from Theorem 1, we conclude the existence of a one-round algorithm for \(\textsc {Cycle}_{\le k}\) in which each node either (1) fully reconstructs the graph and decides the existence of a cycle of length at most k, or (2) notices that the degeneracy of the input graph is larger than the bound required by the NO-instances, and concludes that the input graph must be a YES-instance. Therefore, we have the following corollary.

Corollary 1

Problem \(\textsc {Cycle}_{\le k}\) can be solved with a one-round, deterministic algorithm in the \({\textsc {BClique}}\) model using bandwidth \(\mathcal {O}((ex(n,k)/n)\log n)\).
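Once a node has reconstructed the graph (case (1) above), the remaining decision is purely local. A minimal sketch of that local step is given below, assuming the reconstructed graph is available as an adjacency dictionary; the names `girth` and `has_short_cycle` are illustrative, and the reconstruction algorithm of [24] itself is not reproduced here. The girth is computed with the classical technique of running a breadth-first search from every vertex.

```python
from collections import deque

def girth(adj):
    """Length of a shortest cycle of a simple graph, or infinity if it is acyclic."""
    best = float("inf")
    for s in adj:
        dist = {s: 0}
        parent = {s: None}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    parent[w] = v
                    queue.append(w)
                elif parent[v] != w:
                    # Non-tree edge: it closes a walk through s that contains
                    # a cycle of length at most dist[v] + dist[w] + 1.
                    best = min(best, dist[v] + dist[w] + 1)
    return best

def has_short_cycle(adj, k):
    """YES-instance test for Cycle_{<=k} on a fully known graph."""
    return girth(adj) <= k

cycle5 = {0: {1, 4}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 0}}
tree = {0: {1, 2}, 1: {0}, 2: {0}}
```

For the 5-cycle, `has_short_cycle(cycle5, 4)` is false while `has_short_cycle(cycle5, 5)` is true; a tree is a NO-instance for every k.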

The previous algorithm is rather restrictive: it is deterministic, it works in one round, and the information each node has about the graph is minimal, consisting of its 1-neighborhood. The question we ask here is the following: is it possible, by lifting these restrictions, to decrease the total number of bits broadcast by each node? The next results give a negative answer to this question. In other words, the one-round deterministic algorithm based on the degeneracy seems to be the best we can do.

Recall that \({{\textsc {BClique}}}[r]\) is the extension of the broadcast congested clique model where each node u receives as input the set of all edges lying on a path of length at most r, starting in u. Our first result tackles the case where \(r \le \lfloor k/4 \rfloor \).

Theorem 2

Let \(\epsilon \le 1/3\) and \(0 < r \le k/4 \). Then, any \(\epsilon \)-error, R-round, b-bandwidth algorithm in the \({{\textsc {BClique}}}[r]\) model solving \(\textsc {Cycle}_{\le k}\) satisfies \(R \cdot b = \varOmega ( ex(n,k)/n)\).

In the case where the nodes have more knowledge of the graph, i.e., when \(k/4 < r \le k/3\), we obtain a tight bound for one-round algorithms.

Theorem 3

Let \(\epsilon \le 1/3\) and \(k/4 < r \le k/3\). Then, any \(\epsilon \)-error, one-round algorithm in the \({{\textsc {BClique}}}[r]\) model that solves \(\textsc {Cycle}_{\le k}\) requires bandwidth \(b = \varOmega ( ex(n,k)/(n\log n))\).

Remark 1

Bondy and Simonovits [10] showed that \(ex(n,k) = \mathcal {O}(n^{1+2/k})\) if k is even, and \(ex(n,k) = \mathcal {O}(n^{1+ 2/(k-1)})\) if k is odd. On the other hand, the Erdős Girth Conjecture states that this bound is tight, implying the results of Table 1. Note that currently, the best constructions provide a lower bound for \(ex(n,k) = \varOmega (n^{1+4/(3k-7)})\) if k is even, and \(ex(n,k) = \varOmega (n^{1+ 4/(3k-9)})\) if k is odd [22].
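As a worked instance, plugging the Bondy and Simonovits bound into Corollary 1 recovers the bandwidths announced in the introduction:

$$\mathcal {O}\left( \frac{ex(n,k)}{n} \log n\right) = \left\{ \begin{array}{ll} \mathcal {O}\left( \frac{n^{1+2/k}}{n} \log n\right) = \mathcal {O}(n^{2/k} \log n) & \text {if } k \text { is even,}\\ \mathcal {O}\left( \frac{n^{1+2/(k-1)}}{n} \log n\right) = \mathcal {O}(n^{2/(k-1)} \log n) & \text {if } k \text { is odd.} \end{array} \right. $$

For example, \(k=4\) gives bandwidth \(\mathcal {O}(\sqrt{n} \log n)\), and \(k=5\) gives the same bound since \(2/(k-1) = 1/2\).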

4 Detection of Long Cycles

Recall that graphs without induced cycles of length greater than k are called k-chordal [11]. 3-chordal graphs, i.e., graphs in which every cycle (not necessarily induced) of 4 or more vertices has a chord, are called chordal graphs. It is known that a graph G is chordal if and only if, for each vertex \(u \in V\), and each connected component C in \(G - N[u]\), the neighborhood N(C) of this component induces a clique in G. This “local” characterization has been exploited by Chandrasekharan and Sitharama Iyengar [13] for devising a fast parallel algorithm recognizing chordal graphs. We begin this section by extending the previous characterization to arbitrary chordalities \(k>3\) in order to take advantage of it in our distributed framework.
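The local characterization of chordal graphs can be checked directly on small examples. The Python sketch below is a naive sequential illustration, not the parallel algorithm of [13]; the function names and the adjacency-dictionary format are assumptions.

```python
def components(adj, removed):
    """Connected components of the graph after deleting the vertex set `removed`."""
    seen, comps = set(removed), []
    for s in adj:
        if s in seen:
            continue
        comp, stack = {s}, [s]
        seen.add(s)
        while stack:
            v = stack.pop()
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    comp.add(w)
                    stack.append(w)
        comps.append(comp)
    return comps

def is_chordal(adj):
    """For every u and every component C of G - N[u], N(C) must induce a clique."""
    for u in adj:
        closed = adj[u] | {u}
        for comp in components(adj, closed):
            boundary = {w for v in comp for w in adj[v]} - comp
            if any(b not in adj[a] for a in boundary for b in boundary if a != b):
                return False
    return True

cycle4 = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {2, 0}}
cycle4_chord = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1, 3}, 3: {0, 2}}
```

In the 4-cycle, removing \(N[0]\) leaves the single vertex 2, whose neighborhood \(\{1,3\}\) is not a clique, so the test fails; adding the chord \(\{0,2\}\) makes every such neighborhood a clique.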

Let G be a graph, \(u \in V(G)\) and \(k>0\). Let \(D_1, \dots , D_p\) be the p connected components of \(G - N^{\lfloor k/2 \rfloor }[u]\) (obtained by removing the vertices at distance at most \(\lfloor k/2 \rfloor \) from u). Let \(H^k_u\) denote the graph obtained from G by contracting each component \(D_i\) into a single node \(d_i\).
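The construction of \(H^k_u\) can be sketched as follows. This is an illustrative sequential version that assumes full knowledge of G; the name `contracted_graph` and the labels `("d", i)` for the contracted nodes \(d_i\) are hypothetical and are assumed not to clash with the original node names.

```python
from collections import deque

def contracted_graph(adj, u, k):
    """Build H^k_u: contract each component of G - N^{floor(k/2)}[u] into one node."""
    radius = k // 2
    # Breadth-first search computing the closed ball N^{radius}[u].
    dist = {u: 0}
    queue = deque([u])
    while queue:
        v = queue.popleft()
        if dist[v] == radius:
            continue
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    # Vertices of the ball keep their name; a vertex of component D_i gets ("d", i).
    label = {v: v for v in dist}
    count = 0
    for s in adj:
        if s in label:
            continue
        label[s] = ("d", count)
        stack = [s]
        while stack:
            v = stack.pop()
            for w in adj[v]:
                if w not in label:
                    label[w] = ("d", count)
                    stack.append(w)
        count += 1
    # Keep every edge whose endpoints received different labels.
    h = {lab: set() for lab in label.values()}
    for v in adj:
        for w in adj[v]:
            if label[v] != label[w]:
                h[label[v]].add(label[w])
    return h

cycle6 = {i: {(i - 1) % 6, (i + 1) % 6} for i in range(6)}
h = contracted_graph(cycle6, 0, 3)
```

For the 6-cycle with \(u=0\) and \(k=3\), the ball is \(\{0,1,5\}\), the vertices 2, 3, 4 form a single component, and the resulting graph is a 4-cycle through 0, 1, the contracted node and 5.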

Lemma 1

Let G be a graph. G is k-chordal if and only if, for every \(u \in V(G)\), \(H^k_u\) is k-chordal.

Lemma 1 provides us with a strategy for deciding k-chordality, i.e., for deciding whether the input graph G is a NO instance of problem \(\textsc {Cycle}_{>k}\). For doing this, every node x must compute the graph \(H^k_x\) and then decide whether \(H^k_x\) is k-chordal. In order to compute \(H^k_x\), each node x needs first to find the connected components of \(G-N^{\lfloor k/2\rfloor }[x]\). Let \(F_x\) be the set of all edges lying on a path of length at most \(\lfloor k/2\rfloor +1\) starting in x. We then need each node to compute the connected components of \(G-F_x\) outside \(N^{\lfloor k/2\rfloor }[x]\).

4.1 Computing the Connected Components of \(G-F_x\)

Ahn et al. provide a probabilistic, one-round algorithm for computing a spanning forest of the input graph G, in the \({\textsc {BClique}}\) model using bandwidth \(\mathcal {O}(\log ^3 n)\) [1]. In their algorithm, each node constructs a message based on its neighborhood and on a sequence of public random coins, and broadcasts it to all other nodes. Using all these messages, every node is able to construct a spanning forest of the graph with probability \(1-\epsilon \), for a fixed \(\epsilon >0\).

We want each node x to compute the connected components of \(G-F_x\). Recall that \(F_x\) is the set of all edges lying on a path of length at most \(\lfloor k/2\rfloor +1\) starting in x. We place ourselves in the \({\textsc {BClique}}[\lfloor k/2\rfloor +1]\) model with bandwidth \(\mathcal {O}(\log ^4 n)\). We amplify the bandwidth by a \(\log (n)\) factor, with respect to the spanning forest algorithm of [1], to ensure that it succeeds with high probability. Also, every node x needs to know the whole set of edges \(F_x\), which is why we choose the \({\textsc {BClique}}[\lfloor k/2\rfloor +1]\) model. Using the spanning forest algorithm of [1], we prove that each node x can construct a spanning forest of \(G - F_x\) with high probability.

The key observation is that the message produced by each vertex is a linear function (w.r.t. the edges of the graph). Therefore, from the messages of G, each vertex x computes the messages that the algorithm would have constructed on \(G - F_x\).

Definition 1

Let \(n, k, \delta >0\). A \(\delta \)-linear sketch of size k is a function \(S : \{0,1\}^{\mathcal {O}(\log n)} \times \{-1,0,1\}^n \rightarrow \{0,1\}^k\), such that, if we call \(S_r = S(r, \cdot )\), then

  • \(S_r\) is linear, for each \(r \in \{0,1\}^{\mathcal {O}(\log n)}\);

  • If r is chosen uniformly at random, then there is an algorithm that on input \(S_r(x)\) returns ERROR with probability at most \(\delta \), and otherwise returns a pair \((i, x_i)\) such that \(x_i \ne 0\) and coordinate i is picked uniformly at random between the non-zero coordinates of x. The probabilities are taken over the random choices of r.

Proposition 2

([19]). For each \(n, \delta > 0\), there exists a \(\delta \)-linear sketch of size \(\mathcal {O}(\log ^2 n \log \delta ^{-1})\).
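As a worked instance of this bound, take \(\delta = 1/n^2\) and \(\lceil \log n \rceil \) independent sketches per node, which are the parameters used later in this section. Each sketch has size \(\mathcal {O}(\log ^2 n \cdot \log n^2) = \mathcal {O}(\log ^3 n)\), so the total number of bits broadcast by a node is

$$\lceil \log n \rceil \cdot \mathcal {O}(\log ^2 n \cdot \log n^{2}) = \mathcal {O}(\log n) \cdot \mathcal {O}(\log ^3 n) = \mathcal {O}(\log ^4 n).$$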

Let \(G = (V,E)\) be a graph of size n, and \(x\in V\). We call \(a^x\) the connectivity vector of x in G, defined as the vector of dimension \(V \atopwithdelims ()2\) such that:

$$a^x_{\{u,v\} } = \left\{ \begin{array}{cl} 1 &{} \text {if } \{u,v\} \in E, x=u \text { and }u<v,\\ -1 &{} \text {if } \{u,v\} \in E, x=v \text { and } u<v, \\ 0 &{} \text {otherwise.} \end{array} \right. $$

For \(r \in \{0,1\}^{\mathcal {O}(\log n)}\), we say that \(S_r(G) = \{ S_r(a^x) \}_{x \in V(G)}\) is a \(\delta \)-connectivity sketch of G, where S is a \(\delta \)-linear sketch. Note that for any \(x\in V\), each non-zero coordinate of \(a^x\) represents an edge incident to x, and for any \(U \subseteq V\) the non-zero coordinates of \(\sum _{x \in U} a^x\) are exactly the edges in the cut between U and its complement \(V \setminus U\).
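The cancellation behind this remark can be observed on a small example. The sketch below is only an illustration; it assumes that nodes are comparable integers and that edges are stored as pairs \((u,v)\) with \(u<v\), and the function names are hypothetical. An edge with both endpoints in U contributes \(+1\) and \(-1\) to the sum, while a cut edge contributes only once.

```python
from itertools import combinations

def connectivity_vector(nodes, edges, x):
    """The vector a^x, indexed by all pairs (u, v) with u < v."""
    vec = {}
    for u, v in combinations(sorted(nodes), 2):
        if (u, v) in edges and x == u:
            vec[(u, v)] = 1
        elif (u, v) in edges and x == v:
            vec[(u, v)] = -1
        else:
            vec[(u, v)] = 0
    return vec

def cut_edges(nodes, edges, subset):
    """Non-zero coordinates of the sum of a^x over x in `subset`."""
    total = {pair: 0 for pair in combinations(sorted(nodes), 2)}
    for x in subset:
        for pair, value in connectivity_vector(nodes, edges, x).items():
            total[pair] += value
    return {pair for pair, value in total.items() if value != 0}

nodes = {1, 2, 3, 4}
edges = {(1, 2), (2, 3), (3, 4), (1, 3)}
cut = cut_edges(nodes, edges, {1, 2})
```

Here the edge \(\{1,2\}\) lies inside \(U = \{1,2\}\) and cancels, so `cut` contains exactly the two edges \(\{2,3\}\) and \(\{1,3\}\) leaving U.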

Let \(G = (V,E)\) be the input graph. The one-round algorithm in the \({\textsc {BClique}}\) model devised by Ahn et al. for computing a spanning forest of G works as follows. Let \(t = \lceil \log n \rceil \). Each node computes and sends t independent \(\delta \)-linear sketches of its connectivity vector, using t random strings \(r_1, \dots , r_t\) picked uniformly at random. Using these messages, any node can compute t independent \(\delta \)-connectivity sketches of G and therefore it can compute a spanning forest using the following t-step procedure. First, let us denote by \(\hat{V}\) the set of supernodes, which initially are the n singletons \(\{ \{u\} | u \in V\}\). At step \(0 \le i < t\), each node samples an edge incident to each set \(\hat{v} \in \hat{V}\) using the ith collection of linear sketches \(\sum _{x \in \hat{v}}S_{r_i}(a^x)\), and merges the obtained connected components into a single supernode. The procedure finishes within \(t=\lceil \log n \rceil \) steps since the number of supernodes at least halves at each step. This idea is behind the proof of the following proposition.
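The merging procedure can be simulated in a few lines. In the sketch below, which is an assumption-laden illustration rather than the algorithm of [1], the decoding of a linear sketch is replaced by direct uniform sampling of an edge leaving each supernode, and a union-find structure (not mentioned in the text) tracks the supernodes. The function name and the return format are hypothetical.

```python
import random

def spanning_forest(nodes, edges, rng=None):
    """Boruvka-style merging of supernodes.

    Direct uniform sampling of a cut edge stands in for sketch decoding.
    Returns the forest and the number of merging steps performed.
    """
    rng = rng or random.Random()
    parent = {v: v for v in nodes}

    def find(v):
        # Representative of the supernode containing v (with path halving).
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    forest, steps = set(), 0
    while True:
        # Group the edges leaving each current supernode.
        outgoing = {}
        for u, v in sorted(edges):
            ru, rv = find(u), find(v)
            if ru != rv:
                outgoing.setdefault(ru, []).append((u, v))
                outgoing.setdefault(rv, []).append((u, v))
        if not outgoing:
            return forest, steps
        for cut in outgoing.values():
            u, v = rng.choice(cut)          # one sampled edge per supernode
            ru, rv = find(u), find(v)
            if ru != rv:                    # skip edges that would close a cycle
                parent[ru] = rv
                forest.add((u, v))
        steps += 1

nodes = range(6)
edges = {(0, 1), (1, 2), (0, 2), (3, 4)}
forest, steps = spanning_forest(nodes, edges, random.Random(0))
```

The example graph has three connected components, so the returned forest has \(6 - 3 = 3\) edges. Every supernode with an outgoing edge is merged with at least one other supernode at each step, which is why at most \(\lceil \log n \rceil \) steps are needed.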

Proposition 3

(Ahn et al. [1]). Let \(n, \delta >0\) and \(t = \lceil \log n \rceil \). There exists an algorithm that receives t independent \(\delta \)-connectivity sketches of a graph G, produced with \(r_1, \dots , r_t \in \{0,1\}^{\mathcal {O}(\log n)}\) random strings picked uniformly at random, and outputs a spanning forest of G with probability \(1-\delta \).

Lemma 2

There exists a one-round algorithm in the \({\textsc {BClique}}[{\lfloor k/2\rfloor }+1]\) model which computes, for every node \(x \in V\), the connected components of \(G-N^{\lfloor k/2\rfloor }[x]\), using bandwidth \(\mathcal {O}(\log ^4 n)\) and with high probability.

Proof

The algorithm works as follows. First, each node x sends \(t = \lceil \log n \rceil \) different \(1/n^2\)-linear sketches of its connectivity vector \(a^x\), using t random strings \(r_1, \dots , r_t\). Note that each node x knows \(F_x\). Observe that the components of \(G-N^{\lfloor k/2 \rfloor }[x]\) are exactly the components of \(G - F_x\) without considering the nodes in \(N^{\lfloor k/2 \rfloor }[x]\). In the following, we show that after the communication round, each node x can compute a spanning forest of \(G-F_x\) with probability at least \(1-1/n^2\). Therefore, by a union bound over the n nodes, the whole algorithm succeeds with probability at least \(1- 1/n\).

Let \(S_{r}(G) = (S_r(a^{x_1}), \dots , S_r(a^{x_n}))\) be one of the \(1/n^2\)-connectivity sketches of G, produced with the random string r, received in the communication round. Consider, for each \(e\in F_x\) and \(u \in e\), the vector \(b^{u,e}\) of dimension \(n \atopwithdelims ()2\) where,

$$b_{e'}^{u,e} =\left\{ \begin{array}{cl} -a^u_e &{} \text { if } e' = e, \\ 0 &{} \text {otherwise} \end{array}\right. , \text { for each } e' \in {n \atopwithdelims ()2}.$$

Let \(c^u\) be the connectivity vector of node u in \(G-F_x\). Note that, for each \(e \in {n \atopwithdelims ()2}\),

$$c^u_{e} = a_e^u + \sum _{\{e' \in F_x : u\in e'\} }b^{u,e'}_e =\left\{ \begin{array}{cl} a^u_{e} &{} \text {if } e \in E(G)\setminus F_x,\\ 0 &{}\text {otherwise.} \end{array} \right. $$

If we define \(S_r^u = S_r(a^u) + \sum _{\{e \in F_x : u\in e\} } S_r(b^{u,e})\), we obtain, by linearity of \(S_r\), that \(S_r^u = S_r(c^u)\) and then \(\{S_r(c^u)\}_{u \in V}\) is a \(1/n^2\)-connectivity sketch of \(G-F_x\) produced with r.

Then, after the communication round, any node x can obtain t different \(1/n^2\)-connectivity sketches of \(G-F_x\) produced with random strings \(r_1, \dots , r_t\) picked uniformly at random. Therefore, by Proposition 3, it can produce a spanning forest of that graph with probability at least \(1-1/n^2\).   \(\square \)
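The linearity argument of this proof can be checked numerically. The sketch below uses a random integer matrix as a hypothetical stand-in for \(S_r\); it is linear but has none of the sampling guarantees of a \(\delta \)-linear sketch, and all names are illustrative. In the protocol the value \(S_r(a^u)\) would be the broadcast message, whereas here it is recomputed locally for simplicity.

```python
import random
from itertools import combinations

def toy_sketch(seed, size, pairs):
    """A linear map on vectors indexed by `pairs` (stand-in for S_r, assumption)."""
    rng = random.Random(seed)
    matrix = {p: [rng.randint(-5, 5) for _ in range(size)] for p in pairs}

    def apply(vec):
        # vec is sparse: it maps a pair to its non-zero coordinate.
        return tuple(sum(matrix[p][j] * val for p, val in vec.items())
                     for j in range(size))
    return apply

def connectivity_vector(edges, x):
    """Sparse a^x: +1 if x is the smaller endpoint of the edge, -1 if the larger."""
    return {(u, v): (1 if x == u else -1) for (u, v) in edges if x in (u, v)}

def corrected_sketch(S, edges, removed, u):
    """S(a^u) plus S(b^{u,e}) for every removed edge e that contains u."""
    a_u = connectivity_vector(edges, u)
    result = S(a_u)
    for e in removed:
        if u in e:
            b = {e: -a_u[e]}                # the vector b^{u,e} of the proof
            result = tuple(s + t for s, t in zip(result, S(b)))
    return result

nodes = range(4)
pairs = list(combinations(nodes, 2))
edges = {(0, 1), (1, 2), (2, 3), (0, 2)}
removed = {(1, 2), (0, 2)}                  # plays the role of F_x, a subset of the edges
S = toy_sketch(seed=7, size=5, pairs=pairs)
ok = all(
    corrected_sketch(S, edges, removed, u) == S(connectivity_vector(edges - removed, u))
    for u in nodes
)
```

The flag `ok` is true because the arithmetic is exact: the corrected value coincides with the sketch of the connectivity vector \(c^u\) computed directly in the graph without the removed edges.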

4.2 Deciding k-Chordality

We are now able to express the distributed algorithm recognizing k-chordal graphs, see Algorithm 1.

Theorem 4

Let \(k\ge 3\). There exists a two-round randomized algorithm in the \({\textsc {BClique}}[\lfloor k/2 \rfloor + 1]\) model, that recognizes k-chordal graphs, and thus solves problem \(\textsc {Cycle}_{> k}\), with bandwidth \(\mathcal {O}(\log ^4 n)\) and high probability.

Proof

In the first round, each node \(x \in G\) computes the connected components of \(G-N^{\lfloor k/2\rfloor }[x]\) using the algorithm of Lemma 2. After the first round, each node x uses its knowledge of G to locally reconstruct \(H^k_x\) by identifying the connected components \(D_1, \dots , D_p\) of \(G - N^{\lfloor k/2 \rfloor }[x]\) and contracting each \(D_i\) into a unique vertex \(d_i\). Note that x sees the edges between \(D_i\) and \(N^{\lfloor k/2 \rfloor }[x]\). Finally, x checks whether \(H^k_x\) is k-chordal and communicates the answer in the second round. By Lemma 1, the input graph is k-chordal if and only if each vertex x communicated a YES answer. We emphasize that the second round is needed only because the nodes must all agree on the output.

The algorithm may fail only when some node x fails to compute the components of \(G-N^{\lfloor k/2 \rfloor }[x]\); this event may occur, from Lemma 2, with probability at most \(1/n\).   \(\square \)

[Algorithm 1: distributed recognition of k-chordal graphs]

We end this section by giving a lower bound on the bandwidth b for any one-round algorithm solving \(\textsc {Cycle}_{>k}\) in the \({\textsc {BClique}}[r]\) model, when \(0<r \le k/3\).

Theorem 5

Let \(\epsilon \le 1/3\), and \(0<r \le k/3\). Any \(\epsilon \)-error, one-round algorithm in the \({{\textsc {BClique}}}[r]\) model that solves \(\textsc {Cycle}_{> k}\) requires bandwidth \(\varOmega (n/ \log n)\).

5 Conclusion

Throughout the paper we considered problems \(\textsc {Cycle}_{\le k}\) and \(\textsc {Cycle}_{> k}\). Let us briefly discuss the similar problems \(\textsc {Sub-Cycle}_{\le k}\), \(\textsc {Sub-Cycle}_{> k}\) and \(\textsc {Sub-Cycle}_{=k}\), which consist in deciding whether the input graph has, as a subgraph, a cycle of length at most k, greater than k, and equal to k, respectively.

Observe that \(\textsc {Sub-Cycle}_{\le k}\) is identical to \(\textsc {Cycle}_{\le k}\), so upper and lower bounds coincide. We emphasize that, for \(k \ge 3r\), the lower and upper bounds for these problems are tight up to polylogarithmic factors.

Unlike the case of short cycles, there is a significant difference between detecting long induced cycles and detecting long cycles (induced or not). By a result of Birmelé [9], graphs with no cycles of length greater than k have treewidth (and hence degeneracy) at most k. Therefore, they can be recognized by a one-round deterministic algorithm in the \({\textsc {BClique}}\) model with bandwidth \(\mathcal {O}(k \log n)\), based on Theorem 1.

Further lower bounds can be obtained for both \(\textsc {Cycle}_{=k}\) and \(\textsc {Sub-Cycle}_{=k}\) problems in the \({\textsc {BClique}}[r]\) model, when k is an odd number between 3r and 4r. These bounds are obtained by a reduction from a 3-party Number-On-the-Forehead version of the disjointness problem \(\textsc {DISJ}\), and show that any deterministic R-round b-bandwidth algorithm for this problem, in the \({\textsc {BClique}}[r]\) model, is such that \(R\cdot b= \varOmega (n^{1-o(1)})\). Under some stronger complexity assumptions, this lower bound can be extended to randomized algorithms.

When k is even, problem \(\textsc {Sub-Cycle}_{=k}\) can be solved by a one-round deterministic algorithm in \({\textsc {BClique}}\) with bandwidth \(\mathcal {O}(n^{2/k} \log n)\), thanks to degeneracy arguments.

We leave as open problems the question whether \(\textsc {Cycle}_{>k}\) can be solved by a non-trivial one-round algorithm in the \({\textsc {BClique}}[\lfloor k/2\rfloor +1]\) model, as well as the question of multi-round lower bounds for this problem in the \({\textsc {BClique}}[r]\) model for \(r<k/2\).