1 Introduction

Background. Robust secret sharing is a version of secret sharing that enables the reconstruction of the shared secret s in the presence of incorrect shares: given all n shares but with t of them possibly incorrect, and of course without knowing which ones are incorrect, it should still be possible to recover s. If \(t < n/3\) then this can be achieved by standard error-correction techniques, while for \(t \ge n/2\) the task is impossible. When \(n/3 \le t < n/2\), robust secret sharing is possible, but only if one accepts a small failure probability and an overhead in the share size, i.e., shares of bit size larger than the bit size of s. The goal then is to minimize the overhead in the share size for a given (negligible) failure probability \(2^{-k}\). Following up on earlier work on the topic [2, 4, 5, 6, 7, 10], Bishop et al. proposed a scheme with optimal overhead O(k) in the share size, neglecting polylogarithmic terms (in n and k and the bit size of s) [3]. In particular, their scheme was the first robust secret sharing scheme with an overhead that is independent of n (neglecting \(\mathrm{polylog}(n)\) terms). However, as pointed out by Fehr and Yuan [8], the Bishop et al. scheme does not (appear to) offer security in the presence of a rushing adversary that may choose the incorrect shares depending on the shares of the honest parties. This is in contrast to most of the earlier schemes, which do offer security against such rushing attacks (but are less efficient in terms of share size). Towards recovering security against a rushing adversary, Fehr and Yuan [8] proposed a new robust secret sharing scheme that features security against a rushing adversary and an overhead “almost independent” of n, i.e., \(O(n^\epsilon )\) for an arbitrary \(\epsilon > 0\). Furthermore, a variation of their scheme offers security against a rushing adversary and an overhead that is truly independent of n (neglecting polylogarithmic terms), but this version of the scheme has a superpolynomial running time.

Our Result. In this work, we close the final gap left open in [8]: we propose and analyze a new robust secret sharing scheme that is secure against a rushing adversary, has an overhead independent of n as in [3] (i.e., independent up to the same poly-logarithmic \(O(\log ^4 n+\log ^3 n\log m)\) term as in [3], where m is the bit size of the secret), and has a polynomial running time.

Our new scheme recycles several of the ideas and techniques of [8]. The basic idea, which goes back to [3], is to have each share \(s_i\) be authenticated by a small randomly chosen subset of the other parties. Following [8], our approach differs from [3] in that the keys for the authentication are not themselves authenticated. Indeed, this “circularity” of having the authentication keys authenticated is what makes the solution in [3] insecure against a rushing adversary; on the other hand, by not authenticating the authentication keys, we give the dishonest parties more flexibility in lying, which makes the reconstruction harder.

The reconstruction proceeds by a careful (and rather involved) inspection of the resulting consistency graph, exploiting that every honest party can verify the correctness of the shares of a small but random subset of parties, and that these choices of random “neighborhoods” become known to the adversary only after it has decided which shares \(s_i\) to lie about. As a matter of fact, in our scheme, every honest party can verify the correctness of the shares of several randomly chosen small neighborhoods, giving rise to several global verification graphs. Furthermore, to ensure “freshness” of each such neighborhood conditioned on the adversary’s behavior so far, these neighborhoods are revealed sequentially in subsequent rounds of communication during the reconstruction phase.

As in [8], in our scheme the reconstructor first learns from the consistency graph whether the number p of “passive” parties, i.e., dishonest parties that did not lie about the actual share \(s_i\) (but possibly about other pieces of information), is “large” or “small”. For p small, we can recycle the solution from [8], which happens to also work for the tighter parameter setting we consider here. When p is large though, the solution in [8] is to exploit the given redundancy in the shares \(s_i\) by means of applying list decoding, and then to find the right candidate from the list by again resorting to the consistency graph. However, this list decoding technique only works in a parameter regime that then gives rise to the \(O(n^\epsilon )\) overhead obtained in [8]. To overcome this in our solution, we invoke a new technique for dealing with the case of a large p.

We briefly explain this new part at a very high level. The idea is to design a procedure that works assuming the exact value of p is known. This procedure is then repeated for every possible choice of p, leading to a list of possible candidates; similarly to how the scheme in [8] finds the right candidate from the list produced by the list decoding, we can then find the right one from this list. As for the procedure assuming p is known: exploiting the fact that p is large and known, we can find subsets V and \(V_1\) so that either we can recover the shared secret from the shares of the parties in \(V \cup V_1\) by standard error correction (since there is then more redundancy than errors in this collection of shares), or we can argue that the complement of V is a set to which the small-p case applies, so that we can again resort to the corresponding technique in [8].

One technical novelty in our approach is that we also invoke one layer of random neighborhoods that are publicly known. In this case, the adversary can corrupt parties depending on who can verify whose share, but the topology of the global verification graph is fixed and cannot be modified by dishonest parties that lie about their neighborhoods.

Following [3, 8], we point out that it is good enough to have a robust secret sharing scheme with a constant failure probability and a (quasi-)constant overhead; a scheme with \(2^{-k}\) failure probability and a (quasi-)O(k) overhead can then be obtained by means of parallel repetition. This is what we do here as well: at the core is a scheme where each party is given a quasi-constant number of bits on top of the actual share \(s_i\) (i.e., the size of the authentication keys and the size of the random neighborhoods are chosen to be quasi-constant), and we show that this scheme has a constant failure probability.

Concurrent Work. In concurrent and independent work [9], a result very similar to ours was obtained (using rather different techniques though). They also show an optimal (up to poly-logarithmic terms) robust secret sharing scheme with security against a rushing adversary. Compared to our scheme, their scheme has a slightly better poly-logarithmic dependency on n: \(O(\log ^2 n + \log m\log n)\). On the other hand, in a setting where the reconstruction is towards an external reconstructor R, our scheme works simply by revealing the shares to R (over multiple rounds) with R doing some local computation, whereas their scheme requires interaction among the shareholders and, as far as we can see, the shareholders will then learn the shared secret as well. In the context of robust storage, for instance, the latter is undesirable.

2 Preliminaries

2.1 Graph Notation

We follow the graph notation in [8], which we briefly recall. Let \(G=([n],E)\) be a graph with vertex set \([n] := \{1,\ldots ,n\}\) and edge set E. By convention, \((v,w) \in E\) represents an edge directed from v to w. We let \(G|_S\) be the restriction of G to S for any \(S\subseteq [n]\), i.e., \(G|_S=(S, E|_S)\) with \(E|_S=\{(u,v)\in E: u,v\in S\}\).

For vertex \(v\in [n]\), we set

$$\begin{aligned} N^{\mathsf {out}}(v)=\{w \in [n] : (v,w)\in E\} \quad \text {and}\quad N^{\mathsf {in}}(v)=\{w \in [n] : (w,v) \in E\} . \end{aligned}$$

We use \(E_v\) as a shorthand for \(N^{\mathsf {out}}(v)\), the neighborhood of v. For \(S\subseteq [n]\), we set

$$\begin{aligned} N_S^{\mathsf {out}}(v)=N^{\mathsf {out}}(v)\cap S \quad \text {and}\quad N_S^{\mathsf {in}}(v)=N^{\mathsf {in}}(v)\cap S . \end{aligned}$$

This notation is extended to a labeled graph, i.e., when G comes with a function \(L:E \rightarrow \{\mathtt{good},\mathtt{bad}\}\) that labels each edge. Namely, for \(v\in [n]\) we set

$$\begin{aligned}&N^{\mathsf {out}}(v,\mathtt{good})=\{w\in N^{\mathsf {out}}(v): L(v,w)=\mathtt{good}\},\\&N^{\mathsf {in}}(v,\mathtt{good})=\{w\in N^{\mathsf {in}}(v): L(w,v)=\mathtt{good}\}, \end{aligned}$$

and similarly \(N^{\mathsf {out}}(v,\mathtt{bad})\) and \(N^{\mathsf {in}}(v,\mathtt{bad})\). Also, \(N_S^{\mathsf {out}}(v,\mathtt{good})\), \(N_S^{\mathsf {in}}(v,\mathtt{good})\), \(N_S^{\mathsf {out}}(v,\mathtt{bad})\) and \(N_S^{\mathsf {in}}(v,\mathtt{bad})\) are defined accordingly for \(S \subseteq [n]\). Finally, we set

$$\begin{aligned} n^{\mathsf {out}}(v)=|N^{\mathsf {out}}(v)| \quad \text {and}\quad n^{\mathsf {in}}_S(v,\mathtt{bad})=|N_S^{\mathsf {in}}(v,\mathtt{bad})| \end{aligned}$$

and similarly for all other variations.
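To fix ideas, this notation translates directly into code. The following is a minimal sketch (our own illustration, not part of the scheme), with a graph represented as a dict mapping each vertex v to its neighborhood \(E_v\) and the labeling L as a dict on edges.

```python
# Minimal sketch of the graph notation of Sect. 2.1 (illustration only).
# E: dict {v: set of out-neighbors E_v}, L: dict {(v, w): "good"/"bad"}.

def N_out(E, v, S=None, L=None, label=None):
    """N^out(v), optionally restricted to a subset S and/or to a label."""
    ws = set(E[v])
    if S is not None:
        ws &= set(S)
    if label is not None:
        ws = {w for w in ws if L[(v, w)] == label}
    return ws

def N_in(E, v, S=None, L=None, label=None):
    """N^in(v): all w with v in E_w, with the same optional filters."""
    ws = {w for w in E if v in E[w]}
    if S is not None:
        ws &= set(S)
    if label is not None:
        ws = {w for w in ws if L[(w, v)] == label}
    return ws

# The lower-case quantities are the corresponding cardinalities, e.g.
# n_in_S_bad = len(N_in(E, v, S=S, L=L, label="bad")).
```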

2.2 Random Graphs

We call a graph \(G = ([n],E)\) a randomized graph if each edge in E is actually a random variable. We are particularly interested in randomized graphs where (some or all of) the \(E_v\)’s are uniformly random and independent subsets \(E_v \subset [n] \setminus \{v\}\) of a given size d. For easier terminology, we refer to such neighborhoods \(E_v\) as being random and independent. G is called a random degree-d graph if \(E_v\) is a random subset of size d in the above sense for all \(v \in [n]\). The following properties are direct corollaries of the Chernoff-Hoeffding bound: the first follows from Chernoff-Hoeffding with independent random variables, and the second from Chernoff-Hoeffding with negatively correlated random variables (see Appendix A).

Corollary 1

Let \(G=([n],E)\) be a randomized graph with the property that, for some fixed \(v \in [n]\), the neighborhood \(E_v\) is a random subset of \([n] \setminus \{v\}\) of size d. Then, for any fixed subset \(T \subset [n]\), we have

$$\begin{aligned} \Pr \bigl [n^{\mathsf {out}}_T(v) \ge \mu +\varDelta \bigr ] \le 2^{-\frac{\varDelta ^2}{3\mu }} \quad \text { and } \quad \Pr \bigl [n^{\mathsf {out}}_T(v) \le \mu -\varDelta \bigr ] \le 2^{-\frac{\varDelta ^2}{2\mu }} , \end{aligned}$$

where \(\mu := \frac{|T|d}{n}\).

Corollary 2

Let \(G=([n],E)\) be a randomized graph with the property that, for some fixed \(T \subset [n]\), the neighborhoods \(E_v\) for \(v \in T\) are random and independent of size d (in the sense as explained above). Then, for any \(v \not \in T\), we have

$$\begin{aligned} \Pr \bigl [n^{\mathsf {in}}_T(v) \ge \mu +\varDelta \bigr ] \le 2^{-\frac{\varDelta ^2}{3\mu }} \quad \text { and } \quad \Pr \bigl [n^{\mathsf {in}}_T(v) \le \mu -\varDelta \bigr ] \le 2^{-\frac{\varDelta ^2}{2\mu }} , \end{aligned}$$

where \(\mu := \frac{|T|d}{n}\).
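As a quick sanity check of Corollary 1 (an illustration with arbitrary parameters, not part of the scheme), one can sample a neighborhood \(E_v\) repeatedly and observe that \(n^{\mathsf {out}}_T(v)\) indeed concentrates around \(\mu = |T|d/n\):

```python
import random

n, d = 1000, 60
T = set(range(300))                    # any fixed subset of [n]
v = n - 1                              # a fixed vertex outside T
mu = len(T) * d / n                    # = 18 here

others = [w for w in range(n) if w != v]
samples = [len(set(random.sample(others, d)) & T) for _ in range(1000)]
print(mu, min(samples), max(samples))  # deviations are of order sqrt(mu)
```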

We will also encounter a situation where the set T may depend on the graph G; this will be in the context of a random but publicly known verification graph, where the adversary can then influence T dependent on G. The technical issue then is that conditioned on the set T, the neighborhood \(E_v\) may not be random anymore, so that we cannot apply the above two corollaries. Instead, we will then use the following properties, which require some more work to prove.

Lemma 1

Let \(G=([n],E)\) be a random degree-d graph, and let \(\alpha \) be such that \(\alpha ^2 d=24\log n\). Then, there exist no \(\gamma \in \frac{1}{n} \mathbb {Z} \cap \bigl [0, \frac{1}{2}\bigr ]\) and \(T \subset [n]\) of size \(|T| \ge (\gamma -\alpha )n\) with the property that

$$\begin{aligned} |\{v\in [n]: {n}^{\mathsf {in}}_{T}(v)< d(\gamma -2\alpha )\}|\ge \frac{\gamma n}{2} , \end{aligned}$$

except with probability \(n^{1-5n}\).

Proof

See appendix.

Lemma 2

Let \(G=([n],E)\) be a random degree-d graph, and let \(\alpha \) be such that \(\alpha ^2 d=24\log n\). Then, there exist no \(\gamma \in \frac{1}{n} \mathbb {Z} \cap \bigl [\frac{1}{\log n}, \frac{1}{2}\bigr ]\) and \(T \subset [n]\) of size \(|T| \le (\gamma -3\alpha )n\) with the property that

$$\begin{aligned} |\{v\in [n]: {n}^{\mathsf {in}}_{T}(v) \ge d(\gamma -2\alpha )\}|\ge \frac{\gamma n}{2} , \end{aligned}$$

except with probability \(n^{1-3n}\).

The proof goes along the very same lines as for Lemma 1.

2.3 Robust Secret Sharing

A robust secret sharing scheme consists of two interactive protocols: the sharing protocol \(\mathbf {Share}\) and the reconstruction protocol \(\mathbf {Rec}\). There are three different roles in this scheme, a dealer D, a receiver R and n parties labeled \(1,\ldots ,n\). The sharing protocol is executed by D and n parties: D takes as input a message \(\mathbf{msg}\), and each party \(i \in \{1,\ldots ,n\}\) obtains as output a share. Typically, D generates these shares locally and then sends to each party the corresponding share. The reconstruction protocol is executed by R and the n parties: each party is supposed to use its share as input, and the goal is that R obtains \(\mathbf{msg}\) as output. Ideally, the n parties simply send their shares to R—possibly using multiple communication rounds—and R then performs some local computation to reconstruct the message.

We want a robust secret sharing scheme to be secure in the presence of an active adversary who can corrupt up to t of the n parties. Once a party is corrupted, the adversary can see that party's share. In addition, in the reconstruction protocol, the corrupt parties can arbitrarily deviate from the protocol. The following captures the formal security requirements of a robust secret sharing scheme.

Definition 1

(Robust Secret Sharing). A pair \((\mathbf {Share}, \mathbf {Rec})\) of protocols is called a \((t,\delta )\)-robust secret sharing scheme if the following properties hold for any distribution of \(\mathbf{msg}\) (from a given domain).

  • Privacy: Before \(\mathbf {Rec}\) is started, the adversary has no more information on the shared secret \(\mathbf{msg}\) than he had before the execution of \(\mathbf {Share}\).

  • Robust reconstructability: At the end of \(\mathbf {Rec}\), the reconstructor R outputs \(\mathbf{msg}'=\mathbf{msg}\) except with probability at most \(\delta \).

As for the precise corruption model, we consider an adversary that can corrupt up to t of the n parties (but not the dealer and receiver). We consider the adversary to be rushing, meaning that the messages sent by the corrupt parties during any communication round in the reconstruction phase may depend on the messages of the honest parties sent in that round. Also, we consider the adversary to be adaptive, meaning that the adversary can corrupt parties one by one (each one depending on the adversary’s current view) and between any two rounds of communication, as long as the total number of corrupt parties is at most t. We point out that we do not allow the adversary to be “corruption-rushing”, i.e., to corrupt parties during a communication round, depending on the messages of (some of) the honest parties in this round, and to then “rush” and modify this round’s messages of the freshly corrupt parties.

2.4 Additional Building Blocks

We briefly recall a couple of techniques that we use in our construction. For more details, see Appendix B.

Message Authentication Codes. The construction uses unconditionally secure message authentication codes (MACs) that satisfy the usual authentication security, but which also feature a few additional properties: (1) an authentication tag \(\sigma \) is computed in a randomized way as a function \(MAC_{key}(m,r)\) of the message m, the key key, and freshly chosen randomness r, (2) it is ensured that for any \(\ell \) keys \(key_1,\ldots ,key_\ell \) (with \(\ell \) a parameter), the list of tags \(MAC_{key_1}(m,r),\ldots ,MAC_{key_\ell }(m,r)\) is independent of m over the choice of the random string r, and (3) for any message m and fixed randomness r, the tag \(MAC_{key}(m,r)\) is uniformly distributed (over the random choice of the key). The specific construction we use is the polynomial-evaluation construction

$$\begin{aligned} MAC_{(x,y)}: \mathbb {F}^a\times \mathbb {F}^\ell \rightarrow \mathbb {F}, (m,r) \mapsto \sum _{i=1}^{a} m_ix^{i+\ell }+\sum _{i=1}^{\ell }r_ix^i+y , \end{aligned}$$

with \(\mathbb {F}\) a finite field of appropriate size and the key being \(key = (x,y) \in \mathbb {F}^2\).
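For concreteness, here is a sketch of this MAC over a prime field (the prime and the parameters below are arbitrary illustrative choices; the scheme instantiates \(\mathbb {F}\) with \(\log |\mathbb {F}| = \log m + 3\log n\)). A forgery must hit a root of a nonzero polynomial of degree at most \(a+\ell \) in the secret evaluation point x, which happens with probability at most \((a+\ell )/|\mathbb {F}|\).

```python
import random

p = (1 << 61) - 1  # an illustrative prime; F = Z_p

def mac(key, m, r):
    """Tag = sum_i m_i x^(i+l) + sum_i r_i x^i + y over F_p, key = (x, y)."""
    x, y = key
    l = len(r)
    tag = y
    for i, ri in enumerate(r, start=1):   # blinding part: sum r_i x^i
        tag = (tag + ri * pow(x, i, p)) % p
    for i, mi in enumerate(m, start=1):   # message part: sum m_i x^(i+l)
        tag = (tag + mi * pow(x, i + l, p)) % p
    return tag

# Usage: up to l tags computed on the same (m, r) under independent keys
# reveal nothing about m, which is property (2) above.
a, l = 4, 8
m = [random.randrange(p) for _ in range(a)]
r = [random.randrange(p) for _ in range(l)]
key = (random.randrange(p), random.randrange(p))
sigma = mac(key, m, r)
```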

Robust Distributed Storage. Following [3, 8], the tags in the construction of our robust secret sharing scheme will be stored robustly yet non-privately; the latter is the reason why the extra privacy property (2) for the MAC is necessary. This design ensures that cheaters cannot lie about the tags that authenticate their shares to, say, provoke disagreement among honest parties about the correctness of the share of a dishonest party.

Formally, a robust distributed storage scheme is a robust secret sharing scheme but without the privacy requirement, and it can be achieved using a list-decodable code (see Appendix B or [8] for more details). Important for us will be that the share of each party i consists of two parts, \(p_i\) and \(q_i\), and robustness against a rushing adversary is achieved by first revealing \(p_i\) and only then, in a second communication round, \(q_i\). Furthermore, we can do with \(p_i\) and \(q_i\) that are (asymptotically) smaller than the message by a fraction 1/n, and with correct reconstruction except with probability \(2^{-\varOmega (\log ^2 n)}\).

3 The Robust Secret Sharing Scheme

3.1 The Sharing Protocol

Let t be an arbitrary positive integer and \(n=2t+1\). Let \(d = 600 \log ^3 n\). The message \(\mathbf{msg}\) to be shared is m bits long. We let \(\mathbb {F}\) be a field with \(\log |\mathbb {F}|=\log m+3\log n\), and we set \(a := \frac{m}{\log m+3\log n}\) so that \(\mathbf{msg}\in \mathbb {F}^a\). Our robust secret sharing scheme uses the following three building blocks: a linear secret sharing scheme \(\mathbf {Sh}\) that corresponds to a Reed-Solomon code of length n and dimension \(t+1\) over an extension field \(\mathbb {K}\) of \(\mathbb {F}\) with \([\mathbb {K}:\mathbb {F}]=a\), together with its corresponding error-correcting decoding algorithm \(\mathbf {Dec}\); the MAC construction from Theorem 6 with \(\ell = 10 d\); and the robust distributed storage scheme from Theorem 7. On input \(\mathbf{msg}\in \mathbb {F}^a\), our sharing protocol \(\mathbf {Share(msg)}\) works as follows.

  1. Let \((s_1,\ldots ,s_n)\leftarrow \mathbf {Sh(\mathbf{msg})}\) be the non-robust secret sharing of \(\mathbf{msg}\).

  2. Sample MAC randomness \(r_1,\ldots ,r_n \leftarrow \mathbb {F}^{10d}\) and repeat the following 5 times.

     (a) For each \(i\in [n]\), choose a random set \(E_i \subseteq [n]\setminus \{i\}\) of size d. If there exists \(j\in [n]\) with in-degree more than 2d, sample again (see the sketch after this protocol description).

     (b) For each \(i\in [n]\), sample a random MAC key \(key_{i,j} \in \mathbb {F}^2\) for each \(j\in E_i\), and set \(\mathcal {K}_i = (key_{i,j})_{j\in E_i}\).

     (c) Compute the MACs

      $$\begin{aligned} \sigma _{i\rightarrow j}=MAC_{key_{i,j}}(s_j,r_j)\in \mathbb {F}\quad \forall j\in E_i \end{aligned}$$

      and set \(\mathbf{tag}_i = (\sigma _{i\rightarrow j})_{ j\in E_i} \in \mathbb {F}^d\).

    Let \(E^{(m)}_i\), \(\mathcal {K}^{(m)}_{i}\) and \(\mathbf{tag}^{(m)}_{i}\) be the resulting choices in the m-th repetition.

  3. Set \(\mathbf{tag}=(\mathbf{tag}^{(m)}_i)_{m\in [5], i\in [n]}\in \mathbb {F}^{5nd}\), and use the robust distributed storage scheme to store \(\mathbf{tag}\) together with \(E^{(2)}\). Party i gets \(p_i\) and \(q_i\).

  4. For \(i\in [n]\), define \(\mathbf{s}_i = \bigl (s_i, E^{(1)}_i, E^{(3)}_i, E^{(4)}_i, E^{(5)}_i, \mathcal {K}^{(1)}_i,\ldots , \mathcal {K}^{(5)}_i, r_i, p_i,q_i\bigr )\) to be the share of party i. Output \((\mathbf{s}_1,\ldots ,\mathbf{s}_n)\).
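The rejection sampling of step 2(a) can be sketched as follows (our own illustration of the protocol step; since each in-degree is d in expectation and tightly concentrated, one expects only a few resampling rounds).

```python
import random

def sample_neighborhoods(n, d):
    """Step 2(a): sample all E_i, resampling until every vertex has
    in-degree at most 2d, so that no party's share is authenticated by
    more than 2d others."""
    while True:
        E = [random.sample([w for w in range(n) if w != v], d)
             for v in range(n)]
        indeg = [0] * n
        for Ev in E:
            for w in Ev:
                indeg[w] += 1
        if max(indeg) <= 2 * d:        # expected in-degree is d
            return E
```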

We emphasize that the topology of the graph \(G_2\), determined by the random neighborhoods \(E^{(2)}_i\), is stored robustly (yet non-privately). This means that the adversary will know \(G_2\), but dishonest parties cannot lie about it. For \(G_1, G_3, G_4, G_5\) it is the other way round: they remain private until revealed (see below), but a dishonest party i can then lie about \(E^{(m)}_i\).

3.2 The Reconstruction Protocol

The reconstruction protocol \(\mathbf {Rec}\) works as follows. First, using 5 rounds of communication, the different parts of the shares \((\mathbf{s}_1,\ldots ,\mathbf{s}_n)\) are gradually revealed to the reconstructor R:

  • Round 1: Every party i sends \((s_i, r_i, p_i)\) to the reconstructor R.

  • Round 2: Every party i sends \((q_i, E^{(1)}_i, \mathcal {K}^{(1)}_i,\mathcal {K}^{(2)}_i)\) to the reconstructor R.

  • Round 3: Every party i sends \((E^{(3)}_i, \mathcal {K}^{(3)}_i)\) to the reconstructor R.

  • Round 4: Every party i sends \(( E^{(4)}_i, \mathcal {K}^{(4)}_i)\) to the reconstructor R.

  • Round 5: Every party i sends \(( E^{(5)}_i, \mathcal {K}^{(5)}_i)\) to the reconstructor R.

Remark 1

We emphasize that since the keys for the authentication tags are announced after the Shamir/Reed-Solomon shares \(s_i\), it is ensured that the MAC does its job also in the case of a rushing adversary. Furthermore, it will be crucial that also the \(E^{(1)}_i\)’s are revealed in the second round only, so as to ensure that once the (correct and incorrect) Shamir shares are “on the table”, the \(E^{(1)}_i\)’s for the honest parties are still random and independent. Similarly for the \(E^{(m)}_i\)’s in the m-th round for \(m=3,4,5\). The graph \(G_2\) is stored robustly; hence, the adversary knows all of it but cannot lie about it.

Then, second, having received the shares of the n parties, the reconstructor R locally runs the reconstruction algorithm given in the box below.

[Algorithm box (figure a): the local reconstruction algorithm run by R.]

In a first step, this reconstruction algorithm considers the graphs \(G_1, G_2, G_3, G_4\) and all the authentication information, and turns these graphs into labeled graphs by marking edges as \(\mathtt{good}\) or \(\mathtt{bad}\) depending on whether the corresponding authentication verification works out. Then, it makes calls to various subroutines, which we describe and analyze one at a time. As indicated in the description of the reconstruction algorithm, the overall approach is to first find out whether the number p of passive parties is small or large, i.e., whether there is lots of redundancy or many errors in the Shamir shares, and then use a procedure that is tailored to that case. Basically speaking, there are three subroutines to handle p in three different ranges. The unique decoding algorithm \(\mathbf {Dec}(\mathbf{s})\) handles the case \(p \ge \frac{n}{4}\), where there is sufficient redundancy in the shares to uniquely decode (this is the trivial case, which we do not discuss any further below; we assume for the remainder that \(p \le \frac{n}{4}\)). The graph algorithm \(\mathrm {GraphB}\) handles the case \(p \le \frac{4n}{\log n}\), and the algorithm \(\mathrm {BigP}\) deals with \(p \in [\frac{n}{\log n}, \frac{n}{4}]\); there is some overlap in those two ranges, as we will not be able to pinpoint p precisely.
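The first step, turning each announced graph into a labeled graph, amounts to re-computing tags. A sketch, reusing the mac() illustration from Sect. 2.4 (the container layout here is hypothetical; s[j] is party j's share encoded as a list of field elements):

```python
def label_graph(E, K, TAG, s, r):
    """Mark edge (i, j) good iff the robustly stored tag verifies against
    the announced share s[j] and MAC randomness r[j]."""
    L = {}
    for i in range(len(E)):
        for j in E[i]:
            ok = TAG[(i, j)] == mac(K[(i, j)], s[j], r[j])
            L[(i, j)] = "good" if ok else "bad"
    return L
```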

In order to complete the description of the reconstruction procedure and to show that it does its job (except with at most constant probability), we will show the following in the upcoming sections.

  1. An algorithm Check that distinguishes “small” from “large” p.

  2. An algorithm BigP that, when run with \(\gamma = p/n\) and given that p is “large”, outputs a valid codeword \(\mathbf{c}\) for which \(c_i = s_i\) for all honest i, and thus which decodes to s. Since p is not known, this algorithm is run with all possible choices for p, and all the candidates for \(\mathbf{c}\) are collected.

  3. An algorithm Cand that finds the right \(\mathbf{c}\) in the above list of candidates.

  4. An algorithm GraphB that, when run with an honest party i and given that p is “small”, outputs the codeword corresponding to the correct secret s. This algorithm very much coincides with the algorithm used in [8] to deal with the case of a “small” p, except for an adjustment of the parameters. We defer the description of this algorithm to the appendix, as the security analysis is quite similar to that of the graph algorithm in BigP.

3.3 “Active” and “Passive” Dishonest Parties

As in previous work on the topic, for the analysis of our scheme, it will be convenient to distinguish between corrupt parties that announce the correct Shamir share \(s_i\) and the correct randomness \(r_i\) in the first round of the reconstruction phase (but may lie about other pieces of information) and corrupt parties that announce an incorrect \(s_i\) or \(r_i\). Following the terminology of previous work on the topic, the former parties are called passive and the latter are called active; we write P and A for the respective sets of passive and active parties, and H for the set of honest parties.

A subtle issue is the following. While the set A of active parties is determined and fixed after the first round of communication, the set of passive parties P may increase over time, since the adversary may keep corrupting parties as long as \(|A \cup P| \le t\), and make them lie in later rounds. Often, this change in P is no concern since many of the statements are in terms of \(H \cup P\), which is fixed like A. In the other cases, we have to be explicit about the communication round we consider, and P is then understood to be the set of passive parties during this communication round.

3.4 The Consistency Graphs

As in [8], using a lazy sampling argument, it is not hard to see that after every communication round (including the subsequent “corruption round”) in the reconstruction procedure, the following holds. Conditioned on anything that can be computed from the information announced up to that point, the neighborhoods \(E_i^{(m)}\) of the currently honest parties that are then announced in the next round are still random and independent. For example, conditioned on the set A of active parties and the set P of passive parties after the first round, the \(E_i^{(1)}\)’s announced in the second round are random and independent for all \(i \in H = [n] \setminus (A \cup P)\). Whenever we make probabilistic arguments, the randomness is drawn from these random neighborhoods. The only exception is the graph \(G_2\), which is robustly but non-privately stored, and which thus has the property that the \(E_i^{(2)}\)’s are random and independent for all parties, but not necessarily anymore when conditioned on, say, A and/or P.

Furthermore, by the security of the robust distributed storage of \(\mathbf{tag}\) (Theorem 7) and of the MAC (Theorem 6) with our choice of parameters, it is ensured that all of the labeled graphs \(G_1, \ldots ,G_5\) satisfy the following property except with probability \(O\bigl (\log ^3(n)/n^2\bigr )\). For any edge \((i,j)\) in any of these graphs \(G_m\), if i is honest at the time it announces \(E_i^{(m)}\), then \((i,j)\) is labeled \(\mathtt{good}\) whenever j is honest or passive. Also, \((i,j)\) is labeled \(\mathtt{bad}\) whenever j is active.

These observations give rise to the following definition, given a partition \([n] = H \cup P \cup A\) into disjoint subsets with \(|H| \ge t+1\).

Definition 2

A randomized labeled graph \(G = ([n],E)\) is called a degree-d consistency graph (w.r.t. the given partition) if the following two properties hold.

(Randomness):

The neighborhoods \(E_i = \{j \,|\, (i,j) \in E\}\) of the vertices \(i \in H\) are uniformly random and independent subsets of \([n]\setminus \{i\}\) of size d.

(Labelling):

For any edge \((i,j) \in E\) with \(i \in H\), if \(j \in H \cup P\) then \(L(i,j) = \mathtt{good}\) and if \(j \in A\) then \(L(i,j) = \mathtt{bad}\).

In order to emphasize the randomness of the neighborhoods \(E_i\) given the partition \([n] = H \cup P \cup A\) (and possibly some other information X considered at a time), we also speak of a fresh consistency graph (w.r.t. the partition and X). When we consider a variant of a consistency graph that is a random degree-d graph, i.e., the randomness property holds for all \(i \in [n]\), while the partition \([n] = H \cup P \cup A\) (and possibly some other information X considered at a time) may depend on the choice of the random edges, we speak of a random but non-fresh consistency graph.
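As an illustration, the labeling property of Definition 2 is easy to test for a given partition (a helper of our own, useful e.g. for simulations):

```python
def labeling_holds(E, L, H, P, A):
    """Check the labeling property of Definition 2: honest parties label
    edges into H or P good, and edges into A bad."""
    for i in H:
        for j in E[i]:
            expected = "bad" if j in A else "good"
            if L[(i, j)] != expected:
                return False
    return True
```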

Using this terminology, we can now capture the above remarks as follows:

Proposition 1

The graphs \(G_1,G_3,G_4,G_5\), as announced in the respective communication rounds, are fresh consistency graphs w.r.t. the partition \([n] = H \cup P \cup A\) given by the active and (at that time) passive parties and w.r.t. any information available to R or the adversary prior to the respective communication round, except that the labeling property may fail with probability \(O\bigl (\log ^3(n)/n^2\bigr )\) (independent of the randomness of the edges). On the other hand, \(G_2\) is a random but non-fresh consistency graph (where, again, the labeling property may fail with probability \(O\bigl (\log ^3(n)/n^2\bigr )\)).

In the following analysis we will suppress the \(O\bigl (\log ^3(n)/n^2\bigr )\) failure probability for the labeling property; we will incorporate it again in the end. Also, we take it as understood that the partition \([n] = H \cup P \cup A\) always refers to the honest, the passive and the active parties, respectively.

3.5 The Check Subroutine

Let A be the set of active parties (well defined after the first communication round), and let \(p := t - |A|\), the number of (potential) passive parties. The following subroutine distinguishes between \(p \ge \frac{n}{\log n}\) and \(p \le \frac{4n}{\log n}\). This very subroutine was already considered and analyzed in [8]; thus, we omit the proof. The intuition is simply that the number of good outgoing edges of the honest parties reflects the number of active parties.

[Algorithm box (figure b): the subroutine Check\((G, \epsilon )\).]

Proposition 2

[8]. Except with probability \(\epsilon _{check} \le 2^{-\varOmega (\epsilon d)}\), Check(\(G, \epsilon \)) outputs yes if \(p \ge \epsilon n\) and no if \(p \le \epsilon n/4\) (and either of the two otherwise).
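A plausible instantiation of Check, matching the intuition above (the thresholds are our reconstruction; the precise constants in [8] may differ): an honest party v expects \(n^{\mathsf {out}}(v,\mathtt{good}) \approx d\,|H\cup P|/n \approx d(\frac{1}{2}+\frac{p}{n})\), so a majority of parties clearing a threshold placed between the two cases witnesses a large p.

```python
def check(E, L, n, d, eps):
    """Output "yes" (p looks large) iff a majority of parties has good
    out-degree above a threshold between d(1/2 + eps/4) and d(1/2 + eps).
    Threshold is our own reconstruction, not necessarily that of [8]."""
    thr = d * (0.5 + 5 * eps / 8)
    above = sum(1 for v in range(n)
                if sum(1 for w in E[v] if L[(v, w)] == "good") >= thr)
    return "yes" if above > n // 2 else "no"
```

Since the \(t+1\) honest parties form a majority of the \(n=2t+1\) parties, the t dishonest parties cannot flip the majority vote on their own.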

3.6 The Cand Subroutine

For simplicity, we next discuss the algorithm Cand. Recall that the set of correct Shamir sharings forms a (Reed-Solomon) code with minimal distance \(t+1\), and \(\mathbf{s}\) collected by R is such a codeword, but with the coordinates in A (possibly) altered. The task of Cand is to find “the right” codeword \(\mathbf{c}\), i.e., the one with \(c_i = s_i\) for all \(i \not \in A\), out of a given list \(\mathcal {L}\) of codewords. The algorithm is given access to a “fresh” consistency graph, i.e., one that is still random when conditioned on the list \(\mathcal {L}\), and it is assumed that p is not too small.

[Algorithm box (figure c): the subroutine Cand\((\mathcal {L}, G, \mathbf{s})\); for each candidate \(\mathbf{c}\in \mathcal {L}\), it considers the agreement set \(S = \{i\in [n] : c_i = s_i\}\).]

Proposition 3

If \(p \ge \frac{n}{\log n}\), \(\mathcal {L}\) is a set of codewords of cardinality \(O(n^2)\) for which there exists \(\mathbf{c}\in \mathcal {L}\) with \(c_i = s_i\) for all \(i \in H \cup P\), and G is a fresh consistency graph, then Cand\((\mathcal {L}, G, \mathbf{s})\) outputs this \(\mathbf{c}\in \mathcal {L}\) except with probability \(\epsilon _{cand}\le e^{-\varOmega (\log ^2 n)}\).

Proof

Consider first a codeword \(\mathbf{c}\in \mathcal {L}\) for which \(c_i \ne s_i\) for some \(i \in H \cup P\). Then, due to the minimal distance of the code, \(|(H \cup P) \cap S| \le t\). Therefore,

$$\begin{aligned} |(H \cup P)\setminus S| \ge |H \cup P| - t > p \ge \frac{n}{\log n} . \end{aligned}$$

By the properties of G and using Corollary 1, this implies that for any \(v \in H\)

$$\begin{aligned} \Pr [n^{\mathsf {out}}_{[n] \setminus S}(v,\mathtt{good})=0] \le \Pr [n^{\mathsf {out}}_{(H \cup P)\setminus S}(v,\mathtt{good})=0] \le 2^{-\varOmega (\frac{d}{\log n})} , \end{aligned}$$

which is \(2^{-\varOmega (\log ^2 n)}\) by the choice of d. Taking a union bound over all such \(\mathbf{c}\in \mathcal {L}\) does not affect this asymptotic bound.

Next, if \(\mathbf{c}\in \mathcal {L}\) with \(c_i = s_i\) for all \(i \in H \cup P\), i.e., \([n] \setminus S \subseteq A\), then, by the properties of G,

$$\begin{aligned} n^{\mathsf {out}}_{[n]\setminus S}(v,\mathtt{good})\le n^{\mathsf {out}}_{A}(v,\mathtt{good}) = 0 \end{aligned}$$

for any \(v \in H \subseteq S\). This proves the claim. \(\square \)
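The proof suggests the following reconstruction of Cand (our sketch, following the two cases of the proof): for the right codeword, at least the \(t+1\) honest parties have no good out-edge into \([n]\setminus S\), while for a wrong codeword, except with small probability, at most the t dishonest parties do.

```python
def cand(candidates, E, L, s, t):
    """Return the codeword c for which at least t+1 parties have no good
    out-edge into the disagreement set [n] \ S (sketch of Cand)."""
    for c in candidates:
        S = {i for i in range(len(s)) if c[i] == s[i]}
        supporters = sum(
            1 for v in range(len(s))
            if all(w in S or L[(v, w)] == "bad" for w in E[v]))
        if supporters >= t + 1:
            return c
    return None
```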

4 The Algorithm for Big p

We describe and discuss here the algorithm BigP, which is invoked when p is large. We show that BigP works, i.e., outputs a codeword \(\mathbf{c}\) for which \(c_i = s_i\) for all \(i \in H \cup P\), and thus which decodes to the correct secret s, if it is given p as input. Since p is not known, in the local reconstruction procedure BigP is run with all possible choices for p, producing a list of codewords, from which the correct one can be found by means of Cand, as shown above.

[Algorithm box (figure d): the algorithm BigP.]

Below, we describe the different subroutines of BigP and show that they do what they are supposed to do. Formally, we will prove the following.

Theorem 1

If the number \(p := t - |A|\) of passive parties satisfies \(\frac{n}{\log n} \le p \le \frac{n}{4}\), and \(\gamma := \frac{p}{n}\), then BigP will output a list that contains the correct codeword except with probability \(\epsilon _{bigp} \le O(n^{-3})\). Moreover, it runs in time poly(mn).

For the upcoming description of the subroutines of BigP, we define the global constant

$$\begin{aligned} \alpha := \frac{1}{5\log n} \qquad \text {so that}\qquad \alpha ^2 d =\frac{600 \log ^3 n}{25\log ^2 n} = 24\log n . \end{aligned}$$

Also, recall that \(\frac{1}{\log n} \le \gamma = \frac{p}{n} \le \frac{1}{4}\).

4.1 Filter Out Active Parties

The goal of the algorithm Filter is to find a set V with no active parties and many honest parties. It has access to \(\gamma \) and to a fresh consistency graph.

[Algorithm box (figure e): the subroutine Filter\((G, \gamma )\).]

Proposition 4

If \(\gamma = (t - |A|)/n\) and G is a fresh consistency graph then Filter\((G, \gamma )\) outputs a set V that satisfies

$$\begin{aligned} |V\cap H| \ge |H| - t + (\gamma -\alpha )n \ge (\gamma -\alpha )n \qquad \text {and}\qquad V \cap A = \emptyset \end{aligned}$$
(1)

except with probability \(O(n^{-3})\).

We point out that the statement holds for the set of honest parties H as it is before Round 2 of the reconstruction procedure; the lower bound \((\gamma -\alpha )n\) will still hold after Round 2, since |H| remains larger than t.

Proof

By the property of G and using Corollary 1, recalling that \(\frac{|A|}{n} \le \frac{1-2\gamma }{2}\), we have

$$\begin{aligned}&\Pr \bigl [{n}^{\mathsf {out}}(v, \mathtt{bad})\ge \textstyle \frac{d(1-2\gamma +\alpha )}{2}\bigr ] = \Pr \bigl [{n}^{\mathsf {out}}_{A}(v, \mathtt{bad})\ge \frac{d(1-2\gamma +\alpha )}{2}\bigr ] \le 2^{-\alpha ^2 d/6} = n^{-4} \end{aligned}$$

for all \(v \in H\). Taking a union bound over all honest parties, we conclude that all \(v \in H\) are contained in T, except with probability \(n^{-3}\).

In order for an honest party \(v \in H\) to fail the test for being included in V, there must be \(d(1-\alpha )/2\) bad incoming edges, coming from dishonest parties in T. However, there are at most t dishonest parties in T, each one contributing at most \(d(1-2\gamma +\alpha )/2\) bad outgoing edges; thus, there are at most

$$\begin{aligned} \frac{t d(1-2\gamma +\alpha )}{d(1-\alpha )} \le t(1-2\gamma +2\alpha ) = t-(\gamma -\alpha )n \end{aligned}$$

honest parties excluded from V, where the inequality holds because

$$\begin{aligned} \frac{1-2\gamma +\alpha }{1-2\gamma +2\alpha } \le \frac{(1-2\gamma +\alpha ) + 2(\gamma -\alpha )}{(1-2\gamma +2\alpha ) + 2(\gamma -\alpha )} = 1 - \alpha , \end{aligned}$$

using \(\gamma - \alpha \ge 0\). This proves the claim on the number of honest parties in V.

For an active party \(v\in A\), again by the properties of G but using Corollary 2 now, it follows that

$$\begin{aligned} \Pr \bigl [{n}_T^{\mathsf {in}}(v, \mathtt{bad}) \le \textstyle \frac{d(1-\alpha )}{2}\bigr ] \le \Pr \bigl [{n}^{\mathsf {in}}_{H}(v, \mathtt{bad})\le \frac{d(1-\alpha )}{2}\bigr ] \le 2^{-\alpha ^2 d/4} = n^{-6} , \end{aligned}$$

recalling that \(\frac{|H|}{n} \ge \frac{1}{2}\) and \(H\subseteq T\). Taking the union bound, we conclude that V contains no active party, except with probability \(O(n^{-5})\).
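For concreteness, here is our reading of Filter, with the two thresholds read off from the proof above (the actual algorithm box may differ in constants): T keeps the parties whose bad out-degree is consistent with honest behavior, and V then drops everyone accused by too many parties in T.

```python
def filter_active(E, L, n, d, gamma, alpha):
    """Sketch of Filter(G, gamma); thresholds as in the proof of Prop. 4."""
    def bad_out(v):
        return sum(1 for w in E[v] if L[(v, w)] == "bad")
    T = {v for v in range(n)
         if bad_out(v) < d * (1 - 2 * gamma + alpha) / 2}
    def bad_in_from_T(v):
        return sum(1 for u in T if v in E[u] and L[(u, v)] == "bad")
    return {v for v in T if bad_in_from_T(v) < d * (1 - alpha) / 2}
```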

4.2 Find the Correct Codeword—In Some Cases

On input the set V as produced by Filter above, the goal of Find is to find the correct decoding of \(\mathbf{s}\). Find is given access to \(\gamma \) and to a modified version of a consistency graph G. Here, the consistency graph has uniformly random neighborhoods \(E_i\) for all parties, but the set V as well as the partition of [n] into honest, passive and active parties may depend on the topology of G. Indeed, this is the property of the graph \(G_2\), on which Find is eventually run.

[Algorithm box (figure f): the subroutine Find\((G, V, \gamma , \mathbf{s})\).]

We will show that the algorithm Find succeeds as long as

$$\begin{aligned} |V\cap P|\le (\gamma -3\alpha )n \qquad \text {or}\qquad |V|\ge (2\gamma +2\alpha )n . \end{aligned}$$
(2)

This condition implies that honest parties outnumber passive parties by at least \(2\alpha n\) in V. We notice that \(2\alpha n\) is a very narrow margin, which may become useless if passive parties in V can lie about their neighborhoods by directing all their outgoing edges to active parties; this behavior may result in many active parties being admitted to \(V_1\). To prevent passive parties in V from lying about their neighborhoods, we make use of a non-fresh consistency graph G whose topology is publicly known and cannot be modified. With the help of this graph G, we first prove that under condition (2), \(V\cup V_1\) contains many honest and passive parties with high probability. Then, we further prove that under the same condition, \(V\cup V_1\) contains very few active parties with high probability.

We stress that in the following statement, the partition \([n] = H \cup P \cup A\) (and thus \(\gamma \)) and the set V may depend on the choice of the (random) edges in G.

Lemma 3

For \(\gamma = (t - |A|)/n\), \(V \subseteq [n]\) and G a random but non-fresh consistency graph, the following holds except with probability \(2^{-\varOmega (n)}\). If

$$\begin{aligned} |V \cap H| \ge (\gamma -\alpha )n \qquad \text {and}\qquad |V| < (2\gamma +2\alpha )n , \end{aligned}$$

or

$$\begin{aligned} |V| \ge (2\gamma +2\alpha )n , \end{aligned}$$

then \(V_1\) produced by Find\((G, V, \gamma )\) satisfies \(|(H \cup P) \setminus (V \cup V_1)| \le \frac{\gamma n}{2}\).

Proof

Consider \(T := V \cap H\). Note that for \(v \in H \cup P\)

$$\begin{aligned} {n}^{\mathsf {in}}_{T}(v) = {n}^{\mathsf {in}}_{V \cap H}(v) = {n}^{\mathsf {in}}_{V \cap H}(v, \mathtt{good}) \le {n}^{\mathsf {in}}_{V}(v, \mathtt{good}) , \end{aligned}$$

and thus

$$\begin{aligned} B := \big \{v\in H \cup P: {n}^{\mathsf {in}}_{V}(v, \mathtt{good})< d(\gamma -3\alpha ) \big \} \subseteq \big \{v\in H \cup P: {n}^{\mathsf {in}}_{T}(v) < d(\gamma -3\alpha ) \big \} . \end{aligned}$$

By Lemma 1 the following holds, except with probability \(2^{-\varOmega (n)}\). If \(|V\cap H|\ge (\gamma -\alpha )n\) then \(|B| < \frac{\gamma n}{2}\). But also, by definition of \(V_1\) in case \(|V| < (2\gamma +2\alpha )n\), \((H \cup P) \setminus (V \cup V_1) \subseteq B\). This proves the claim under the first assumption on V.

The proof under the second assumption goes along the same lines, noting that the lower bound on |V| then implies that \(|V\cap H|\ge |V| - |P| \ge (\gamma +2\alpha )n\), offering a similar gap to the condition \({n}^{\mathsf {in}}_{V}(v, \mathtt{good}) < d(\gamma +\alpha )\) in the definition of \(V_1\) then.

We proceed to our second claim.

Lemma 4

For \(\gamma = (t - |A|)/n\), \(V \subseteq [n]\) and G a random but non-fresh consistency graph, the following holds except with probability \(2^{-\varOmega (n)}\). If \(V \cap A = \emptyset \), as well as

$$\begin{aligned} |V\cap P|\le (\gamma -3\alpha )n \qquad \text {and}\qquad |V| < (2\gamma +2\alpha )n \end{aligned}$$

or

$$\begin{aligned} |V| \ge (2\gamma +2\alpha )n , \end{aligned}$$

then \(V_1\) produced by Find\((G, V, \gamma )\) satisfies \(|V_1 \cap A| \le \frac{\gamma n}{2}\).

Proof

Consider \(T := V \cap P\). Note that for \(v \in A\)

$$\begin{aligned} {n}^{\mathsf {in}}_{T}(v) = {n}^{\mathsf {in}}_{V \cap P}(v) \ge {n}^{\mathsf {in}}_{V \cap P}(v, \mathtt{good}) = {n}^{\mathsf {in}}_{V}(v, \mathtt{good}) , \end{aligned}$$

and thus

$$\begin{aligned} C := \big \{v\in A: {n}^{\mathsf {in}}_{V}(v, \mathtt{good}) \ge d(\gamma -2\alpha ) \big \} \subseteq \big \{v\in A: {n}^{\mathsf {in}}_{T}(v) \ge d(\gamma -2\alpha ) \big \} \, . \end{aligned}$$

By Lemma 2 the following holds, except with probability \(2^{-\varOmega (n)}\). If \(|V\cap P|\le (\gamma -3\alpha )n\) then \(|C| < \frac{\gamma n}{2}\). But also, by definition of \(V_1\) in case \(|V| < (2\gamma +2\alpha )n\), \(V_1 \cap A \subseteq C\). This proves the claim under the first assumption on V.

The proof under the second assumption goes along the same lines, noting that \(|V\cap P|\le |P| \le \gamma n\) offers a similar gap to the condition \({n}^{\mathsf {in}}_{V}(v, \mathtt{good}) < d(\gamma +\alpha )\) in the definition of \(V_1\) then.

The following proposition is a consequence of Lemma 3 and Lemma 4. The statement holds for P and H after Round 2 in the reconstruction procedure.

Proposition 5

The following holds except with probability \(2^{-\varOmega (n)}\). If (1) is satisfied, i.e., \(|V\cap H| \ge (\gamma -\alpha )n\) and \(V \cap A = \emptyset \), and additionally

$$\begin{aligned} |V\cap P|\le (\gamma -3\alpha )n \qquad \text {or}\qquad |V|\ge (2\gamma +2\alpha )n \end{aligned}$$

holds, and if G is a non-fresh consistency graph, then Find\((G, V, \gamma , \mathbf{s})\) will output the correct codeword (determined by the \(s_i\) for \(i \in H\)).

Proof

It follows from Lemma 3 and Lemma 4 that, except with the claimed probability, \(|(V \cup V_1)\cap A|\le \frac{\gamma n}{2}\) and \(|(V \cup V_1)\cap (P\cup H)|\ge t+1+\frac{\gamma n}{2}\). Therefore, the punctured codeword, obtained by restricting to the coordinates in \(V \cup V_1\), has more redundancy than errors, thus unique decoding works and produces the correct codeword.
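Putting the two lemmas together, the selection of \(V_1\) in Find can be sketched as follows (our reconstruction; the thresholds \(d(\gamma -2\alpha )\) and \(d(\gamma +\alpha )\) are read off from Lemmas 3 and 4, and the exact constants in the actual box may differ).

```python
def find_V1(E, L, V, n, d, gamma, alpha):
    """Sketch of the V_1 selection in Find: admit v outside V if enough
    parties in V vouch for it with good edges; the threshold is raised
    when V itself is large."""
    def good_in_from_V(v):
        return sum(1 for u in V if v in E[u] and L[(u, v)] == "good")
    if len(V) >= (2 * gamma + 2 * alpha) * n:
        thr = d * (gamma + alpha)
    else:
        thr = d * (gamma - 2 * alpha)
    return {v for v in range(n) if v not in V and good_in_from_V(v) >= thr}

# Find then uniquely decodes the Reed-Solomon codeword restricted to the
# coordinates in V | find_V1(...), cf. the proof of Proposition 5.
```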

Remark 2

Given that (1), i.e., \(|V \cap H| \ge (\gamma - \alpha )n\) and \(V \cap A = \emptyset \), is promised to be satisfied (except with small probability), the only case when Find fails is \(|V|<(2\gamma +2\alpha )n\) yet \(|V\cap P| > (\gamma -3\alpha )n\), where P is the set of passive parties before the third communication round. These conditions together with (1) imply that

$$\begin{aligned} |V| = |V\cap H| + |V\cap P| \ge (\gamma -\alpha )n + (\gamma -3\alpha )n = (2\gamma -4\alpha )n \end{aligned}$$

and

$$\begin{aligned} |V\cap H|=|V|-|V \cap P|\le (2\gamma +2\alpha )n-(\gamma -3\alpha )n= (\gamma +5\alpha )n . \end{aligned}$$

This holds for the set of honest parties H even before Round 4, as the set of honest parties before Round 4 is a subset of that before Round 3. Combining this observation with Proposition 4, we conclude that \(|V\cap H|\in [(\gamma -\alpha )n, (\gamma +5\alpha )n]\) holds for the set of honest parties H before Round 4, i.e., the number of honest parties within V is in the above range.

We can thus conclude that if Find fails then the set \(W := [n] \setminus V\) satisfies

$$\begin{aligned} (1-2\gamma -2\alpha )n \le |W| \le (1-2\gamma +4\alpha )n \end{aligned}$$

and, given that \(|W \cap H| = t+1 - |V \cap H|\),

$$\begin{aligned} (\frac{1}{2}-\gamma -5\alpha )n \le |W \cap H| \le (\frac{1}{2}-\gamma +\alpha )n + 1 . \end{aligned}$$

As mentioned before, this holds for the set of honest parties H before Round 4. Moreover,

$$\begin{aligned} |W\cap P|=|P|-|V\cap P|=|P\cup H|-|H|-|V\cap P|\le \gamma n-(\gamma -3\alpha )n= 3\alpha n , \end{aligned}$$

as \(|P|\le \gamma n\). We point out that the statement \(|W\cap P|\le 3\alpha n\) holds even for the set of passive parties P before Round 4, as \(|P\cup H|=t+1+\gamma n\), |H| remains at least \(t+1\), and \(|V\cap P|\) remains bigger than \((\gamma -3\alpha )n\). In the following section we show that if W satisfies the above constraints, then the algorithm Graph finds the correct decoding of \(\mathbf{s}\) (when initiated with an honest party v and two fresh consistency graphs).

4.3 Graph Algorithm

Recall that \({n'}^{\,\mathsf {out}}_{W}\) refers to \(n^{\mathsf {out}}_W\) but for the graph \(G'\) rather than G, and similarly for \({n'}_{W}^{\,\mathsf {in}}\). This graph algorithm resembles the one in [8], as both share the same goal: finding a subset of parties that contains all honest parties and only a few dishonest parties, the majority of which are passive. The differences lie in the range of parameters: the graph algorithm in this paper takes a subset of the n parties as input instead of all n parties, and the honest parties may not form a majority in this subset.

[Algorithm box (figure g): the subroutine Graph\((G, G', W, \epsilon , v)\).]

In this section, we assume that the graph algorithm Graph\((G, G', W, \epsilon , v)\) starts with an honest party v. Set \(c=\frac{1}{2}-\gamma \), so that \(c\in [\frac{1}{4}, \frac{1}{2}]\) as \(\gamma \le \frac{1}{4}\). Note that P and H now denote the sets of passive and honest parties before Round 3. According to Remark 2, it suffices to prove the correctness of this graph algorithm under the condition that

$$\begin{aligned} |W|\in [(2c-2\alpha )n,(2c+4\alpha )n],\quad |W\cap H|\in [(c-5\alpha )n, (c+\alpha )n],\quad |W\cap P|\le 3\alpha n . \end{aligned}$$
(3)

Recall that \(\alpha = \frac{1}{5 \log n}\) and \(\alpha ^2 d = 24 \log n\). We also note that by Remark 2, the above condition also holds for the sets of passive and honest parties before Round 4. In what follows, when we claim that some event happens with high probability, we mean that it holds for all sets W, P and H in the above range.

Let \(H_W=H\cap W\) and \(P_W=P\cap W\). The subset of active parties in W is still A. The expected out-degree of the vertices in \(G|_W\) and \(G'|_W\) is \(d\frac{|W|}{n}\in [(2c-2\alpha )d, (2c+4\alpha )d]\), and (due to the MACs) the edges from honest parties to active parties are labeled bad, while the edges from honest parties to honest or passive parties are labeled good.

We also recall that whether a corrupt party i is passive or active, i.e., in P or in A, depends on \(s_i\) and \(r_i\) only, as announced in the first communication round in the reconstruction phase. Note that a passive party may well lie about, say, his neighborhood \(E_i\). Our reasoning only relies on the neighborhoods of the honest parties, which are random and independent conditioned on the adversary’s view, as explained in Proposition 1.

Theorem 2

Under the claim of Proposition 1, and assuming that v is honest and W satisfies (3), the algorithm will output a correct codeword except with probability \(\epsilon _{graph} \le n^{-15}\). Moreover, it runs in time poly(mn).

The proof follows almost literally the one of [8], adjusted to the parameter regime considered here; for completeness, we provide it. The proof of Theorem 2 consists of the analysis of Steps ii to v and of the graph expansion argument. The analysis of Steps ii to v is deferred to the appendix.

4.4 Graph Expansion

We start by analyzing the expansion property of \(G|_{H_W}\), the subgraph of G restricted to the set of honest parties \(H_W\).

Lemma 5

(Expansion property of \(G|_{H_W}\)). If \(H'\subset H_W\) is such that \(|H'|\le \frac{\alpha |H_W|}{2d}\) and the \(E_v\)’s for \(v \in H'\) are still random and independent in G when given \(H'\) and H, then

$$\begin{aligned} n^{\mathsf {out}}_H(H') := \bigg |\bigcup _{v \in H'} N^{\mathsf {out}}_H(v)\bigg | \ge (c-7\alpha )d|H'| \end{aligned}$$

except with probability \(O(n^{-23})\).

[Algorithm box (figure h): the graph expansion subroutine.]

Proof

By Remark 2, we know that the size of \(H_W\) is at least \((c-5\alpha )n\). By the assumption on the \(E_i\)’s and by Corollary 1, for any vertex \(v \in H'\), \(\Pr [n^{\mathsf {out}}_{H_W}(v) < (c-6\alpha )d]\le 2^{-\alpha ^2 d/2c}=O(n^{-24})\), as \(\alpha ^2 d=24\log n\) and \(c\le 1/2\). Taking the union bound, this holds for all \(v \in H'\) except with probability \(O(n^{-23})\). In the remainder of the proof, we may thus assume that \(N^{\mathsf {out}}_{H_W}(v)\) consists of \(d' := (c-6\alpha )d\) random outgoing edges.

Let \(N:=|H_W|\), \(N':=|H'|\), and let \(v_1,\ldots , v_{d'N'}\) denote the list of neighbors of all \(v \in H'\), with repetition. To prove the conclusion, it suffices to bound the probability \(p_f\) that more than \(\alpha dN'\) of these \(d'N'\) vertices are repeated.

The probability that a vertex \(v_i\) is equal to one of \(v_1,\ldots ,v_{i-1}\) is at most

$$\begin{aligned} \frac{i}{N-1} \le \frac{d'N'}{N-1} \le (c-6\alpha )d \cdot \frac{\alpha N}{2d} \cdot \frac{1}{N-1} \le \frac{\alpha }{4} , \end{aligned}$$

using \(N' \le \frac{\alpha N}{2d}\) and \(c\le \frac{1}{2}\).

Taking the union bound over all vertex sets of size \(\alpha dN'\) among these \(d'N'\) neighbors, we find that \(p_f\) is at most

$$\begin{aligned}&\left( {\begin{array}{c}d'N'\\ \alpha dN'\end{array}}\right) \Big (\frac{\alpha }{4}\Big )^{\alpha dN'}\le \left( {\begin{array}{c}dN'\\ \alpha dN'\end{array}}\right) \Big (\frac{\alpha }{4}\Big )^{\alpha dN'} \le 2^{dN'H(\alpha )+\alpha dN'(\log \alpha -2)}\\&\le 2^{\alpha dN'(\frac{1}{\ln 2}-2+O(\alpha ))} \le 2^{-\varOmega (\alpha dN')}\le 2^{-\varOmega (\log ^2 n)}. \end{aligned}$$

The first inequality is due to \(\left( {\begin{array}{c}n\\ k\end{array}}\right) \le 2^{nH(\frac{k}{n})}\) and the second due to

$$\begin{aligned}&H(\alpha )=-\alpha \log \alpha -(1-\alpha )\log (1-\alpha )=-\alpha \log \alpha +\frac{\alpha }{\ln 2}+O(\alpha ^2)\\ \end{aligned}$$

for \(\alpha =\frac{1}{5\log n}\) and the Taylor series \(\ln (1-\alpha )=-\alpha +O(\alpha ^2)\).

5 Parallel Repetition

The failure probability \(\delta \) of our local reconstruction scheme comprises the failure probability \(\epsilon _{tag}\) of recovering \(\mathbf{tag}\), the failure probability \(\epsilon _{mac}\) of the labeling of the consistency graphs, the failure probability \(\epsilon _{graph}\) of the algorithm GraphB, the failure probability \(\epsilon _{bigp}\) of the algorithm BigP, the failure probability \(\epsilon _{cand}\) of the algorithm Cand, and the failure probability \(\epsilon _{check}\) of Check. Therefore, we have

$$\begin{aligned} \delta =\epsilon _{mac}+\epsilon _{tag}+\epsilon _{check}+(t+1)\epsilon _{graph}+\epsilon _{bigp}+\epsilon _{cand}=O\Bigl (\frac{\log ^3 n}{n^2}\Bigr ). \end{aligned}$$

Note that our graph has degree \(d=\varOmega (\log ^3 n)\) and \(\mathbb {F}\) is a finite field with \(mn^3\) elements. The total share size is \(m+O(d(\log n+\log m))=m+O(\log ^4 n+\log m \log ^3 n)\). We summarize our result as follows.

Theorem 3

The scheme \((\mathbf {Share}, \mathbf {Rec})\) is a \((2t+1)\)-party \((t,O(\frac{\log ^3 n}{n^2}))\)-robust secret sharing scheme with running time poly(mn) and share size \(m+O(\log ^4 n+\log m \log ^3 n)\).
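For intuition, the overhead in Theorem 3 can be accounted for item by item (a back-of-the-envelope count of our own, using \(\log |\mathbb {F}| = \log m + 3\log n\) and the parameters of Sect. 3.1):

$$\begin{aligned} \underbrace{4d\log n}_{E^{(1)}_i,E^{(3)}_i,E^{(4)}_i,E^{(5)}_i} + \underbrace{10d\log |\mathbb {F}|}_{\mathcal {K}^{(1)}_i,\ldots ,\mathcal {K}^{(5)}_i} + \underbrace{10d\log |\mathbb {F}|}_{r_i} + \underbrace{O(d\log |\mathbb {F}|)}_{p_i,\,q_i} = O(d(\log n+\log m)) , \end{aligned}$$

which is \(O(\log ^4 n+\log ^3 n\log m)\) for \(d = 600\log ^3 n\).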

The error probability can be made arbitrarily small by several independent executions of \((\mathbf {Share}, \mathbf {Rec})\), except that the same Shamir shares \(s_i\) are used in all the instances. This can be done in the same manner as in [8] or [3]; we skip the details and refer the interested reader to [8] or [3]. In conclusion, we obtain the following main result.

Theorem 4

For any set of positive integers \(t,n,\kappa ,m\) with \(t < n/2\), there exists an n-party \((t,2^{-\kappa })\)-robust secret sharing scheme against a rushing adversary with secret size m, share size \(m+O(\kappa (\log ^4 n+\log ^3 n\log m))\), and running time poly(mn).