1 Introduction

Zero-knowledge proofs, introduced in the seminal work of Goldwasser, Micali and Rackoff [GMR89], are a remarkable and incredibly influential notion. Loosely speaking, a zero-knowledge proof lets a prover \(\text {\textsf {P}}\) convince a verifier \(\text {\textsf {V}}\) of the validity of some statement without revealing any additional information.

In this work we focus on statistical zero-knowledge proofs. These proof-systems simultaneously provide unconditional soundness and zero-knowledge:

  • Even a computationally unbounded prover \(\text {\textsf {P}}^*\) cannot convince \(\text {\textsf {V}}\) to accept a false statement (except with some negligible probability).

  • Any efficient, but potentially malicious, verifier \(\text {\textsf {V}}^*\) learns nothing in the interaction (beyond the validity of the statement) in the following strong, statistical, sense: there exists an algorithm, called the simulator, which can efficiently simulate the entire interaction between \(\text {\textsf {V}}^*\) and \(\text {\textsf {P}}\) based only on the input x, so that the simulation is indistinguishable from the real interaction even to a computationally unbounded distinguisher.

The class of promise problemsFootnote 1 having a statistical zero-knowledge proof is denoted by \(\mathsf {SZK}\). This class contains many natural problems, including many of the problems on which modern cryptography is based, such as (relaxations of) integer factoring [GMR89], discrete logarithm [GK93, CP92] and lattice problems [GG00, MV03, PV08, APS18].

Since the study of \(\mathsf {SZK}\) was initiated in the early 1980s, many surprising and useful structural properties of this class have been discovered (see, e.g., [For89, AH91, Oka00, SV03, GSV98, GV99, NV06, OV08]), and several applications have been found for hard problems in this (and related) classes (for example, see [Ost91, OW93, BDRV18a, BDRV18b, KY18, BBD+20]). It is known to be connected to various cryptographic primitives [BL13, KMN+14, LV16, PPS15] and algorithmic and complexity-theoretic concepts [Dru15], and has consequently been used to show conditional impossibility results. In particular, a notable and highly influential development was the discovery of natural complete problems for \(\mathsf {SZK}\) [SV03, GV99].

In this work we are interested in the following natural question. Suppose that a particular problem \(\varPi \) has an \(\mathsf {SZK}\) protocol. This means that there is a way to efficiently prove that \(x \in \varPi \) in zero-knowledge. However, in many scenarios, one wants to be convinced not only that a single instance belongs to \(\varPi \) but rather that k different inputs \(x_1,\dots ,x_k\) all belong to \(\varPi \). One way to do so is to simply run the underlying protocol for \(\varPi \) k times, in sequence, once for each input \(x_i\).Footnote 2 However, it is natural to ask whether one can do better. In particular, assuming that the \(\mathsf {SZK}\) protocol for \(\varPi \) has communication complexity m, can one prove (in statistical zero-knowledge) that \(x_1,\dots ,x_k \in \varPi \) with communication complexity \(\ll k \cdot m\)? We refer to this problem as batch verification for \(\mathsf {SZK}\).

We view batch verification of \(\mathsf {SZK}\) as being of intrinsic interest, and potentially of use in the study of the structure of \(\mathsf {SZK}\). Beyond that, batch verification of \(\mathsf {SZK}\) may be useful to perform various cryptographic tasks, such as batch verification of digital signature schemes [NMVR94, BGR98, CHP12] or batch verification of well-formedness of public keys (see, e.g., [GMR98]).

1.1 Our Results

We show that non-trivial batch verification is possible for a large and natural subset of languages in \(\mathsf {SZK}\). Specifically, we consider the class of promise problems having non-interactive statistical zero-knowledge proofs. A non-interactive statistical zero-knowledge proof [BFM88] is a variant of \(\mathsf {SZK}\) in which the verifier and the prover are given access to a uniformly random common random string (\(\mathsf {CRS}\)). Given this \(\mathsf {CRS}\) and an input x, the prover generates a proof string \(\pi \) which it sends to the verifier. The verifier, given x, the \(\mathsf {CRS}\), and the proof string \(\pi \), then decides whether to accept or reject. In particular, no additional interaction is allowed other than the proof \(\pi \). Zero-knowledge means that it is possible to simulate the verifier’s view (which consists of the \(\mathsf {CRS}\) and proof \(\pi \)) so that the simulation is statistically indistinguishable from the real interaction. The corresponding class of promise problems is abbreviated as \(\mathsf {NISZK}\).

Remark 1.1

An \(\mathsf {NISZK}\) for a problem \(\varPi \) is equivalent to a two-round public-coin honest-verifier \(\mathsf {SZK}\) protocol. Recall that honest-verifier zero-knowledge means that the honest verifier learns essentially nothing in the interaction, but a malicious verifier may be able to learn non-trivial information.

The class \(\mathsf {NISZK}\) contains many natural and basic problems, such as variants of the quadratic residuosity problem [BSMP91, DSCP94] and lattice problems [PV08, APS18]. It is also known to contain complete problems [SCPY98, GSV99], related to the known complete problems for \(\mathsf {SZK}\).

Our main result is an honest-verifier statistical zero-knowledge protocol for batch verification of any problem in \(\mathsf {NISZK}\). In order to state the result more precisely, we introduce the following definition.

Definition 1.2

Let \(\varPi = ({\mathrm {YES}},{\mathrm {NO}})\) be a promise problem, where \(\mathrm {YES}= ( \mathrm {YES}_{n})_{n \in \mathbb {N}}\) and \(\mathrm {NO}= ( \mathrm {NO}_{n})_{n \in \mathbb {N}}\), and let \(k = k(n) \in \mathbb {N}\). We define the promise problem \(\varPi ^{\otimes k} = \big ( \mathrm {YES}^{\otimes k},\mathrm {NO}^{\otimes k} \big )\), where \(\mathrm {YES}^{\otimes k} = (\mathrm {YES}_n^{\otimes k})_{n \in \mathbb {N}}\), \(\mathrm {NO}^{\otimes k} = (\mathrm {NO}^{\otimes k}_n)_{n \in \mathbb {N}}\) and

$$ \mathrm {YES}^{\otimes k}_n = \left( \mathrm {YES}_n \right) ^k $$

and

$$ \mathrm {NO}^{\otimes k}_n = \left( \mathrm {YES}_{n} \cup \mathrm {NO}_{n} \right) ^k \backslash \left( \mathrm {YES}_{n} \right) ^k. $$

That is, instances of \(\varPi ^{\otimes k}\) are k instances of \(\varPi \), where a \(\mathrm {YES}\) instance consists of k \(\mathrm {YES}\) instances of \(\varPi \), and a \(\mathrm {NO}\) instance contains at least one \(\mathrm {NO}\) instance of \(\varPi \).Footnote 3

With the definition of \(\varPi ^{\otimes k}\) in hand, we are now ready to formally state our main result:

Theorem 1.3

(Informally Stated, see Theorem 3.1). Suppose that \(\varPi \in \mathsf {NISZK}\). Then, for every \(k = k(n) \in \mathbb {N}\), there exists an (interactive) honest-verifier \(\mathsf {SZK}\) protocol for \(\varPi ^{\otimes k}\) with communication complexity \(\mathsf {poly}(n)+k \cdot \mathsf {poly}(\log {n},\log {k})\), where n refers to the length of a single instance and \(\mathsf {poly}\) refers to a fixed polynomial independent of k.

The verifier’s running time is \(k \cdot \mathsf {poly}(n)\) and the number of rounds is O(k).

We emphasize that our protocol for \(\varPi ^{\otimes k}\) is interactive and honest-verifier statistical zero-knowledge (\(\mathsf {HVSZK}\)). Since we start with an \(\mathsf {NISZK}\) protocol (which as mentioned above is a special case of \(\mathsf {HVSZK}\)), it is somewhat expected that the resulting batch verification protocol is only \(\mathsf {HVSZK}\). Still, obtaining a similar result to Theorem 1.3 that achieves malicious-verifier statistical zero-knowledge is a fascinating open problem (see Sect. 1.4 for additional open problems). We mention that while it is known [GSV98] how to transform any \(\mathsf {HVSZK}\) protocol into a full-fledged \(\mathsf {SZK}\) protocol (i.e., one that is zero-knowledge even wrt a malicious verifier), this transformation incurs a polynomial overhead that we cannot afford.

1.2 Related Works

Batch Verification via \(\mathbf {IP}=\mathsf {PSPACE}\). A domain in which batch computing is particularly easy is bounded-space computation: if a language \(\mathcal {L}\) can be decided in space s, then k instances of \(\mathcal {L}\) can be solved in space \(s+\log (k)\) (by reusing space). Using this observation, the \(\mathbf {IP}=\mathsf {PSPACE}\) theorem [LFKN92, Sha92] yields an efficient interactive proof for batch verification of any problem in \(\mathsf {PSPACE}\). However, the resulting protocol has several major drawbacks. In particular, it does not seem to preserve zero-knowledge, which makes it unsuitable for the purposes of our work.

Batch Verification with Efficient Prover. Another caveat of the \(\mathbf {IP}=\mathsf {PSPACE}\) approach is that it does not preserve the efficiency of the prover. That is, even if we started with a problem that has an interactive proof with an efficient prover, the batch verification protocol stemming from the \(\mathbf {IP}=\mathsf {PSPACE}\) theorem has an inefficient prover.

Reingold et al. [RRR16, RRR18] considered the question of whether batch verification of \(\mathsf {NP}\) proofs with an efficient prover is possible, assuming that the prover is given the \(\mathsf {NP}\) witnesses as an auxiliary input. These works construct such an interactive batch verification protocol for all problems in \(\mathsf {UP}\subseteq \mathsf {NP}\) (i.e., languages in \(\mathsf {NP}\) in which YES instances have a unique witness). In particular, the work of [RRR18] yields a batch verification protocol for \(\mathsf {UP}\) with communication complexity \(k^\delta \cdot \mathsf {poly}(m)\), where m is the original \(\mathsf {UP}\) witness length and \(\delta >0\) is any constant.

Note that it seems unlikely that the [RRR16, RRR18] protocols preserve zero-knowledge. Indeed, these protocols fundamentally rely on the so-called unambiguity (see [RRR16]) of the underlying \(\mathsf {UP}\) protocol, which, at least intuitively, seems at odds with zero-knowledge.

Batch Verification with Computational Soundness. Focusing on protocols achieving only computational soundness, we remark that interactive batch verification can be obtained directly from Kilian’s [Kil92] highly efficient protocol for all of \(\mathsf {NP}\) (assuming collision resistant hash functions). A non-interactive batch verification protocol was given by Brakerski et al. [BHK17] assuming the hardness of learning with errors. Non-interactive batch verification protocols also follow from the existence of succinct non-interactive zero-knowledge arguments (zkSNARGs), which are known to exist under certain strong, and non-falsifiable, assumptions (see, e.g. [Ish], for a recent survey).

Randomized Iterates. The randomized iterate is a concept introduced by Goldreich, Krawczyk, and Luby [GKL93], and further developed in later works [HHR11, YGLW15], where it was used to construct pseudorandom generators from regular one-way functions. Given a function f, its randomized iterate is computed on an input x and descriptions of hash functions \(h_1, \dots , h_m\) by starting with \(x_0 = f(x)\) and iteratively computing \(x_{i} = f(h_i(x_{i-1}))\). The hardcore bits of these iterates were then used to obtain pseudorandomness. While the randomized iterate was used for a very different purpose, this process of alternating the evaluation of a given function with the injection of randomness (which is what the hash functions are for) is strongly reminiscent of our techniques. It would be very interesting if there is a deeper connection between our techniques and the usage of these iterates in relation to pseudorandom generators.

1.3 Technical Overview

Batch Verification for Permutations. As an initial toy example, we first consider batch verification for a specific problem in \(\mathsf {NISZK}\). Let \(\mathsf {PERM}\) be the promise problem defined as follows. The input to \(\mathsf {PERM}\) is a description of a Boolean circuit \(C : \left\{ 0,1 \right\} ^n \rightarrow \left\{ 0,1 \right\} ^n\). The \(\mathrm {YES}\) inputs consist of circuits that define permutations over \(\left\{ 0,1 \right\} ^n\) whereas the \(\mathrm {NO}\) inputs are circuits in which every element in the image has at least two preimages.Footnote 4 It is straightforward to see that \(\mathsf {PERM}\in \mathsf {NISZK}\).Footnote 5

Our goal is, given as input k circuits \(C_1,\dots ,C_k\), to distinguish (via a zero-knowledge proof) the case that all of the circuits are permutations from the case that one or more is 2-to-1. Such a protocol can be constructed as follows: the verifier chooses at random \(x_1 \in \left\{ 0,1 \right\} ^n\), computes \(x_{k+1} = C_k(C_{k-1}(\dots C_1(x_1) \dots ))\) and sends \(x_{k+1}\) to the prover. The prover now responds with the preimage \(x_1' = C^{-1}_1(C^{-1}_2( \dots C_k^{-1}(x_{k+1}) \dots ))\). The verifier checks that \(x_1=x_1'\) and if so it accepts, otherwise it rejects.Footnote 6

Completeness follows from the fact that the circuits define permutations and so \(x_1=x_1'\). For soundness, observe that for a \(\mathrm {NO}\) instance, \(x_{k+1}\) has at least two preimages under the composed circuit \(C_k \circ \cdots \circ C_1\). Therefore, a cheating prover can guess the correct preimage \(x_1\) with probability at most 1/2 (and the soundness error can be reduced by repetition). Lastly, observe that the protocol is (perfect) honest-verifier zero-knowledge: the simulator simply emulates the verifier while setting \(x_1'=x_1\).
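To make the message flow concrete, the following Python sketch (ours, purely illustrative) runs this protocol with each circuit modeled as a permutation lookup table on n-bit integers:

```python
# Batch verification for PERM: the verifier pushes a random x_1 forward
# through all circuits; the honest prover pulls x_{k+1} back. All names and
# the lookup-table model of circuits are illustrative.
import random

n = 8  # toy input length

def random_permutation():
    table = list(range(2 ** n))
    random.shuffle(table)
    inverse = [0] * len(table)
    for x, y in enumerate(table):
        inverse[y] = x
    return table, inverse

circuits = [random_permutation() for _ in range(5)]  # a YES instance: k = 5

x1 = random.randrange(2 ** n)           # verifier samples x_1 uniformly
x = x1
for table, _ in circuits:               # x_{k+1} = C_k(...C_1(x_1)...)
    x = table[x]

for _, inverse in reversed(circuits):   # prover: x_1' = C_1^{-1}(...C_k^{-1}(x_{k+1})...)
    x = inverse[x]

assert x == x1  # completeness: for permutations the prover always succeeds
```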

The Approximate Injectivity Problem. Unfortunately, as mentioned above, \(\mathsf {PERM}\) is presumably not \(\mathsf {NISZK}\)-complete and so we cannot directly use the above protocol to perform batch verification for arbitrary problems in \(\mathsf {NISZK}\). Instead, our approach is to identify a relaxation of \(\mathsf {PERM}\) that is both \(\mathsf {NISZK}\)-complete and amenable to batch verification, albeit via a significantly more complicated protocol.

More specifically, we consider the Approximate Injectivity (promise) problem. The goal here is to distinguish circuits that are almost injective, from ones that are highly non-injective. In more detail, let \(\delta \in [0,1]\) be a parameter. We define \(\mathsf {AI}_\delta \) to be a promise problem in which the input is again a description of a Boolean circuit C mapping n input bits to \(m \ge n\) output bits. YES instances are those circuits for which all but \(\delta \) fraction of the inputs x have no collisions (i.e., \(\Pr _x [|C^{-1}(C(x))|>1]<\delta \)). NO instances are circuits for which all but \(\delta \) fraction of the inputs have at least one colliding input (i.e., \(\Pr _x[|C^{-1}(C(x))|=1] < \delta \)).

Our protocol for batch verification of any problem \(\varPi \in \mathsf {NISZK}\) consists of two main steps:

  • First, we show that \(\mathsf {AI}_\delta \) is \(\mathsf {NISZK}\)-hard: i.e., there exists an efficient Karp reduction from \(\varPi \) to \(\mathsf {AI}_\delta \).

  • Our second main step is showing an efficient \(\mathsf {HVSZK}\) batch verification protocol for \(\mathsf {AI}_\delta \). In particular, the communication complexity of the protocol scales (roughly) additively with the number of instances k.

Equipped with the above, an \(\mathsf {HVSZK}\) protocol follows by having the prover and verifier reduce the instances \(x_1,\dots ,x_k\) for \(\varPi \) to instances \(C_1,\dots ,C_k\) for \(\mathsf {AI}_\delta \), and then engage in the batch verification protocol for \(\mathsf {AI}_\delta \) on common input \((C_1,\dots ,C_k)\).

Before describing these two steps in detail, we remark that we find the identification of \(\mathsf {AI}_\delta \) as being \(\mathsf {NISZK}\)-hard (in fact, \(\mathsf {NISZK}\)-complete) to be of independent interest. In particular, while \(\mathsf {AI}_\delta \) bears some resemblance to problems that were already known to be \(\mathsf {NISZK}\)-complete, the special almost-injective nature of the YES instances of \(\mathsf {AI}_\delta \) seems very useful. Indeed, this additional structure is crucial for our batch verification protocol.

\(\mathsf {AI}_\delta \) is \(\mathsf {NISZK}\)-hard. We show that \(\mathsf {AI}_\delta \) is \(\mathsf {NISZK}\)-hard by reducing to it from the Entropy Approximation problem (\(\mathsf {EA}\)), which is known to be complete for \(\mathsf {NISZK}\) [GSV99].Footnote 7 An instance of \(\mathsf {EA}\) is a circuit C along with a threshold \(k \in \mathbb {R}^+\), and the problem is to decide whether the Shannon entropy of the output distribution of C when given uniformly random inputs (denoted H(C)) is more than \(k+10\) or less than \(k-10\).Footnote 8

For simplicity, suppose we had a stronger promise on the output distribution of C — that it is a flat distribution (in other words, it is uniform over some subset of its range). In this case, for any output y of C, the promise of \(\mathsf {EA}\) tells us something about the number of pre-images of y. To illustrate, suppose C takes n bits of input. Then, in a YES instance of \(\mathsf {EA}\), the preimage set \(C^{-1}(y)\) has size at most \(2^{n-(k+10)}\), and in the NO case its size is at least \(2^{n-(k-10)}\). Recall that for a reduction to \(\mathsf {AI}_\delta \), we need to make the sizes of most inverse sets 1 for YES instances and more than 1 for NO instances. This can now be done by using a hash function to shatter the inverse sets of C.

That is, consider the circuit \(\widehat{C}\) that takes as input an x and also the description of a hash function h from, say, a pairwise-independent hash family H, and outputs \((C(x), h, h(x))\). If we pick H so that its hash functions have output length \((n-k)\), then out of any set of inputs of size \(2^{n-(k+10)}\), all but a small constant fraction will be mapped injectively by a random h from H. On the other hand, out of any set of inputs of size \(2^{n-(k-10)}\), only a small constant fraction will be mapped injectively by a random h. Thus, in the YES case, it may be argued that all but a small constant fraction of inputs \((x,h)\) are mapped injectively by \(\widehat{C}\), and in the NO case only a small constant fraction of inputs are. So for some constant \(\delta \), this is indeed a valid reduction from \(\mathsf {EA}\) to \(\mathsf {AI}_\delta \). For smaller \(\delta \), the reduction is performed by first amplifying the gap in the promise of \(\mathsf {EA}\) and then proceeding as above.
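The following toy experiment (ours, not part of the reduction's analysis) illustrates this counting phenomenon numerically. For simplicity it uses a truly random function in place of a pairwise-independent hash, and a smaller gap (\(g=4\)) than the \(\pm 10\) above so that the sets stay small:

```python
# Shattering preimage sets with a short hash: a set of size R / 2^g is mapped
# almost injectively into a range of size R, while a set of size R * 2^g is
# almost nowhere injective.
import random
from collections import Counter

def injective_fraction(set_size, hash_range):
    h = [random.randrange(hash_range) for _ in range(set_size)]  # random function
    counts = Counter(h)
    return sum(1 for v in h if counts[v] == 1) / set_size

R, g = 2 ** 14, 4
print(injective_fraction(R // 2 ** g, R))  # YES side: close to 1
print(injective_fraction(R * 2 ** g, R))   # NO side: close to 0
```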

Finally, we can relax the simplifying assumption of flatness using the asymptotic equipartition property of distributions. In this case, this property states that, however unstructured C may be, its t-fold repetition \(C^{\otimes t}\) (that takes an input tuple \((x_1,\dots ,x_t)\) and outputs \((C(x_1),\dots ,C(x_t))\)) is “approximately flat” for large enough t. That is, with increasingly high probability over the output distribution of \(C^{\otimes t}\), a sample from it will have a pre-image set of size close to its expectation, which is \(2^{t\cdot (n-H(C))}\). Such techniques have been previously used for similar purposes in the \(\mathsf {SZK}\) literature and elsewhere, for example as the flattening lemma of Goldreich et al. [GSV99] (see the full version for details and the proof).
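As a quick numerical illustration of flattening (our example, with a biased coin standing in for the output distribution of C), the per-sample surprisal of a t-fold repetition concentrates around the Shannon entropy as t grows:

```python
# Flattening, numerically: for a biased bit with entropy H, the per-sample
# surprisal -log2(Pr)/t of a t-fold repetition concentrates around H.
import math, random

q = 0.9  # Pr[bit = 1]; Shannon entropy H(0.9) ~ 0.469 bits
H = -(q * math.log2(q) + (1 - q) * math.log2(1 - q))
for t in (10, 100, 10000):
    sample = [random.random() < q for _ in range(t)]
    logp = sum(math.log2(q if b else 1 - q) for b in sample)
    print(t, round(-logp / t, 3), "vs H =", round(H, 3))
```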

Batch Verification for Exact Injectivity. For sake of simplicity, for this overview we focus on batch verification of the exact variant of \(\mathsf {AI}_\delta \), that is, when \(\delta =0\). In other words, distinguishing circuits that are truly injective from those in which every image y has at least two preimages (with no exceptions allowed). We refer to this promise problem as \(\mathsf {INJ}\). Modulo some technical details, the batch verification protocol for \(\mathsf {INJ}\), presented next, captures most of the difficulty in our batch verification protocol for \(\mathsf {AI}_\delta \).

Before proceeding we mention that the key difference between \(\mathsf {INJ}\) and \(\mathsf {PERM}\) is that YES instances of the former are merely injective whereas for the latter they are permutations. Interestingly, this seemingly minor difference causes significant complications.

Our new goal is, given as input circuits \(C_1,\dots ,C_k : \left\{ 0,1 \right\} ^n \rightarrow \left\{ 0,1 \right\} ^m\), with \(m \ge n\), to distinguish the case that all of the circuits are injective from the case that at least one is entirely non-injective.

Inspired by the batch verification protocol for \(\mathsf {PERM}\), a reasonable approach is to choose \(x_1\) at random but then try to hash the output \(y_i = C_i(x_i) \in \left\{ 0,1 \right\} ^m\), of each circuit \(C_i\), into an input \(x_{i+1} \in \left\{ 0,1 \right\} ^n\) for \(C_{i+1}\). If a hash function could be found that was injective on the image of \(C_i\) then we would be done. However, it seems that finding such a hash function is, in general, extremely difficult.

Rather, we will hash each \(y_i\) by choosing a random hash function from a small hash function family. More specifically, for every iteration \(i \in [k]\) we choose a random seed \(z_i\) for a (seeded) randomness extractor \(\text {\textsf {Ext}}: \left\{ 0,1 \right\} ^m \times \left\{ 0,1 \right\} ^d \rightarrow \left\{ 0,1 \right\} ^n\) and compute \(x_{i+1} = \text {\textsf {Ext}}(y_i,z_i)\). See Fig. 1 for a diagram describing this sampling process.

In case all the circuits are injective (i.e., a YES instance), a simple inductive argument can be used to show that each \(y_i\) is (close to) a distribution having min-entropy n, and therefore the output \(x_{i+1} = \text {\textsf {Ext}}(y_i,z_i)\) of the extractor is close to uniform in \(\left\{ 0,1 \right\} ^n\). Note that for this to be true, we need a very good extractor that essentially extracts all of the entropy. Luckily, constructions of such extractors with a merely poly-logarithmic seed length are known [GUV07].

This idea leads us to consider the following strawman protocol. The verifier chooses at random \(x_1 \in \left\{ 0,1 \right\} ^n\) and k seeds \(z_1,\dots ,z_{k}\). The verifier then computes inductively: \(y_i = C_i(x_i)\) and \(x_{i+1} = \text {\textsf {Ext}}(y_i,z_i)\), for every \(i \in [k]\). The verifier sends \((x_{k+1},z_1,\dots ,z_{k})\) to the prover, who in turn needs to guess the value of \(x_1\).

Fig. 1. The sampling process
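In code, the sampling process of Fig. 1 looks as follows (a sketch of ours, with a keyed hash standing in for the seeded extractor \(\text {\textsf {Ext}}\); a real instantiation would use an extractor such as that of [GUV07]):

```python
# The verifier's sampling process: x_1 <- {0,1}^n, then alternately apply a
# circuit and the (stand-in) extractor with a fresh short seed z_i.
import hashlib, random

n, d = 16, 8  # toy input and seed lengths, in bits

def ext(y, z):  # illustrative stand-in for Ext: {0,1}^m x {0,1}^d -> {0,1}^n
    digest = hashlib.sha256(f"{y}:{z}".encode()).digest()
    return int.from_bytes(digest, "big") % (2 ** n)

def sample_chain(circuits):
    x = random.randrange(2 ** n)      # x_1 is uniform in {0,1}^n
    seeds = []
    for C in circuits:
        z = random.randrange(2 ** d)  # fresh seed z_i for every step
        x = ext(C(x), z)              # y_i = C_i(x_i); x_{i+1} = Ext(y_i, z_i)
        seeds.append(z)
    return x, seeds                   # the verifier sends (x_{k+1}, z_1, ..., z_k)

x_last, seeds = sample_chain([lambda x: (3 * x + 1) % 2 ** n] * 4)  # toy injective circuits
print(x_last, seeds)
```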

The major difficulty that arises in this protocol is in completeness: the honest prover’s probability of predicting \(x_1\) is very small. To see this, suppose that all of the circuits \(C_1,\dots ,C_k\) are injective. Consider the job of the honest prover: given \((x_{k+1},z_1,\dots ,z_{k})\) the prover needs to find \(x_1\). The difficulty is that \(x_{k+1}\) is likely to have many preimages under \(\text {\textsf {Ext}}(\cdot ,z_k)\). While this statement depends on the specific structure of the extractor, note that even in a “dream scenario” in which \(\text {\textsf {Ext}}(\cdot ,z_k)\) were a random function, a constant fraction of \(x_{k+1}\)’s would have more than one preimage (in the image of \(C_k\)).

A similar type of collision in the extractor is likely to occur in most of the steps \(i \in [k]\). Therefore, the overall number of preimages \(x'_1\) that are consistent with \((x_{k+1},z_1,\dots ,z_k)\) is likely to be \(2^{\varOmega (k)}\) and the prover has no way to guess the correct one among them. The natural remedy for this is to give the prover some additional information, such as a hash of \(x_1\), in order to help pick it correctly among the various possible options. However, doing so also helps a cheating prover find \(x_1\) in the case where one of the circuits is non-injective. And it turns out that the distribution of the number of \(x_1\)’s in the two cases – where all the \(C_i\)’s are injective and where one is non-injective – are similar enough that it is not clear how to make this approach work as is.

Isolating Preimages via Interaction. We circumvent the issue discussed above by employing interaction. The key observation is that, even though the number of pre-images \(x_1\) of the composition of all k circuits is somewhat similar in the case of all YES instance and the case of one NO instance among them, if we look at this composition circuit-by-circuit, the number of pre-images in an injective circuit is clearly different from that in a non-injective circuit. In order to exploit this, we have the verifier gradually reveal the \(y_i\)’s rather than revealing them all at once.

Taking a step back, let us consider the following naive protocol:

  1. For \(i = k,\dots ,1\):

     (a) The verifier chooses at random \(x_i \in \left\{ 0,1 \right\} ^n\) and sends \(y_i=C_i(x_i)\) to the prover.

     (b) The (honest) prover responds with \(x'_i = C_i^{-1}(y_i)\).

     (c) The verifier immediately rejects if the prover answered incorrectly (i.e., \(x'_i \ne x_i\)).

It is not difficult to argue that this protocol is indeed an \(\mathsf {HVSZK}\) protocol (with soundness error 1/2, which can be reduced by repetition). Alas, the communication complexity is at least \(k \cdot n\), which is too large.

However, a key observation is that this protocol still works even if we generate the \(y_i\)’s as in the strawman protocol. Namely, \(x_{i+1} = \text {\textsf {Ext}}(y_i,z_i)\) and \(y_i = C_i(x_i)\), for every \(i \in [k]\), where the \(z_i\)’s are fresh uniformly distributed (short) seeds. This lets us significantly reduce the randomness complexity of the above naive protocol. Later we shall use this to also reduce the communication complexity, which is our main goal.

To see that the “derandomized” variant of the naive protocol works, we first observe that completeness and zero-knowledge indeed still hold. Indeed, since in the YES case all the circuits are injective, the honest prover always provides the correct answer, i.e., \(x'_i=x_i\). Thus, not only does the verifier always accept (which implies completeness), but it can also easily simulate the messages sent by the prover (which guarantees honest-verifier statistical zero-knowledge).Footnote 9

Arguing soundness is slightly more tricky. Let \(i^* \in [k]\) be the smallest integer so that \(C_{i^*}\) is a NO instance. Recall that in the \(i^*\)-th iteration of the protocol, the prover is given \(y_{i^*}\) and needs to predict \(x_{i^*}\). If we can argue that \(x_{i^*}\) is (close to) uniformly distributed then (constant) soundness follows, since \(C_{i^*}\) is a NO instance and therefore non-injective on every input.

We argue that \(x_{i^*}\) is close to uniform by induction. For the base case \(i=1\) this is obviously true since \(x_1\) is sampled uniformly at random. For the inductive step, assume that \(x_{i-1}\) is close to uniform, with \(i \le i^*\). Since \(C_{i-1}\) is injective (as \(i-1<i^*\)), this means that \(y_{i-1}=C_{i-1}(x_{i-1})\) is close to uniform in the image of \(C_{i-1}\), a set of size \(2^n\). Thus, \(\text {\textsf {Ext}}(y_{i-1},z_{i-1})\) is applied to a source that is close to a distribution with min-entropy n. Since \(\text {\textsf {Ext}}\) is an extractor, this means that \(x_i\) is close to uniform, concluding the induction.

Reducing Communication Complexity via Hashing. Although we have reduced the randomness complexity of the protocol, we have not reduced the communication complexity (which is still \(k \cdot n\)). We shall do so by, once more, appealing to hashing.

Let us first consider the verifier to prover communication. Using hashing, we show how the verifier can specify each \(y_i\) to the prover by transmitting only a poly-logarithmic number of bits. Consider, for example, the second iteration of the protocol. In this iteration the verifier is supposed to send \(y_{k-1}\) to the prover but can no longer afford to do so. Notice however that at this point the prover already knows \(x_k\). We show that with all but negligible probability, the number of candidate pairs \((y_{k-1},z_{k-1})\) that are consistent with \(x_{k}\) (and such that \(y_{k-1}\) is in the image of \(C_{k-1}\)) is very small. This fact (shown in Proposition 4.4 in Sect. 4) follows from the fact that \(\text {\textsf {Ext}}\) is an extractor with small seed length.Footnote 10 In more detail, we show that with all but negligible probability, the number of candidates is roughly (quasi-)polynomial. Thus, it suffices for the verifier to send a hash of poly-logarithmic length (e.g., using a pairwise independent hash function) to specify the correct pair \((y_{k-1},z_{k-1})\). This idea extends readily to all subsequent iterations.

Thus, we are only left with the task of reducing the communication from the prover to the verifier (which is currently \(n \cdot k\)). We yet again employ hashing. The observation is that rather than sending \(x_i\) in its entirety in each iteration, it suffices for the prover to send a short hash of \(x_i\). The reason is that, in the case of soundness, when we reach iteration \(i^*\), we know that \(y_{i^*}\) has two preimages: \(x_{i^*}^{(0)}\) and \(x_{i^*}^{(1)}\). The prover at this point has no idea which of the two is the correct one and so as long as the hashes of \(x_{i^*}^{(0)}\) and \(x_{i^*}^{(1)}\) differ, the prover will only succeed with probability 1/2. Thus, it suffices to use a pairwise independent hash function.

To summarize, after an initial setup phase in which the verifier specifies \(y_k\) and the different hash functions, the protocol simply consists of a “ping pong” of hash values between the verifier and the prover. In each iteration the verifier first reveals a hash of the pair \((y_i,z_i)\), which suffices for the prover to fully recover \(y_i\). In response, the prover sends a hash of \(x_i\), which suffices to prove that the prover knows the correct preimage of \(y_i\). For further details, a formal description of the protocol, and the proof, see Sect. 4.
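The following self-contained sketch (ours) plays out this ping-pong end-to-end on toy parameters. Everything in it is illustrative: the circuits are random injective lookup tables, a truncated keyed hash stands in for both the pairwise-independent hash families and the extractor, and the prover finds the pair \((y_{i-1},z_{i-1})\) by brute-force enumeration (the honest prover is computationally unbounded, so this is permissible):

```python
import hashlib, random

n, d, k = 10, 6, 3                     # toy input/seed lengths and batch size
key = random.randrange(2 ** 64)        # key for the toy hash (illustrative)

def H(*vals, bits=16):                 # stand-in for a pairwise-independent hash
    s = hashlib.sha256(f"{key}:{vals}".encode()).digest()
    return int.from_bytes(s, "big") % (2 ** bits)

def ext(y, z):                         # stand-in for the extractor Ext
    s = hashlib.sha256(f"ext:{y}:{z}".encode()).digest()
    return int.from_bytes(s, "big") % (2 ** n)

def random_injection():                # a toy injective "circuit" C_i
    outs = random.sample(range(2 ** (n + 2)), 2 ** n)
    fwd = {x: outs[x] for x in range(2 ** n)}
    return fwd, {v: x for x, v in fwd.items()}

circuits = [random_injection() for _ in range(k)]

# Verifier setup: sample the chain x_1, y_1 = C_1(x_1), x_2 = Ext(y_1, z_1), ...
xs = [random.randrange(2 ** n)]
ys, zs = [], []
for fwd, _ in circuits:
    ys.append(fwd[xs[-1]])
    zs.append(random.randrange(2 ** d))
    xs.append(ext(ys[-1], zs[-1]))

prover_y = ys[k - 1]                   # setup phase: verifier sends y_k in the clear
for i in range(k - 1, -1, -1):         # iterations i = k, ..., 1 (0-indexed)
    prover_x = circuits[i][1][prover_y]          # prover inverts C_i
    assert H(prover_x) == H(xs[i])               # prover's hash of x_i checks out
    if i == 0:
        break
    # Verifier reveals a short hash of (y_{i-1}, z_{i-1}); the prover, knowing
    # x_i = Ext(y_{i-1}, z_{i-1}), enumerates consistent pairs to recover y_{i-1}.
    target = H(ys[i - 1], zs[i - 1])
    image = circuits[i - 1][0].values()
    cands = [(y, z) for y in image for z in range(2 ** d)
             if ext(y, z) == prover_x and H(y, z) == target]
    prover_y = cands[0][0]             # w.h.p. the unique consistent candidate
print("verifier accepts")
```

The sketch runs on a YES instance and so illustrates completeness; in a NO instance, at iteration \(i^*\) the prover faces (at least) two preimages whose short hashes differ with high probability, so the check fails with probability roughly 1/2, matching the soundness intuition above.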

1.4 Discussion and Open Problems

Theorem 1.3 gives a non-trivial batch verification protocol for any problem in \(\mathsf {NISZK}\). However, we believe that it is only the first step in the study of batch verification of \(\mathsf {SZK}\). In particular, and in addition to the question of obtaining malicious verifier zero-knowledge that was already mentioned, we point out several natural research directions:

  1. As already pointed out, Theorem 1.3 only gives a batch verification protocol for problems in \(\mathsf {NISZK}\). Can one obtain a similar result for all of \(\mathsf {SZK}\)? As a special interesting case, consider the problem of batch verification for the graph non-isomorphism problem: deciding whether or not there exists a pair of isomorphic graphs among k such pairs. Theorem 1.3 yields an efficient batch verification protocol for this problem under the assumption that the graphs have no non-trivial automorphisms. Handling the general case remains open and seems like a good starting point for a potential generalization of Theorem 1.3 to all of \(\mathsf {SZK}\).

  2. Even though we started off with an \(\mathsf {NISZK}\) protocol for \(\varPi \), the protocol for \(\varPi ^{\otimes k}\) is highly interactive. As a matter of fact, the number of rounds is O(k). Is there an \(\mathsf {NISZK}\) batch verification protocol for any \(\varPi \in \mathsf {NISZK}\)?

  3. While the communication complexity in the protocol for \(\varPi ^{\otimes k}\) only depends (roughly) additively on k, this additive dependence is still linear. Is a similar result possible with a sub-linear dependence on k?Footnote 11 For example, with \(\mathsf {poly}(n,\log {k})\) communication?

  4. A different line of questioning follows from looking at prover efficiency. While in general one cannot expect provers in interactive proofs to be efficient, it is known that any problem in \(\mathsf {SZK}\cap \mathsf {NP}\) has an \(\mathsf {SZK}\) protocol where the honest prover runs in polynomial-time given the \(\mathsf {NP}\) witness for the statement being proven [NV06]. Our transformations, however, make the prover quite inefficient. This raises the interesting question of whether there are batch verification protocols for languages in \(\mathsf {SZK}\cap \mathsf {NP}\) (or even \(\mathsf {NISZK}\cap \mathsf {NP}\)) that are zero-knowledge and also preserve the prover efficiency. This could have interesting applications in, say, cryptographic protocols where the honest prover is the party that generated the instance in the first place and so has a witness for it (e.g., in a signature scheme where the signer wishes to prove the validity of several signatures jointly).

While the above list already raises many concrete directions for future work, one fascinating high-level research agenda that our work motivates is a fine-grained study of \(\mathsf {SZK}\): in particular, optimizing, and improving our understanding of, the concrete polynomial overheads in the structural study of \(\mathsf {SZK}\).

Remark 1.4

(Using circuits beyond random sampling). To the best of our knowledge, all prior works studying complete problems for \(\mathsf {SZK}\) and \(\mathsf {NISZK}\) only make a very restricted usage of the given input circuits. Specifically, all that is needed is the ability to generate random samples of the form \((r, C(r))\), where r is a uniformly distributed random string and C is the given circuit (describing a probability distribution).

In contrast, our protocol leverages the ability to feed a (hash of an) output of one circuit as an input to the next circuit. This type of adaptive usage escapes the “random sampling paradigm” described above. In particular, our technique goes beyond the (restrictive) black box model of Holenstein and Renner [HR05], who showed limitations for statistical distance polarization within this model (see also [BDRV19]).

1.5 Organization

We start with preliminaries in Sect. 2. The batch verification result for \(\mathsf {NISZK}\) is formally stated in Sect. 3 and proved therein, based on results that are proved in the subsequent sections. In Sect. 4 we show a batch verification protocol for \(\mathsf {AI}_\delta \). Due to lack of space, we defer the proof that \(\mathsf {AI}_\delta \) is \(\mathsf {NISZK}\)-complete to the full version.

2 Preliminaries

2.1 Probability Theory Notation and Background

Given a random variable X, we write \(x\leftarrow X\) to indicate that x is sampled according to X. Similarly, given a finite set S, we let \(s\leftarrow S\) denote that s is selected according to the uniform distribution on S. We adopt the convention that when the same random variable occurs several times in an expression, all occurrences refer to a single sample. For example, \(\Pr [f(X)=X]\) is defined to be the probability that when \(x\leftarrow X\), we have \(f(x)=x\). We write \(U_n\) to denote the random variable distributed uniformly over \(\left\{ 0,1 \right\} ^n\). The support of a distribution D over a finite set U, denoted \(Supp(D)\), is defined as \(\left\{ u\in U: D(u)>0 \right\} \).

The statistical distance of two distributions P and Q over a finite set U, denoted as \(\Delta (P,Q)\), is defined as \(\max _{S\subseteq U} (P(S)-Q(S)) = \frac{1}{2} \sum _{u\in U}\left| P(u)-Q(u)\right| \).
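For concreteness, this definition transcribes directly into code (a small illustration of ours, for distributions represented as dictionaries from outcomes to probabilities):

```python
# Statistical distance, transcribed from the definition above, for
# distributions given as {outcome: probability} dictionaries.
def statistical_distance(P, Q):
    support = set(P) | set(Q)
    return 0.5 * sum(abs(P.get(u, 0.0) - Q.get(u, 0.0)) for u in support)

print(statistical_distance({0: 0.5, 1: 0.5}, {0: 0.6, 1: 0.4}))  # 0.1
```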

We recall some standard basic facts about statistical distance.

Fact 2.1

(Data processing inequality for statistical distance). For any two distributions X and Y, and every (possibly randomized) process f:

$$\begin{aligned} \Delta \left( f(X), f(Y) \right) \le \Delta \left( X, Y \right) \end{aligned}$$

Fact 2.2

For any two distributions X and Y, and event E:

$$\begin{aligned} \Delta \left( X, Y \right) \le \Delta \left( X|_E, Y \right) + \Pr _{X}[\lnot E], \end{aligned}$$

where \(X|_E\) denotes the distribution of X conditioned on E.

Proof

Let \(p_u = \Pr [X=u]\) and \(q_u = \Pr [Y=u]\). Also, let \(p_{u|E} = \Pr _X[X=u|E]\) and \(p_{u|\lnot E} = \Pr _X[X=u|\lnot E]\).

$$\begin{aligned} \Delta \left( X, Y \right) =&\frac{1}{2} \sum _{u}\big | p_u - q_u \big |\\ =&\frac{1}{2} \sum _{u} \Big | \Pr _{X}[E] \cdot p_{u|E} +\Pr _{X}[\lnot E] \cdot p_{u|\lnot E} - \Pr _{X}[E] \cdot q_u - \Pr _{X}[\lnot E] \cdot q_u \Big |\\ \le&\frac{1}{2} \sum _{u} \left( \Big | \Pr _{X}[E] \cdot p_{u|E} -\Pr _{X}[E] \cdot q_u \Big | + \Big | \Pr _{X}[\lnot E] \cdot p_{u|\lnot E} - \Pr _{X}[\lnot E] \cdot q_u \Big | \right) \\ =&\Pr _{X}[E] \cdot \Delta \left( X|_E, Y \right) + \Pr _{X}[\lnot E]\cdot \Delta \left( X|_{\lnot E}, Y \right) \\ \le&\Delta \left( X|_E, Y \right) + \Pr _{X}[\lnot E]. \end{aligned}$$

We also recall Chebyshev’s inequality.

Lemma 2.3

(Chebyshev’s inequality). Let X be a random variable. Then, for every \( \alpha >0\):

$$\begin{aligned} \Pr \big [ |X-E\left[ X \right] |\ge \alpha \big ] \le \frac{Var\left[ X \right] }{\alpha ^2}. \end{aligned}$$

2.2 Zero-Knowledge Proofs

We use \((\text {\textsf {P}},\text {\textsf {V}})(x)\) to refer to the transcript of an execution of an interactive protocol with prover \(\text {\textsf {P}}\) and verifier \(\text {\textsf {V}}\) on common input x. The transcript includes the input x, all messages sent by \(\text {\textsf {P}}\) to \(\text {\textsf {V}}\) in the protocol and the verifier’s random coin tosses. We say that the transcript \(\tau = (\text {\textsf {P}},\text {\textsf {V}})(x)\) is accepting if at the end of the corresponding interaction, the verifier accepts.

Definition 2.4

(\(\mathsf {HVSZK}\)). Let \(c=c(n) \in [0,1]\), \(s=s(n) \in [0,1]\) and \(z=z(n) \in [0,1]\). An \(\mathsf {Honest~Verifier}\) \(\mathsf {SZK}\) \(\mathsf {Proof}\)-\(\mathsf {System}\) \(\mathsf {(}\) \(\mathsf {HVSZK}\) \(\mathsf {)}\) \(\mathsf {with~completeness~error}\) c, \(\mathsf {soundness~error}\) s \(\mathsf {and~zero}\)-\(\mathsf {knowledge~error}\) z for a promise problem \(\varPi =(\varPi _\mathrm {YES},\varPi _\mathrm {NO})\), consists of a probabilistic polynomial-time verifier \(\text {\textsf {V}}\) and a computationally unbounded prover \(\text {\textsf {P}}\) such that following properties hold:

  • Completeness: For any \(x \in \varPi _\mathrm {YES}\):

    $$\begin{aligned} \Pr \left[ (\text {\textsf {P}},\text {\textsf {V}})(x)\text { is accepting} \right] \ge 1 - c(|x|). \end{aligned}$$
  • Soundness: For any (computationally unbounded) cheating prover \(\text {\textsf {P}}^*\) and any \(x \in \varPi _\mathrm {NO}\):

    $$\begin{aligned} \Pr \left[ (\text {\textsf {P}}^*,\text {\textsf {V}})(x)\text { is accepting} \right] \le s(|x|). \end{aligned}$$
  • Honest Verifier Statistical Zero Knowledge: There is a probabilistic polynomial-time algorithm \(\text {\textsf {Sim}}\) (called the simulator) such that for any \(x \in \varPi _{\mathrm {YES}}\):

    $$\begin{aligned} \Delta \left( (\text {\textsf {P}},\text {\textsf {V}})(x), \text {\textsf {Sim}}(x) \right) \le z(|x|). \end{aligned}$$

If the completeness, soundness and zero-knowledge errors are all negligible, we simply say that \(\varPi \) has an \(\mathsf {HVSZK}\) protocol. We also use \(\mathsf {HVSZK}\) to denote the class of promise problems having such an \(\mathsf {HVSZK}\) protocol.

We also define non-interactive zero knowledge proofs as follows.

Definition 2.5

(\(\mathsf {NISZK}\)). Let \(c=c(n) \in [0,1]\), \(s=s(n) \in [0,1]\) and \(z=z(n) \in [0,1]\). An non-interactive statistical zero-knowledge proof \(\mathsf {(}\) \(\mathsf {NISZK}\) \(\mathsf {)}\) \(\mathsf {with~completeness~error}\) c, \(\mathsf {soundness~error}\) s \(\mathsf {and zero}\)-\(\mathsf {knowledge~error}\) z for a promise problem \(\varPi =(\varPi _\mathrm {YES},\varPi _\mathrm {NO})\), consists of a probabilistic polynomial-time verifier \(\text {\textsf {V}}\), a computationally unbounded prover \(\text {\textsf {P}}\) and a polynomial \(\ell =\ell (n)\) such that following properties hold:

  • Completeness: For any \(x \in \varPi _\mathrm {YES}\):

    $$\begin{aligned} \Pr _{r \in \left\{ 0,1 \right\} ^{\ell (|x|)}} \left[ \text {\textsf {V}}(x,r,\pi ) \text { accepts} \right] \ge 1 - c(|x|), \end{aligned}$$

    where \(\pi = \text {\textsf {P}}(x,r)\).

  • Soundness: For any \(x \in \varPi _\mathrm {NO}\):

    $$\begin{aligned} \Pr _{r \in \left\{ 0,1 \right\} ^{\ell (|x|)}} \left[ \exists \pi ^* \text { s.t. } \text {\textsf {V}}(x,r,\pi ^*) \text { accepts} \right] \le s(|x|). \end{aligned}$$
  • Honest Verifier Statistical Zero Knowledge: There is a probabilistic polynomial-time algorithm \(\text {\textsf {Sim}}\) (called the simulator) such that for any \(x \in \varPi _\mathrm {YES}\):

    $$\begin{aligned} \Delta \left( (U_\ell ,\text {\textsf {P}}(x,U_\ell )), \text {\textsf {Sim}}(x) \right) \le z(|x|). \end{aligned}$$

As above, if the errors are negligible, we say that \(\varPi \) has a \(\mathsf {NISZK}\) protocol and use \(\mathsf {NISZK}\) to denote the class of all such promise problems.

2.3 Many-Wise Independent Hashing

Hash functions offering bounded independence are used extensively in the literature. We use a popular variant in which the output of the hash function is almost uniformly distributed on the different points. This relaxation allows us to save on the representation length of functions in the family.

Definition 2.6

(\(\delta \)-almost \(\ell \)-wise Independent Hash Functions). For \(\ell = \ell (n) \in \mathbb {N}\), \(m=m(n) \in \mathbb {N}\) and \(\delta =\delta (n) > 0\), a family of functions \(\mathcal {F} = (\mathcal {F}_n)_n\), where \(\mathcal {F}_n = \big \{ f:\left\{ 0,1 \right\} ^m \rightarrow \left\{ 0,1 \right\} ^n \big \}\) is \(\delta \)-almost \(\ell \)-wise independent if for every \(n \in \mathbb {N}\) and distinct \(x_1,x_2,\ldots ,x_\ell \in \left\{ 0,1 \right\} ^m\) the distributions:

  • \((f(x_1),\dots ,f(x_\ell ))\), where \(f \leftarrow \mathcal {F}_n\); and

  • The uniform distribution over \((\left\{ 0,1 \right\} ^n)^\ell \),

are \(\delta \)-close in statistical distance.

When \(\delta =0\) we simply say that the hash function family is \(\ell \)-wise independent. Constructions of (efficiently computable) many-wise hash function families with a very succinct representation are well known. In particular, when \(\delta =0\) we have the following well-known construction:

Lemma 2.7

(See, e.g., [Vad12, Section 3.5.5]). For every \(\ell = \ell (n) \in \mathbb {N}\) and \(m=m(n) \in \mathbb {N}\) there exists a family of \(\ell \)-wise independent hash functions \(\mathcal {F}^{(\ell )}_{n,m}=\left\{ f:\{0,1\}^m \rightarrow \{0,1\}^n \right\} \) where a random function from \(\mathcal {F}^{(\ell )}_{n,m}\) can be selected using \(O \big (\ell \cdot \max (n,m) \big )\) bits, and given a description of \(f\in \mathcal {F}^{(\ell )}_{n,m}\) and \(x\in \{0,1\}^m\), the value f(x) can be computed in time \(\mathsf {poly}(n,m,\ell )\).
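For intuition, a standard way to realize such a family (our illustrative sketch, over a prime field \(\mathbb {Z}_p\) rather than over bit-strings) is to sample a uniformly random polynomial of degree \(\ell -1\); for any \(\ell \) distinct inputs, the Vandermonde matrix is invertible, so the \(\ell \) evaluations are independent and uniform:

```python
# An l-wise independent hash family over Z_p: a random polynomial of degree
# l - 1. Any l distinct points receive independent uniform values because
# the coefficients <-> evaluations map is a bijection (Vandermonde).
import random

def sample_hash(l, p):
    coeffs = [random.randrange(p) for _ in range(l)]

    def h(x):
        acc = 0
        for c in reversed(coeffs):  # Horner evaluation mod p
            acc = (acc * x + c) % p
        return acc

    return h

h = sample_hash(l=4, p=8191)  # a 4-wise independent hash mod the prime 8191
print([h(x) for x in (1, 2, 3)])
```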

For \(\delta >0\), the seminal work of Naor and Naor [NN93] yields a highly succinct construction.

Lemma 2.8

([NN93, Lemma 4.2]). For every \(\ell = \ell (n) \in \mathbb {N}\), \(m=m(n) \in \mathbb {N}\) and \(\delta =\delta (n)>0\), there exists a family of \(\delta \)-almost \(\ell \)-wise independent hash functions \(\mathcal {F}^{(\ell )}_{n,m}=\left\{ f:\{0,1\}^m \rightarrow \{0,1\}^n \right\} \) where a random function from \(\mathcal {F}^{(\ell )}_{n,m}\) can be selected using \(O \big (\ell \cdot n + \log (m) + \log (1/\delta ) \big )\) bits, and given a description of \(f\in \mathcal {F}^{(\ell )}_{n,m}\) and \(x\in \{0,1\}^m\), the value f(x) can be computed in time \(\mathsf {poly}(n,m,\ell ,\log (1/\delta ))\).

2.4 Seeded Extractors

The min-entropy of a distribution X over a set \(\mathcal {X}\) is defined as \(H_\infty (X) = \min _{x \in \mathcal {X}} \log (1/\Pr [X=x])\). In particular, if \(H_\infty (X) = k\) then, \(\Pr [X=x] \le 2^{-k}\), for every \(x \in \mathcal {X}\).

Definition 2.9

([NZ96]). Let \(k = k(n) \in \mathbb {N}\), \(m=m(n) \in \mathbb {N}\), \(d=d(n)\) and \(\epsilon =\epsilon (n) \in [0,1]\). We say that the family of functions \(\text {\textsf {Ext}}= (\text {\textsf {Ext}}_n)_{n \in \mathbb {N}}\), where \(\text {\textsf {Ext}}_n: \left\{ 0,1 \right\} ^n\times \left\{ 0,1 \right\} ^d\rightarrow \left\{ 0,1 \right\} ^m\), is a \((k,\epsilon )\)-extractor if for every \(n \in \mathbb {N}\) and distribution X supported on \(\left\{ 0,1 \right\} ^n\) with \(H_\infty (X) \ge k\), it holds that \(\Delta \big ( \text {\textsf {Ext}}(X,U_d),U_m \big ) \le \epsilon \), where \(U_d\) (resp., \(U_m\)) denotes the uniform distribution on d (resp., m) bit strings.

Lemma 2.10

([GUV07, Theorem 4.21]). Let \(k = k(n) \in \mathbb {N}\), \(m=m(n) \in \mathbb {N}\) and \(\epsilon =\epsilon (n) \in [0,1]\) such that \(k \le n\), \(m \le k+d-2\log (1/\epsilon )-O(1)\), \(d=\log (n)+O(\log (k) \cdot \log (k/\epsilon ))\) and the functions k, m and \(\epsilon \) are computable in \(\mathsf {poly}(n)\) time. Then, there exists a polynomial-time computable \((k,\epsilon )\)-extractor \(\text {\textsf {Ext}}=(\text {\textsf {Ext}}_n)_{n \in \mathbb {N}}\) such that \(\text {\textsf {Ext}}_n:\left\{ 0,1 \right\} ^n\times \left\{ 0,1 \right\} ^d\rightarrow \left\{ 0,1 \right\} ^m\).
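The [GUV07] extractor itself is an involved construction that we do not reproduce here. As a conceptual stand-in (our sketch, not the [GUV07] construction), the leftover hash lemma already yields a \((k,\epsilon )\)-extractor from a pairwise-independent hash whose description serves as the seed, with output length roughly \(k-2\log (1/\epsilon )\); the price is a long seed, which is exactly what [GUV07] shrinks to polylogarithmic:

```python
# A leftover-hash-lemma extractor: the seed is the description (a, b) of a
# pairwise-independent hash over Z_p, and extraction is hashing followed by
# truncation to m output bits (adding only a tiny bias, since 2^m << p).
import random

p = 2 ** 61 - 1  # a Mersenne prime; sources are encoded as elements of Z_p
m = 10           # output length; the LHL allows m up to k - 2*log2(1/eps)

def sample_seed():
    return (random.randrange(p), random.randrange(p))

def ext(x, seed):
    a, b = seed
    return ((a * x + b) % p) % 2 ** m

print(ext(123456789, sample_seed()))
```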

3 Batch Verification for \(\mathsf {NISZK}\)

In this section we formally state and prove our main result.

Theorem 3.1

Let \(\varPi \in \mathsf {NISZK}\) and \(k = k(n) \in \mathbb {N}\) such that \(k(n) = 2^{n^{o(1)}}\). Then, \(\varPi ^{\otimes k}\) has an O(k)-round \(\mathsf {HVSZK}\) protocol with communication complexity \(k \cdot \mathsf {poly}(\log {n},\log {k}) + \mathsf {poly}(n)\) and verifier running time \(k \cdot \mathsf {poly}(n)\).

The proof of Theorem 3.1 is divided into two main steps:

  1. As our first main step, we introduce a new \(\mathsf {NISZK}\)-hard problem, called approximate injectivity. The problem is defined formally in Definition 3.2 below and its \(\mathsf {NISZK}\)-hardness is established by Lemma 3.3. Due to space restrictions, the proof of Lemma 3.3 is deferred to the full version.

  2. The second step is constructing a batch verification protocol for approximate injectivity, as given in Theorem 3.4. The proof of Theorem 3.4 appears in Sect. 4.

We proceed to define the approximate injectivity problem and state its \(\mathsf {NISZK}\)-hardness.

Definition 3.2

Let \(\delta : \mathbb {N} \rightarrow [0,1]\) be a function computable in \(\mathsf {poly}(n)\) time. The Approximate Injectivity problem with approximation \(\delta \), denoted by \(\mathsf {AI}_\delta \), is a promise problem \((\mathrm {YES},\mathrm {NO})\), where \(\mathrm {YES}= (\mathrm {YES}_n)_{n\in \mathbb {N}}\) and \(\mathrm {NO}= (\mathrm {NO}_n)_{n\in \mathbb {N}}\) are sets defined as follows:

$$\begin{aligned} \mathrm {YES}_n&= \left\{ (1^n,C) \;:\; \Pr _{x \leftarrow \left\{ 0,1 \right\} ^n }\left[ \left| C^{-1}(C(x))\right|> 1 \right] < \delta (n) \right\} \\ \mathrm {NO}_n&= \left\{ (1^n,C) \;:\; \Pr _{x \leftarrow \left\{ 0,1 \right\} ^n }\left[ \left| C^{-1}(C(x))\right|> 1 \right] > 1 - \delta (n) \right\} \end{aligned}$$

where, in both cases, C is a circuit that takes n bits as input. The size of an instance \((1^n,C)\) is n.
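For intuition, the promise can be checked by brute force on toy instances; the following sketch (ours, with a Python function standing in for the circuit C, and feasible only for small n) classifies an input according to \(\mathsf {AI}_\delta \):

```python
# Brute-force classification according to AI_delta, for a toy circuit C
# given as a Python function on n-bit inputs.
from collections import Counter

def classify(C, n, delta):
    counts = Counter(C(x) for x in range(2 ** n))
    colliding = sum(1 for x in range(2 ** n) if counts[C(x)] > 1) / 2 ** n
    if colliding < delta:
        return "YES"
    if colliding > 1 - delta:
        return "NO"
    return "outside the promise"

print(classify(lambda x: x, n=10, delta=0.01))       # injective -> YES
print(classify(lambda x: x // 2, n=10, delta=0.01))  # 2-to-1 -> NO
```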

Lemma 3.3

Let \(\delta = \delta (n) \in [0,1]\) be a non-increasing function such that \(\delta (n) > 2^{-o(n^{1/4})}\). Then, \(\mathsf {AI}_\delta \) is \(\mathsf {NISZK}\)-hard.

As mentioned above, the proof of Lemma 3.3 appears in the full version. Our main technical result is a batch verification protocol for \(\mathsf {AI}_\delta \).

Theorem 3.4

For any \(k = k(n) \in \mathbb {N}\), \(\delta = \delta (n) \in [0,\frac{1}{100k^2}]\) and security parameter \(\lambda =\lambda (n)\), the problem \(\mathsf {AI}_\delta ^{\otimes k}\) has an \(\mathsf {HVSZK}\) protocol with communication complexity \(O(n) + k\cdot \mathsf {poly}(\lambda ,\log N, \log k)\), where N is an upper bound on the size of each of the given circuits (on n input bits). The completeness and zero-knowledge errors are \(O( k^2 \cdot \delta +2^{-\lambda })\) and the soundness error is a constant bounded away from 1.

The verifier running time is \(k\cdot \mathsf {poly}(N,\log k, \lambda )\) and the number of rounds is O(k).

The proof of Theorem 3.4 appears in Sect. 4. With Lemma 3.3 and Theorem 3.4 in hand, the proof of Theorem 3.1 is now routine.

Proof

(Proof of Theorem 3.1). Let \(\varPi \in \mathsf {NISZK}\). We construct an \(\mathsf {HVSZK}\) protocol for \(\varPi ^{\otimes k}\) as follows. Given as common input \((x_1,\dots ,x_k)\), the prover and verifier first apply the Karp reduction of Lemma 3.3 to each instance to obtain circuits \((C_1,\dots ,C_k)\), with respect to \(\delta = \frac{1}{2^{\mathsf {poly}(\log {n},\log {k})}}\). The size of each circuit, as well as the number of input bits, is \(\mathsf {poly}(n)\).

The parties then emulate a \(\mathsf {poly}(\log {n},\log {k})\) parallel repetition of the \(\mathsf {SZK}\) protocol of Theorem 3.4 on input \((C_1,\dots ,C_k)\) and security parameter \(\lambda = \mathsf {poly}(\log {n},\log {k})\). Completeness, soundness and honest-verifier zero-knowledge follow directly from Lemma 3.3 and Theorem 3.4, with error \(O(k^2 \cdot \delta + 2^{-\lambda }) = negl(n,k)\), where we also use the fact that parallel repetition of interactive proofs reduces the soundness error at an exponential rate, and that parallel repetition preserves honest verifier zero-knowledge.

To analyze the communication complexity and verifier running time, observe that the instances \(C_i\) that the reduction of Lemma 3.3 generates have size \(\mathsf {poly}(n)\). The batch verification protocol of Theorem 3.4 therefore has communication complexity \(\mathsf {poly}(n) + k \cdot \mathsf {poly}(\log {n},\log {k})\) and verifier running time \(k \cdot \mathsf {poly}(n)\).

Fig. 2. A batch \(\mathsf {SZK}\) protocol for \(\mathsf {AI}\)

4 Batch Verification for \(\mathsf {AI}\)

In this section we prove Theorem 3.4 by constructing an \(\mathsf {HVSZK}\) protocol for batch verification of the approximate injectivity problem \(\mathsf {AI}_\delta \) (see Definition 3.2 for the definition of \(\mathsf {AI}_\delta \)). For convenience, we restate Theorem 3.4 next.

Theorem 3.4. For any \(k = k(n) \in \mathbb {N}\), \(\delta =\delta (n)\in [0,\frac{1}{100k^{2}}]\) and security parameter \(\lambda = \lambda (n)\), the problem \(\mathsf {AI}_{\delta }^{\otimes k}\) has an \(\mathsf {HVSZK}\) protocol with communication complexity \(O(n) + k\cdot \mathsf {poly}(\lambda ,\log N, \log k)\), where N is an upper bound on the size of each of the given circuits (on n input bits). The completeness and zero-knowledge errors are \(O(k^{2}\cdot \delta + 2^{-\lambda })\) and the soundness error is a constant bounded away from 1.

The verifier running time is \(k\cdot \mathsf {poly}(N,\log k,\lambda )\) and the number of rounds is O(k).

Let k, \(\delta \) and \(\lambda \) be as in the statement of Theorem 3.4. In order to prove the theorem we need to present an \(\mathsf {HVSZK}\) protocol for \(\mathsf {AI}_\delta ^{\otimes k}\) with the specified parameters. The protocol is presented in Fig. 2. The rest of this section is devoted to proving that the protocol indeed satisfies the requirements of Theorem 3.4.

Section Organization. First, in Sect. 4.1 we prove several lemmas that will be useful throughout the analysis of the protocol. Using these lemmas, in Sects. 4.2 to 4.4 we establish, respectively, the completeness, honest-verifier statistical zero-knowledge and soundness properties of the protocol. Lastly, in Sect. 4.5 we analyze the communication complexity and verifier runtime.

4.1 Useful Lemmas

Let \(C_1,\dots ,C_k : \left\{ 0,1 \right\} ^n \rightarrow \left\{ 0,1 \right\} ^m\) be the given input circuits (these can correspond to either a YES or NO instance of \(\mathsf {AI}_\delta \)). Throughout the proof we use \(i^* \in [k+1]\) to denote the index of the first NO instance circuit, if such a circuit exists, and \(i^*=k+1\) otherwise. That is, \(i^*=\min \big ( \{ k+1 \}\cup \{i\in [k]: C_i\text { is a NO instance}\} \big )\).

For every \(i \in [k]\) we introduce the following notations:

  • We denote by \(X_i\) the distribution over the string \(x_i \in \left\{ 0,1 \right\} ^n\) as sampled in the verifier’s setup phase. That is, \(X_1 = U_n\) and for every \(i \in [k]\), it holds that \(Y_i = C_i(X_i)\) and \(X_{i+1}=\text {\textsf {Ext}}(Y_i,Z_i)\), where each \(Z_i\) is an i.i.d. copy of \(U_d\).

  • We denote the subset of strings in \(\left\{ 0,1 \right\} ^m\) having a unique preimage under \(C_i\) by \(S_i\) (i.e., \(S_i=\big \{y_i:\left| C^{-1}_{i}(y_i)\right| =1\big \}\)). Abusing notation, we also use \(S_i\) to refer to the uniform distribution over the corresponding set.

For a function f, we define \(\nu _f\) as \(\nu _f(x) = \left| \left\{ x': f(x') = f(x) \right\} \right| \). We say that \(x\in \left\{ 0,1 \right\} ^n\) has siblings under f, if \(\nu _f(x)>1\). When f is clear from the context, we omit it from the notation.
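In code, for a function on a small domain, \(\nu _f\) and the sibling predicate amount to the following (a tiny illustration of ours):

```python
# The sibling count nu_f for a function on a small domain.
from collections import Counter

def nu(f, domain):
    counts = Counter(f(x) for x in domain)
    return {x: counts[f(x)] for x in domain}

siblings = nu(lambda x: x % 4, range(8))
print([x for x in range(8) if siblings[x] > 1])  # here every x has a sibling
```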

Lemma 4.1

For every \(i\le i^*\) it holds that \(\Delta \left( X_{i}, U_n \right) \le \frac{1}{k\cdot 2^{\lambda }}+k\cdot \delta \).

Proof

We show by induction on i that \(\Delta \left( X_{i}, U_n \right) \le (i-1) \cdot \left( \frac{1}{k^2\cdot 2^{\lambda }} + \delta \right) \). The lemma follows by the fact that \(i \le k\).

For the base case (i.e., \(i=1\)), since \(X_{1}\) is uniform in \(\left\{ 0,1 \right\} ^n\) we have that \(\Delta \left( X_{1}, U_n \right) =0\). Let \(1 < i \le i^*\) and suppose that the claim holds for \(i-1\). Note that \(i-1 < i^*\) and so \(C_{i-1}\) is a YES instance circuit.

Claim 4.1.1

\(\Delta \big ( \text {\textsf {Ext}}(S_{i-1},U_d), U_n \big ) \le \frac{1}{k^2 \cdot 2^\lambda }\).

Proof

By definition of \(\mathsf {AI}_\delta \), the set \(S_{i-1}\) has cardinality at least \((1-\delta ) \cdot 2^n\). Since \(\delta < 1/2\), this means that the min-entropy of (the uniform distribution over) \(S_{i-1}\) is at least \(n-1\). The claim follows by the fact that \(\text {\textsf {Ext}}\) is an extractor for min-entropy \(n-1\) with error \(\epsilon = \frac{1}{k^2 \cdot 2^\lambda }\).

We denote by \(W_{i}\) the distribution obtained by selecting \((x_{i-1},z_{i-1})\) uniformly in \(\left\{ 0,1 \right\} ^n\times \left\{ 0,1 \right\} ^d\) and outputting \(\text {\textsf {Ext}}\big (C_{i-1}(x_{i-1}),z_{i-1}\big )\).

Claim 4.1.2

\(\Delta \left( W_i, U_n \right) \le \frac{1}{k^2\cdot 2^{\lambda }}+\delta \).

Proof

Consider the event that \(X_{i-1}\) has a sibling (under \(C_{i-1}\)). Since \(C_{i-1}\) is a YES instance, this event happens with probability at most \(\delta \). On the other hand, the distribution of \(C_{i-1}(X_{i-1})\), conditioned on \(X_{i-1}\) not having a sibling, is simply uniform in \(S_{i-1}\). The claim now follows by Claim 4.1.1 and Fact 2.2.

We are now ready to bound \(\Delta \left( X_i, U_n \right) \), as follows:

$$\begin{aligned} \Delta \left( X_{i}, U_n \right)&\le \Delta \left( X_{i}, W_i \right) +\Delta \left( W_{i}, U_n \right) \\&= \Delta \Big ( \text {\textsf {Ext}}(C_{i-1}(X_{i-1}),U_d), \text {\textsf {Ext}}(C_{i-1}(U_n),U_d) \Big )+\Delta \left( W_{i}, U_n \right) \\&\le \Delta \left( X_{i-1}, U_n \right) +\frac{1}{k^2\cdot 2^{\lambda }}+\delta \\&\le (i-1)\cdot \left( \frac{1}{k^2\cdot 2^{\lambda }}+\delta \right) , \end{aligned}$$

where the first inequality is by the triangle inequality, the second inequality is by Fact 2.1 and Claim 4.1.2 and the third inequality is by the inductive hypothesis.

Definition 4.2

We say that the tuple \((x_{1}, h, z_{1},\dots ,z_{k})\) is good if the following holds, where we recursively define \(y_i = C_i(x_i)\) and \(x_{i+1} = \text {\textsf {Ext}}(y_{i},z_i)\), for every \(i<i^*\):

  1. 1.

    For every \(i<i^*\), there does not exist \(x_i'\ne x_i\) s.t. \(C_{i}(x_{i})=C_{i}(x_{i}')\) (i.e., \(x_i\) has no siblings).

  2. 2.

    For every \(i<i^*\), there does not exist \((y_i',z_i')\ne (y_i,z_i)\) such that \(y_i'\in S_i\), \(\text {\textsf {Ext}}(y_i',z_i')=\text {\textsf {Ext}}(y_i,z_i)\) and \(h(y_i',z_i')=h(y_i,z_i)\).

Lemma 4.3

The tuple \((x_{1}, h, z_{1},\dots ,z_{k})\) sampled by the verifier \(\text {\textsf {V}}\) is good with probability at least \(1-O(k^2\cdot \delta +2^{-\lambda })\).

In order to prove Lemma 4.3, we first establish the following proposition, which bounds the number of preimages of a random output of the extractor.

Proposition 4.4

For any \(S\subseteq \left\{ 0,1 \right\} ^m\) with \(2^{n-1} \le \left| S\right| \le 2^{n}\) and any security parameter \(\lambda > 1\), it holds that:

$$\begin{aligned} \Pr _{y \leftarrow S,z\leftarrow U_d}\left[ \nu _\text {\textsf {Ext}}(y,z) > 2^{d+\lambda } \right] \le \epsilon +\frac{1}{2^{\lambda }}. \end{aligned}$$

Proof

Throughout the current proof we use \(\nu \) as a shorthand for \(\nu _\text {\textsf {Ext}}\). Abusing notation, we also use S to refer to the uniform distribution over the set S.

For a given security parameter \(\lambda > 1\), denote by H (for “heavy”) the set of all \((y,z) \in S \times \left\{ 0,1 \right\} ^d\) that have \(\nu (y,z) > \left| S\right| \cdot 2^{d-n+\lambda }\), and by \(\text {\textsf {Ext}}(H)\) the set \(\left\{ \text {\textsf {Ext}}(y,z) : (y,z)\in H \right\} \). By definition, for any \(z \in \text {\textsf {Ext}}(H)\), we have that \(\Pr \left[ \text {\textsf {Ext}}(S,U_d) = z \right] > 2^{-n+\lambda }\) (since each such output has more than \(\left| S\right| \cdot 2^{d-n+\lambda }\) preimages out of \(\left| S\right| \cdot 2^d\) equally likely pairs). This implies that:

$$\begin{aligned} \left| \text {\textsf {Ext}}(H)\right| < 2^{n-\lambda }. \end{aligned}$$

Note again that for any \(z\in \text {\textsf {Ext}}(H)\), the above probability is more than \(2^{-n}\), which is the probability assigned to z by the uniform distribution \(U_n\). It then follows from the definition of statistical distance that:

$$\begin{aligned} \Delta \left( \text {\textsf {Ext}}(S,U_d), U_n \right)&\ge \sum _{z\in \text {\textsf {Ext}}(H)} \big ( \Pr \left[ \text {\textsf {Ext}}(S,U_d) = z \right] - 2^{-n} \big ) \nonumber \\&= \Pr \left[ \text {\textsf {Ext}}(S,U_d)\in \text {\textsf {Ext}}(H) \right] - \left| \text {\textsf {Ext}}(H)\right| \cdot 2^{-n} \nonumber \\&> \Pr \left[ \text {\textsf {Ext}}(S,U_d)\in \text {\textsf {Ext}}(H) \right] - 2^{-\lambda }. \end{aligned}$$

Since \(\left| S\right| \ge 2^{n-1}\), the min-entropy of S is at least \(n-1\), and therefore it holds that \(\Delta \left( \text {\textsf {Ext}}(S,U_d), U_n \right) \le \epsilon \). Together with the fact that \(\Pr \left[ \text {\textsf {Ext}}(S,U_d)\in \text {\textsf {Ext}}(H) \right] = \Pr \left[ (S,U_d)\in H \right] \), we have:

$$\begin{aligned} \Pr _{y \leftarrow S,z\leftarrow U_d}\left[ \nu (y,z) > \left| S\right| \cdot 2^{d-n+\lambda } \right] \le \epsilon +\frac{1}{2^{\lambda }}. \end{aligned}$$

Since \(\left| S\right| \le 2^{n}\), we have \(\left| S\right| \cdot 2^{d-n+\lambda } \le 2^{d+\lambda }\), and hence

$$\begin{aligned} \Pr _{y \leftarrow S,z\leftarrow U_d}\left[ \nu (y,z) > 2^{d+\lambda } \right] \le \epsilon +\frac{1}{2^{\lambda }}. \end{aligned}$$
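The phenomenon behind Proposition 4.4 is easy to observe empirically. In the following toy Python sketch we replace \(\text {\textsf {Ext}}\) by a uniformly random function (an assumption made purely for illustration; a real extractor is structured) and count exactly how often \(\nu \) exceeds the threshold:

```python
import random
from collections import Counter

random.seed(0)
n, d, lam = 6, 3, 2              # toy sizes: |S| ~ 2^n, seeds in {0,1}^d
S = range(2 ** n - 5)            # a set with 2^{n-1} <= |S| <= 2^n

# Toy stand-in for Ext: a uniformly random function (illustrative only;
# a real extractor is structured, but the preimage count nu is generic).
ext = {(y, z): random.randrange(2 ** n) for y in S for z in range(2 ** d)}

counts = Counter(ext.values())                 # pairs mapping to each output
nu = {yz: counts[w] for yz, w in ext.items()}  # nu(y, z) preimage counts

heavy = sum(nu[yz] > 2 ** (d + lam) for yz in ext) / len(ext)
print(f"fraction of (y,z) with nu > 2^(d+lam): {heavy:.4f}")
# Proposition 4.4 bounds this by eps + 2^{-lam} for a genuine extractor.
```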

Using Proposition 4.4 we are now ready to prove Lemma 4.3.

Proof

(Proof of Lemma 4.3). For any \(i<i^*\), let \(E_i\) denote the event that either (1) there exists \(x_i'\ne x_i\) such that \(C_{i}(x_{i})=C_{i}(x_{i}')\), or (2) there exists \((y_i',z_i')\ne (y_i,z_i)\) such that \(y_i'\in S_i\), \(\text {\textsf {Ext}}(y_i',z_i')=\text {\textsf {Ext}}(y_i,z_i)\) and \(h(y_i',z_i')=h(y_i,z_i)\), where \((x_1,\dots ,x_{k+1},y_1,\dots ,y_k,z_1,\dots ,z_k)\) are as sampled by the verifier.

Lemma 4.3 follows from the following claim, and a union bound over all \(i < i^*\).

Claim 4.4.1

\(\Pr [E_i] \le (k+1) \cdot \delta + \frac{5}{k \cdot 2^{\lambda }}\), for every \(i < i^*\).

Proof

We first analyze the probability of the event \(E_i\) when \(x_i\) is sampled uniformly at random. By definition of \(\mathsf {AI}_{\delta }\) (recall that \(C_i\) is a YES instance, since \(i < i^*\)):

$$\begin{aligned} {\Pr _{x_{i}\leftarrow U_n}[x_i\text { has siblings }]\le \delta .} \end{aligned}$$

Let us condition on the event that the chosen \(x_i\) has no siblings. Under this conditioning, \(C_i(x_i)\) is uniform in \(S_i\). We note that \(|S_i| \ge (1-\delta ) \cdot 2^{n} \ge 2^{n-1}\) and \(|S_i| \le 2^n\). Thus, by Proposition 4.4 (invoked with security parameter \(\lambda +\log k\)) it holds that:

$$\begin{aligned} \Pr _{y_i \leftarrow S_i,z_i\leftarrow U_d}\left[ \nu _\text {\textsf {Ext}}(y_i,z_i) > k \cdot 2^{\lambda +d} \right] \le \epsilon +\frac{1}{k \cdot 2^{\lambda }}\le \frac{2}{k \cdot 2^{\lambda }}, \end{aligned}$$

where the last inequality follows from the fact that \(\epsilon = \frac{1}{k^2 \cdot 2^\lambda }\).

Let us therefore assume that the pair \((y_i,z_i)\) has at most \(k \cdot 2^{\lambda +d}\) siblings under \(\text {\textsf {Ext}}\). We wish to bound the probability that one of these siblings collides with \((y_i,z_i)\) under h. Since h is \(2^{-(2\lambda +d+2\log {k})}\)-almost pairwise-independent (into a range of size \(2^{2\lambda +d+2\log {k}}\)), for any fixed pair \((y',z') \ne (y_i,z_i)\), the probability that it collides with \((y_i,z_i)\) under h is at most \(\frac{2}{2^{2\lambda +d+2\log {k}}}\). Since \((y_i,z_i)\) has at most \(k \cdot 2^{\lambda +d}\) siblings (under \(\text {\textsf {Ext}}\)), by a union bound, the probability that any of them collides with \((y_i,z_i)\) (under h) is at most \(k \cdot 2^{\lambda +d} \cdot \frac{2}{2^{2\lambda +d+2\log {k}}} = \frac{2}{k \cdot 2^{\lambda }}\).

Thus, when \(x_i\) is sampled uniformly at random, the probability that it has a sibling (under \(C_i\)) or that there exists \((y',z') \ne (y_i,z_i)\) with \(y'\in S_i\) such that \(\text {\textsf {Ext}}(y',z')=\text {\textsf {Ext}}(y_i,z_i)\) and \(h(y',z')=h(y_i,z_i)\), is at most:

$$ \delta + \frac{2}{k \cdot 2^{\lambda }}+\frac{2}{k \cdot 2^{\lambda }} = \delta + \frac{4}{k \cdot 2^{\lambda }}. $$

The claim follows since, by Lemma 4.1, the actual distribution of \(x_i\) is \(\big ( \frac{1}{k\cdot 2^{\lambda }}+k\cdot \delta \big )\)-close to uniform; by Fact 2.2, this shift adds at most \(\frac{1}{k\cdot 2^{\lambda }}+k\cdot \delta \) to the above probability, for a total of \((k+1) \cdot \delta + \frac{5}{k \cdot 2^{\lambda }}\).
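As a sanity check on the constants in the union bound above, the following snippet verifies the arithmetic exactly using rational numbers; the parameter values are arbitrary (with k a power of two so that \(\log k\) is an integer):

```python
from fractions import Fraction

k, lam, d = 8, 10, 12              # arbitrary toy values (k a power of two)
log_k = k.bit_length() - 1         # log2(k)

max_siblings = k * 2 ** (lam + d)                            # siblings under Ext
collision_pr = Fraction(2, 2 ** (2 * lam + d + 2 * log_k))   # per-pair h-collision

# The union bound computed in the text, exactly: 2 / (k * 2^lam).
assert max_siblings * collision_pr == Fraction(2, k * 2 ** lam)
```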

This concludes the proof of Lemma 4.3.

4.2 Completeness

Let \(C_1,\dots ,C_k \in \mathsf {AI}_\delta \). Suppose first that \(\text {\textsf {V}}\) generates a good tuple \((x_1,h,z_1,\dots ,z_k)\) (as per Definition 4.2). In this case, by construction of the protocol, it holds that \(x'_i=x_i\) and \(y'_i=y_i\) for every \(i \in [k]\), and therefore the verifier accepts (with probability 1).

By Lemma 4.3, the tuple \((x_1,h,z_1,\dots ,z_k)\) is good with all but \(O \left( k^2 \cdot \delta + 2^{-\lambda } \right) \) probability. Thus, the completeness error is upper bounded by \(O \left( k^2 \cdot \delta + 2^{-\lambda } \right) \).

4.3 Honest-Verifier Statistical Zero-Knowledge

The simulator is presented in Fig. 3.

Fig. 3. Simulator for \(\mathsf {AI}_{\delta }\) batch verification

The generation of \((x_{1},h,g,z_1,\dots ,z_{k})\) is identical for the verifier and for the simulator. Assuming that the tuple \((x_{1},h,z_1,\dots ,z_{k})\) is good, by construction the prover does not abort in the honest execution (as in the case of completeness). Moreover, in this case, each \(x'_i\) (resp., \((y'_i,z'_i)\)) found by the prover is equal to the \(x_i\) (resp., \((y_i,z_i)\)) chosen by the verifier. Therefore, conditioned on the tuple \((x_{1},h,z_1,\dots ,z_{k})\) being good, the distributions of (1) the transcript generated in the honest execution and (2) the simulated transcript are identical. The fact that the protocol is honest-verifier statistical zero-knowledge now follows from Lemma 4.3, by applying Fact 2.2 twice.
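Although Fig. 3 is not reproduced here, the structure of the simulator can be sketched under mild assumptions about the message format (the function names below, such as sample_verifier_randomness, are our own stand-ins): the simulator runs the honest verifier's sampling and, since on a good tuple the prover's answers coincide with the sampled values, writes the prover's messages directly without running \(\text {\textsf {P}}\):

```python
def simulate(circuits, ext, h, g, sample_verifier_randomness):
    """Illustrative simulator sketch (the actual simulator is in Fig. 3).

    Relies on the fact, argued above, that on a good tuple the honest
    prover's answers equal the values the verifier sampled.
    """
    # Sample exactly as the honest verifier does.
    x1, zs = sample_verifier_randomness()
    xs, ys = [x1], []
    for C, z in zip(circuits, zs):
        y = C(xs[-1])                 # y_i = C_i(x_i)
        ys.append(y)
        xs.append(ext(y, z))          # x_{i+1} = Ext(y_i, z_i)

    betas = [h(y, z) for y, z in zip(ys, zs)]    # verifier's hashes beta_i
    alphas = [g(x) for x in xs[:-1]]             # prover's answers alpha_i,
                                                 # produced without running P
    verifier_msgs = (xs[-1], h, g, zs, betas)    # x_{k+1} and the rest
    return verifier_msgs, alphas
```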

4.4 Soundness

Let \(C_1,...,C_k : \left\{ 0,1 \right\} ^n \rightarrow \left\{ 0,1 \right\} ^m\) be such that one of them is a NO instance of \(\mathsf {AI}_\delta \). Recall that \(i^* \in [k]\) denotes the index of the first such NO instance circuit (i.e., \(C_{i^*}\) is a NO instance of \(\mathsf {AI}_\delta \) but for every \(i<i^*\), it holds that \(C_i\) is a YES instance).

We first make two simplifying assumptions. First, recall that the value of \(y_{i^*}\) is specified by the verifier by having it send \(x_{k+1},\beta _k,\dots ,\beta _{i^*}\) to the prover. Instead, we will simply assume that the verifier sends \(y_{i^*}\) directly to the prover. Since \(y_{i^*}\) can be used to generate the verifier’s distribution consistently, revealing \(y_{i^*}\) only makes the prover’s job easier and therefore can only increase the soundness error. Second, we modify the protocol so that the verifier merely checks that \(\alpha _{i^*} = g(x_{i^*})\): if so, it accepts, and otherwise it rejects. Once again, removing the verifier’s other tests can only increase the soundness error.

Thus, it suffices to bound the soundness error of the following protocol. The verifier samples \(x_{i^*}\) as in the real protocol, sends \(y_{i^*}=C_{i^*}(x_{i^*})\) and the hash function g to the prover, and expects to receive \(g(x_{i^*})\) in response. We show that the prover’s probability of making the verifier accept is bounded away from 1 by a constant.

In order to bound the prover’s success probability in the foregoing experiment, we first give an upper bound assuming that \(x_{i^*}\) is uniform in \(\left\{ 0,1 \right\} ^n\), rather than as specified by the protocol (and, as usual, \(y_{i^*}=C_{i^*}(x_{i^*})\)). Later we shall remove this assumption using Lemma 4.1, which guarantees that \(x_{i^*}\) is actually close to uniform.

Let \(\text {\textsf {P}}^*\) be the optimal prover strategy. Namely, given g and \(y_{i^*}\), the prover \(\text {\textsf {P}}^*\) outputs the hash value \(\alpha _{i^*} \in \left\{ 0,1 \right\} ^{\ell }\) with the largest probability mass (i.e., the \(\alpha \) that maximizes \(| C^{-1}_{i^*}(y_{i^*}) \cap g^{-1}(\alpha )|\)).
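For toy parameters, this optimal strategy is just an exhaustive search, as the following illustrative Python sketch shows (names are again our own):

```python
from collections import Counter

def optimal_prover(g, y, C, n_bits):
    """P*: given g and y, return the hash value alpha maximizing
    |C^{-1}(y) ∩ g^{-1}(alpha)|, by brute force over the toy domain."""
    preimages = [x for x in range(2 ** n_bits) if C(x) == y]
    mass = Counter(g(x) for x in preimages)
    return max(mass, key=mass.get)
```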

Let \(\widehat{Y}_{i^*}\) denote the distribution obtained by sampling \(x \in \left\{ 0,1 \right\} ^n\) uniformly at random, conditioned on x having a sibling under \(C_{i^*}\), and outputting \(C_{i^*}(x)\). Using elementary probability theory we have that:

$$\begin{aligned} \Pr _{\begin{array}{c} g \leftarrow G_n\\ x_{i^*} \leftarrow \left\{ 0,1 \right\} ^n \end{array}} \big [ \text {\textsf {P}}^*(g,y_{i^*}) = g(x_{i^*}) \big ]&\le \Pr _{\begin{array}{c} g \leftarrow G_n\\ x_{i^*} \leftarrow \left\{ 0,1 \right\} ^n \end{array}} \big [ \text {\textsf {P}}^*(g,y_{i^*}) = g(x_{i^*}) \;|\; x_{i^*}\text { has siblings} \big ] \nonumber \\&\quad + \Pr [ x_{i^*} \text { has no siblings}] \nonumber \\&\le \Pr _{\begin{array}{c} g \leftarrow G_n\\ y_{i^*} \leftarrow \widehat{Y}_{i^*} \\ x_{i^*} \leftarrow C_{i^*}^{-1}(y_{i^*}) \end{array}} \big [ \text {\textsf {P}}^*(g,y_{i^*}) = g(x_{i^*}) \big ] + \delta \nonumber \\&= E_{y_{i^*} \leftarrow \widehat{Y}_{i^*}}\left[ \Pr _{ \begin{array}{c} g \leftarrow G_n \\ x_{i^*} \leftarrow C^{-1}_{i^*}(y_{i^*}) \end{array}} \big [ \text {\textsf {P}}^*(g,y_{i^*}) = g(x_{i^*}) \big ] \right] + \delta , \end{aligned}$$
(1)

where the second inequality follows from the fact that \(C_{i^*}\) is a NO instance, and hence \(\Pr [ x_{i^*} \text { has no siblings}] \le \delta \).

Fix \(y_{i^*}\) in the support of \(\widehat{Y}_{i^*}\) (i.e., with \(|C^{-1}_{i^*}(y_{i^*})| \ge 2\)) and let \(u = |C^{-1}_{i^*}(y_{i^*})|\). We show that \(\Pr _{g \leftarrow G_n, x_{i^*} \leftarrow C^{-1}_{i^*}(y_{i^*})} \left[ \text {\textsf {P}}^*(g,y_{i^*}) = g(x_{i^*}) \right] \) is bounded away from 1 by a constant.

Let E be the event (defined only over the choice of g) that for every hash value \(\alpha \in \left\{ 0,1 \right\} ^{\ell }\), it holds that \(|C^{-1}_{i^*}(y_{i^*}) \cap g^{-1}(\alpha )| \le \frac{7}{8} u\). That is, the event E means that no hash value has more than 7/8 fraction of the probability mass (when sampling \(x_{i^*}\) uniformly in \(C_{i^*}^{-1}(y_{i^*})\) and outputting \(g(x_{i^*})\)).Footnote 12

Claim 4.4.2

The event E occurs with probability at least 1/10.

Proof

Fix a hash value \(\alpha \in \left\{ 0,1 \right\} ^{\ell }\), and let \(X = |C^{-1}_{i^*}(y_{i^*}) \cap g^{-1}(\alpha )|\) be a random variable (over the randomness of g). Observe that X can be expressed as a sum of u pairwise independent Bernoulli random variables, each of which is 1 with probability \(2^{-\ell }\) and 0 otherwise. Thus, the expectation of X is \(u/2^{\ell }\) and the variance is \(u \cdot 2^{-\ell } \cdot (1-2^{-\ell }) \le u\cdot 2^{-\ell }\). By Chebyshev’s inequality (Lemma 2.3), it holds that

$$\begin{aligned} \Pr \left[ X> \frac{7}{8} u \right]&\le \Pr \left[ \left| X-\frac{u}{2^{\ell }}\right| > \frac{3}{4} u \right] \\&\le \frac{Var\left[ X \right] }{ (3/4)^2 \cdot u^2 } \\&\le \frac{16}{9u} \cdot \frac{1}{2^{\ell }}, \end{aligned}$$

where the first inequality follows from the fact that \(\ell \) is a sufficiently large constant (specifically \(\ell \ge 3\), so that \(u/2^{\ell } \le u/8\)). Taking a union bound over all \(2^{\ell }\) values of \(\alpha \), the probability that some \(\alpha \) captures more than a 7/8 fraction of \(C^{-1}_{i^*}(y_{i^*})\) (under g) is at most \(\frac{16}{9u} \le \frac{8}{9} < 0.9\), where we use the fact that \(u \ge 2\). Hence, E occurs with probability at least 1/10.
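Claim 4.4.2 can also be corroborated by a small Monte Carlo experiment. In the sketch below (toy parameters of our own choosing) we exploit the fact that a uniformly random function is in particular pairwise independent, and so serves as a valid stand-in for g:

```python
import random
random.seed(1)

ell, u, trials = 4, 12, 10_000   # toy: hash range {0,1}^ell, u = |C^{-1}(y)|

good = 0
for _ in range(trials):
    # A fresh uniformly random function on the u preimages: its pairwise
    # marginals are uniform and independent, hence pairwise independent.
    g = [random.randrange(2 ** ell) for _ in range(u)]
    heaviest = max(g.count(a) for a in set(g))
    if heaviest <= (7 * u) / 8:   # event E: no bucket exceeds a 7/8 fraction
        good += 1

print(f"estimated Pr[E] = {good / trials:.3f}  (Claim 4.4.2: at least 0.1)")
```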

Observe that conditioned on the event E, the probability (over \(x_{i^*} \leftarrow C^{-1}_{i^*}(y_{i^*})\)) that \(\text {\textsf {P}}^*(g,y_{i^*}) = g(x_{i^*})\) is at most 7/8. Thus, by Claim 4.4.2 we obtain that:

$$\begin{aligned} \Pr _{g, x_{i^*} \leftarrow C^{-1}_{i^*}(y_{i^*})} \big [ \text {\textsf {P}}^*(g,y_{i^*}) \ne g(x_{i^*}) \big ] \ge \Pr [E] \cdot \Pr _{g, x_{i^*} \leftarrow C^{-1}_{i^*}(y_{i^*})} \big [ \text {\textsf {P}}^*(g,y_{i^*}) \ne g(x_{i^*}) | E \big ] \ge 1/80. \end{aligned}$$

Plugging this into Eq. (1), we have that the prover convinces the verifier to accept with probability at most \(1-\frac{1}{80} + \delta \), when \(x_{i^*}\) is sampled uniformly at random in \(\left\{ 0,1 \right\} ^n\).

By Lemma 4.1 it holds that \(\Delta \left( X_{i^*}, U_n \right) \le \frac{1}{k\cdot 2^{\lambda }}+k\cdot \delta \). Therefore (using Fact 2.2), the probability that the verifier accepts when \(x_{i^*}\) is sampled as in the protocol is at most \(1- \frac{1}{80} +\frac{1}{k\cdot 2^{\lambda }}+(k+1)\cdot \delta \), which is bounded away from 1 since \(\delta < \frac{1}{100k^2}\) and \(\lambda \) is sufficiently large.

4.5 Communication Complexity and Verifier Run Time

We first bound the number of bits sent during the interaction:

  • Sending \(x_{k+1}\) costs n bits.

  • By Lemma 2.10, the seed length of the extractor is \(d = \log (m)+O(\log n \cdot \log (\frac{n}{\epsilon }))=\log (N)+ \lambda \cdot polylog(n,k)\) and therefore, the cost of sending \(z_1,...,z_k\) is \(k\cdot (\log (N)+\lambda \cdot polylog(n,k))\).

  • By Lemma 2.8, the description length of \(h : \{0,1\}^{m}\times \left\{ 0,1 \right\} ^d \rightarrow \left\{ 0,1 \right\} ^{2\lambda +d+2\log {k}}\), a \(\frac{1}{2^{2\lambda +d+2\log {k}}}\)-almost pairwise-independent hash function, is \(O(\log (N) +\lambda +polylog(k))\). The cost of sending the hashes \(\beta _1,...,\beta _k\) is \(k \cdot O( \lambda + d + \log {k})= k\cdot (\log (N)+ \lambda \cdot polylog(n,k))\).

  • By Lemma 2.7, the description length of \(g : \{0,1\}^{n}\rightarrow \{0,1\}^{\ell }\), a pairwise-independent hash function, is O(n). The cost of sending the hashes \(\alpha _1,\dots ,\alpha _k\) is O(k), since each \(\alpha _i\) consists of \(\ell = O(1)\) bits.

In total, the communication complexity is \(O(n) + k\cdot (\log (N)+ \lambda \cdot polylog(n,k))\). As for the verifier’s running time, in each iteration i the verifier performs the following:

  • Evaluating the circuit \(C_i\) takes time \(\mathsf {poly}(N)\).

  • By Lemma 2.10, evaluating \(\text {\textsf {Ext}}\) takes time \(\mathsf {poly}(m,d)=\mathsf {poly}(N,\log k ,\lambda )\).

  • By Lemma 2.8, evaluating h on an input of size \(m+d\) takes time \(\mathsf {poly}(N,\log k,\lambda )\).

  • By Lemma 2.7, evaluating g on an input of size n takes time \(\mathsf {poly}(n)\).

In total, the verifier running time is \(k\cdot \mathsf {poly}(N,\log k, \lambda )\).
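To make this accounting concrete, the following rough Python helper tallies the dominant communication terms; the parameter values, the constant c, and the simplification of the polylog factor are all our own illustrative choices:

```python
import math

def comm_bits(n, k, N, lam, c=1.0):
    """Rough tally of the communication terms listed above. The constant c
    and the concrete polylog term are illustrative simplifications."""
    polylog = (math.log2(n) + math.log2(k)) ** 2
    d = math.log2(N) + lam * polylog      # extractor seed length (Lemma 2.10)
    total = (n                            # x_{k+1}
             + k * d                      # the seeds z_1, ..., z_k
             + k * (2 * lam + d + 2 * math.log2(k))  # the hashes beta_i
             + k)                         # the alpha_i's, O(1) bits each
    return c * total

# Batching k instances costs roughly k * (log N + lam * polylog(n, k)) bits,
# rather than k independent executions of the full protocol.
print(f"{comm_bits(n=2**10, k=64, N=2**20, lam=40):.3e} bits (toy estimate)")
```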