Batch Verification and Proofs of Proximity with Polylog Overhead

Rothblum, Guy N.; Rothblum, Ron D.

doi:10.1007/978-3-030-64378-2_5

Batch Verification and Proofs of Proximity with Polylog Overhead

Guy N. Rothblum¹⁰ &
Ron D. Rothblum¹¹

Conference paper
First Online: 09 December 2020

612 Accesses
4 Citations

Part of the book series: Lecture Notes in Computer Science ((LNSC,volume 12551))

Abstract

Suppose Alice wants to convince Bob of the correctness of k $\mathsf {NP}$ statements. Alice could send k witnesses to Bob, but as k grows the communication becomes prohibitive. Is it possible to convince Bob using smaller communication (without making cryptographic assumptions or bounding the computational power of a malicious Alice)? This is the question of batch verification for $\mathsf {NP}$ statements. Our main result is a new interactive proof protocol for verifying the correctness of k $\mathsf {UP}$ statements ($\mathsf {NP}$ statements with a unique witness) using communication that is poly-logarithmic in k (and a fixed polynomial in the length of a single witness).

This result is obtained by making progress on a different question in the study of interactive proofs. Suppose Alice wants to convince Bob that a huge dataset has some property. Can this be done if Bob can’t even read the entire input? In other words, what properties can be verified in sublinear time? An Interactive Proof of Proximity guarantees that Bob accepts if the input has the property, and rejects if the input is far (say in Hamming distance) from having the property. Two central complexity measures of such a protocol are the query and communication complexities (which should both be sublinear). For every query parameter q, and for every language in logspace uniform $\mathsf {NC}$, we construct an interactive proof of proximity with query complexity q and communication complexity $(n/q) \cdot \mathsf{polylog}(n)$.

Both results are optimal up to poly-logarithmic factors, under reasonable complexity-theoretic or cryptographic assumptions. The second result, which is our main technical contribution, builds on a distance amplification technique introduced in a beautiful recent work of Ben-Sasson, Kopparty and Saraf [CCC 2018].

In Memoriam: Uriel G. Rothblum

You have full access to this open access chapter, Download conference paper PDF

1 Introduction

The power of efficiently verifiable proof-systems is a central question in the study of computation. It has been the focus of a rich literature spanning cryptography and complexity theory. This literature has put forth and studied different notions of proof systems and different notions of efficient verification. Interactive proofs, introduced in the seminal work of Goldwasser, Micali and Rackoff [GMR89], are one of the most fundamental notions in this field. An interactive proof is an interactive protocol between a randomized verifier and an untrusted prover. The prover convinces the verifier of the validity of a computational statement, usually framed as membership of an input x in a language $\mathcal {L}$. Soundness is unconditional. Namely, if the input is not in the language, then no matter what (unbounded and adaptive) strategy a cheating prover might employ, the verifier should reject with high probability over its own coin tosses. Interactive proofs have had a dramatic impact on complexity theory and on cryptography. Opening the door to randomized and interactive verification led to revolutionary notions of proof verification, such as zero knowledge interactive proofs [GMR89, GMW91] and probabilistically checkable proofs ($\mathsf {PCP}$s) [BGKW88, FRS94, BFL91, BFLS91, FGL+96, AS92, ALM+98]. Interactive proof-systems also allow for more efficient verification of larger classes of computations (compared with $\mathsf {NP}$ proof systems), as demonstrated in the celebrated $\mathsf IP=\mathsf PSPACE$ Theorem [LFKN92, Sha92].

Still, foundational questions about the power of interactive proof systems have remained open. Our work studies two such questions:

1.1 Batch Verification

Can interactive proofs allow for more efficient batch verification of a collection of NP statements?

A naive solution is sending the k witnesses in their entirety. An honest prover, who knows the witnesses, runs in polynomial time, but the communication grows linearly with k. For the case of $\mathsf {UP}$ statements—$\mathsf {NP}$ statements with a unique witness—we show a protocol where the communication complexity grows poly-logarithmically with k (and the honest prover remains efficient):

Theorem 1

(Informally Stated, see Theorems 4.2 and 4.1). Let $\mathcal {L}\in \mathsf {UP}$ with witnesses of length $m=m(n)$. There exists an interactive proof for verifying that k instances $x_1,\dots ,x_k$, each of length n, all belong to $\mathcal {L}$. The communication complexity is $\mathsf{poly}(\log (k),m)$, where $\mathsf{poly}$ refers to a fixed polynomial that depends only on the language $\mathcal {L}$. The number of rounds is $\mathsf{polylog}(k).$ The verifier runs in time $\tilde{O}( k \cdot n ) + \mathsf{polylog}(k) \cdot \mathsf{poly}(m)$, where n is the length of each of the instances. The honest prover runs in time $\mathsf{poly}(k,n,m)$ given the k unique witnesses.

This resolves the communication complexity of batch verification for $\mathsf {UP}$ up to $\mathsf{poly}(\log (k),m)$ factors: under complexity-theoretic assumptions, even for $k=1$ there are $\mathsf {UP}$ languages (e.g. unique SAT) for which every interactive proof system requires communication complexity $\varOmega (m)$ [GH98, GVW02]. When the number of instances k is large, this can be a significant improvement over the naive solution in which the prover sends over all k witnesses.

We note that for $\mathsf {UP}$ relations that are checkable in log-space uniform $\mathsf {NC}$, we can reduce the communication complexity to $m \cdot \mathsf{polylog}(k,m)$. As discussed above, this is tight up to $\mathsf{polylog}(k)$ factors (under complexity assumptions). We also note that, assuming the existence of one-way functions, our batch verification protocol (which is public coin) can be made zero-knowledge using standard techniques [BGG+88].

Comparison to Prior Work. A different solution can be obtained via the $\mathsf IP=\mathsf PSPACE$ theorem, by observing that the membership of k inputs in an $\mathsf {NP}$ language can be decided in space $O(\log k + m \cdot \mathsf{poly}(n))$, where n is the length of a single input and m is the length of a single $\mathsf {NP}$ witness. Thus, by the $\mathsf IP=\mathsf PSPACE$ Theorem, there is an interactive proof for batch verification with communication complexity $\mathsf{poly}(\log {k},n,m)$. A major caveat, however, is that the complexity of proving correctness (the running time of the honest prover) is exponential in $\mathsf{poly}(n,m)$. We, on the other hand, focus on batch verification where the honest prover runs in polynomial time given the k $\mathsf {NP}$ witnesses. We refer to such an interactive proof as having an efficient prover.^{Footnote 1} Another significant drawback of this solution is that the number of rounds becomes $\mathsf{poly}(m,\log k)$.

Two recent works have constructed protocols for efficient batch verification of $\mathsf {UP}$ statements. Reingold, Rothblum and Rothblum [RRR16] gave a protocol with communication complexity $\mathsf{polylog}(k) \cdot \mathsf{poly}( m ) + k \cdot \mathsf{polylog}(m)$. In a subsequent work [RRR18] they eliminated the additive k factor but increased the multiplicative factor, by showing a (constant-round) protocol with communication complexity $k^{\varepsilon } \cdot \mathsf{poly}( m )$, for any $\varepsilon >0$. Our main result achieves the best of both worlds: eliminating the additive linear factor while preserving the poly-logarithmic multiplicative factor (although our protocol has a larger number of rounds than that of [RRR18]).

1.2 Interactive Proofs of Proximity

A different question (which turns out to be related) asks which statements can be verified in sublinear time, i.e. without even reading the entire input. This immediately raises the question of what computational model is used to capture “sublinear time”. Drawing inspiration from the literature on sublinear algorithms, a natural choice is to adopt the perspective of property testing, a study initiated by Rubinfeld and Sudan [RS96] and Goldreich, Goldwasser and Ron [GGR98], which considers highly-efficient randomized algorithms that solve approximate decision problems, while only inspecting a small fraction of the input. Such algorithms, commonly referred to as property testers for a set S (say the set of objects with some property), are given query access to an input, and are required to determine whether the input is in S (has the property), or is far (say, in Hamming distance) from every string in S (far from having the property). A rich literature has put forward property testers for many natural properties.

Analogously, in the proof verification setting, Interactive Proofs of Proximity ($\mathsf {IPP}$s) aim to verify that a given input is close to a set (or a property). Given a desired proximity parameter $\delta \in (0,1]$, the soundness condition of standard interactive proofs is relaxed: it should be impossible to convince the verifier to accept statements that are $\delta $-far (in fractional Hamming distance) from true statements (except with small probability). Such proof-systems were first introduced by Ergün, Kumar and Rubinfeld [EKR04] and were more recently further studied by Rothblum, Vadhan and Wigderson [RVW13] and by Gur and Rothblum [GR13]. The verifier’s query complexity and running time, as well as the communication, should all be sublinear in the input length. Other parameters of interest include the (honest) prover’s running time and the number of rounds.

The hope is that $\mathsf {IPP}$s can overcome inherent limitations of property testing: for example, demonstrating specific properties where verifying proximity can be significantly faster than the time needed to test (without a prover). Another goal is showing that sublinear-time verification is possible for much richer families of properties than those for which property testers exist. In particular, research on property testing has focused on constructing testers for languages based on their combinatorial or algebraic structure. This limitation seems inherent, because there exist simple and natural languages for which (provably) no sublinear time property testers exist. In contrast, it is known that highly non-trivial $\mathsf {IPP}$s exist for every language that can be decided in bounded-polynomial depth or space [RVW13, RRR16]. However, the optimal tradeoffs between the query and communication complexities needed for proof verification were not known, and this is the second foundational question we study:

For the case of languages in (uniform) $\mathsf {NC}$—languages that can be decided by polynomial-sized circuits of polylogarithmic depth—we show that the product of the query and communication complexities can be quasi-linear.

Theorem 2

(Informally Stated, see Theorem 4.1). Let $t = t(n) \le n$ be a parameter. For every $\delta \le \frac{t \cdot \mathsf{polylog}(n)}{n}$ and every language $\mathcal {L}$ in log-space uniform $\mathsf {NC}$, there exists an $\mathsf {IPP}$ for $\mathcal {L}$ with respect to proximity parameter $\delta $, with communication complexity $t \cdot \mathsf{polylog}(n)$ and query complexity $O(1/\delta )$. The verifier runs in time $\tilde{O}(t + n/t)$ and the prover runs in time $\mathsf{poly}(n)$.

For example, by setting $t(n)=\sqrt{n}$ we obtain an $\mathsf {IPP}$ for $\mathsf {NC}$ with query, communication and verification complexity all $\tilde{O}(\sqrt{n})$. This result resolves the question for such languages, up to polylogarithmic factors, as Kalai and Rothblum [KR15] showed that (under a reasonable cryptographic assumption) there exists a language in $\mathsf {NC}^1$ for which the product of the query and communication complexities cannot be sublinear.

Comparison and Relationship to [RVW13]. Theorem 2 shows that the product of the query complexity and the communication can be quasi-linear (for a distance parameter that is the inverse of the query complexity). Rothblum, Vadhan and Wigderson [RVW13] showed a similar statement, but the product of the query and communication complexities was $n^{1+o(1)}$.

Our protocol builds on the framework developed in their work, introducing several new ideas and using a key distance amplification technique from a beautiful recent work of Ben-Sasson, Kopparty and Saraf [BKS18]. We find the improvement from $n^{1+o(1)}$ to $\tilde{O}(n)$ to be significant: beyond the fact that it provides a nearly-optimal (up to $\mathsf{polylog}(n)$ factors) trade-off for a foundational problem, it allows for $\mathsf {IPP}$s with $\mathsf{polylog}(n)$ communication and sublinear query complexity. In prior work, achieving sublinear query complexity (for $\mathsf {NC}$) required $n^{o(1)}$ communication. The importance of this distinction is exemplified in the application of $\mathsf {IPP}$s towards batch verification for $\mathsf {UP}$ [RRR18]. That construction repeatedly uses $\mathsf {IPP}$s with slightly-sublinear query complexity. The communication of the resulting batch verification protocol is dominated by the communication complexity of the $\mathsf {IPP}$s. Indeed, the improved $\mathsf {IPP}$ of Theorem 2 is the key component behind the improved $\mathsf {UP}$ batch verification protocol of Theorem 1.

1.3 Related Works

Batch Verification with Computational Soundness. If one is willing to settle for computational soundness (i.e., soundness holds only against polynomial-time cheating strategies) and to use cryptographic assumptions, then efficient batch verification is possible for all of $\mathsf {NP}$. In particular, Kilian [Kil92] gave an interactive argument-system for all of $\mathsf {NP}$ based on collision-resistant hash functions with only poly-logarithmic communication complexity. Since verifying the membership of k instances in an $\mathsf {NP}$ language is itself an $\mathsf {NP}$ problem, we immediately obtain a batch verification protocol with communication complexity $\mathsf{poly}(\log (n),\log (k),\kappa )$, where $\kappa $ is a cryptographic security parameter.

More recently, Brakerski, Holmgren and Kalai [BHK17] obtained an efficient non-interactive batch-verification protocol assuming the existence of a computational private information retrieval scheme. Non-interactive batch verification protocols also follow from the existence of succinct non-interactive zero-knowledge arguments (zkSNARGs), which are known to exist under certain strong, and non-falsifiable, assumptions (see, e.g. [Ish], for a recent survey).

We emphasize that the batch verification protocols of both [Kil92] and [BHK17] only provide computational soundness and are based on unproven cryptographic assumptions. In contrast, the result of Theorem 1 offers statistical soundness and is unconditional.

Interactive Proofs of Proximity. Beyond the works [EKR04, RVW13, GR13] that were mentioned above, interactive proofs of proximity have drawn considerable attention [FGL14, GGR15, KR15, RRR16, GR17, BRV18, RRR18, CG18, GLR18, RR19, GRSY20].

In particular, we mention that a recent work of Ron-Zewi and Rothblum [RR19, Theorem 3, see also Remark 1.3] shows that for every constant $\epsilon $, every language computable in polynomial-time and bounded polynomial space has an $\mathsf {IPP}$ with communication complexity $\epsilon \cdot n$ and constant query complexity. Note that the product between the query and communication complexity in their result is O(n), rather than $n \cdot \mathsf{polylog}(n)$ as in Theorem 2. However, in constrast to Theorem 2, their result is restricted to the regime of constant query complexity and only yields communication complexity that is smaller by a constant factor than that of the trivial solution (see Proposition 3.3).

1.4 Organization

Section 2 contains a technical overview of our techniques. In Sect. 3 we provide preliminaries and our main results are stated in Sect. 4. In Sect. 5 we introduce the $\mathsf {PVAL}$ problem and show how to amplify its distance. Our efficient $\mathsf {PVAL}$ $\mathsf {IPP}$ is in Sect. 6. Lastly, in Sect. 7 we use the results established in the prior sections to prove Theorem 1 and Theorem 2.

2 Technical Overview

To prove Theorem 1 we rely on a recent result of Reingold et al. [RRR18] who showed how to reduce the construction of $\mathsf {UP}$ batch verification protocol to that of constructing efficient $\mathsf {IPP}$s. In particular, via the connection established in [RRR18], in order to prove Theorem 1, it suffices to prove Theorem 2 with respect to $cc = \mathsf{polylog}(n)$.

Thus, in this overview we focus on proving Theorem 2. Our starting point for the proof of Theorem 2 is the $\mathsf {IPP}$ construction for $\mathsf {NC}$ from [RVW13] (which achieves weaker parameters than those of Theorem 2).

The [RVW13] protocol is centered around a parameterized problem called $\mathsf {PVAL}$, which stands for “Polynomial eVALuation” and is defined next. A key step in the [RVW13] proof is showing that $\mathsf {PVAL}$ is “complete” for constructing $\mathsf {IPP}$s for $\mathsf {NC}$. In more detail, for every language $\mathcal {L}\in \mathsf {NC}$, [RVW13] show an interactive reduction, in which the verifier makes no queries to its input. At the end of the reduction, the verifier generates a “parameterization” of the $\mathsf {PVAL}$ problem so that if the original input x belonged to $\mathcal {L}$ then x belongs to $\mathsf {PVAL}$, whereas if x was far from $\mathcal {L}$ then, with high probability, x is also far from $\mathsf {PVAL}$.

Thus, an efficient $\mathsf {IPP}$ for $\mathsf {PVAL}$ immediately yields an efficient $\mathsf {IPP}$ for $\mathcal {L}$ as follows: the prover and verifier first engage in the interactive reduction to obtain a parameterization of the $\mathsf {PVAL}$ problem. Then, the two parties run the efficient $\mathsf {IPP}$ protocol to check proximity to the newly generated $\mathsf {PVAL}$ instance.

In this work we follow the same strategy. We do not modify the interactive reduction step from [RVW13]. Our improved efficiency stems from a more efficient $\mathsf {IPP}$ for $\mathsf {PVAL}$ (than that of [RVW13]), which suffices to obtain our main results.

We start by defining a specific variant^{Footnote 2} of the $\mathsf {PVAL}$ problem that suffices for our purposes.

The $\mathsf {PVAL}$ Problem. Let $\mathbb {F}$ be a (sufficiently large) finite field. The $\mathsf {PVAL}$ problem is parameterized by an integer $t \in \mathbb {N}$, which we refer to as the arity, and a dimension $m \in \mathbb {N}$. In addition the problem is parameterized by t vectors $\mathbf {j}=(\mathbf {j}_1,\dots ,\mathbf {j}_t) \in (\mathbb {F}^m)^t$ and t scalars $\mathbf {v}=(v_1,\dots ,v_t) \in \mathbb {F}^t$. The main input to $\mathsf {PVAL}(t,\mathbf {j},\mathbf {v})$ is the truth table of a function $f :\{0,1\}^m \rightarrow \mathbb {F}$. We say that $f \in \mathsf {PVAL}(t,\mathbf {j},\mathbf {v})$ if it holds that $\hat{f}(\mathbf {j}_i)=v_i$, for every $i \in [t]$, where $\hat{f} : \mathbb {F}^m \rightarrow \mathbb {F}$ is the multi-linear extension of f.^{Footnote 3} Thus, the goal of the $\mathsf {PVAL}$ verifier is to distinguish the case that (1) the multilinear extension $\hat{f}$ of the input function f is equal, at t given points, to t corresponding values, or (2) is far from any such function. Note that the verifier is only allowed to make a sub-linear (i.e., $\ll 2^m$) number of queries to f, but is allowed to communicate with the (untrusted) prover who has full access to f.

Our main technical contribution is an $\mathsf {IPP}$ for checking $\delta $-proximity to $\mathsf {PVAL}(t,\mathbf {j},\mathbf {v})$ with communication complexity roughly $t \cdot \mathsf{poly}(m)$ and query complexity $O(1/\delta )$ (see Theorem 6.1 for the formal statement). (Note that setting $\delta = 2^m \cdot \mathsf{poly}(m)/t$ results in the product of the query and communication complexities being $\tilde{O}(2^m)$, which is quasi-linear in the input length.) We proceed to describe the new $\mathsf {IPP}$ for $\mathsf {PVAL}$.

Attempt 1: Divide and Conquer. Fix a parameterization $(t,\mathbf {j},\mathbf {v})$ for $\mathsf {PVAL}$, where $\mathbf {j}=(\mathbf {j}_1,\dots ,\mathbf {j}_t)$ and $\mathbf {v}=(v_1,\dots ,v_t)$, and consider a given input $f : \{0,1\}^m \rightarrow \mathbb {F}$. Following [RVW13], we would like to first decompose the t claims that we are given about f into claims about the underlying functions $f_0,f_1 : \{0,1\}^{m-1} \rightarrow \mathbb {F}$, where $f_0( \cdot ) \equiv f(0,\cdot )$ and $f_1( \cdot ) \equiv f(1,\cdot )$. To do so, the verifier asks the prover to provide the contributions of $f_0$ and $f_1$ to the linear claims $\hat{f}(\mathbf {j}_i)=v_i$, for all $i \in [t]$. In more detail, let us view each vector $\mathbf {j}_i$ as $\mathbf {j}_i = (\chi _i,\mathbf {j}'_i)$ where $\chi _i \in \mathbb {F}$ and $\mathbf {j}'_i \in \mathbb {F}^{m-1}$ (i.e., we isolate the first component of $\mathbf {j}_i$ as $\chi _i$ and the remaining components as an $(m-1)$-dimensional vector $\mathbf {j}'_i$). The prover sends the vectors $\mathbf {v}_0,\mathbf {v}_1 \in \mathbb {F}^{m-1}$, where $\mathbf {v}_0 = \hat{f}_0|_{\mathbf {j}'}$ and $\mathbf {v}_1 = \hat{f}_1|_{\mathbf {j}'}$. Note that the prover cannot send arbitrary vectors since the verifier can check (and indeed does check) that $\mathbf {v}_0$ and $\mathbf {v}_1$ are consistent with $\mathbf {v}$. (I.e., that $\mathbf {v} = (1-\bar{\chi }) \cdot \mathbf {v}_0 + \bar{\chi } \cdot \mathbf {v}_1$, where $\bar{\chi }=(\chi _1,\dots ,\chi _t)$ and the multiplication is pointwise.) See Fig. 1 for an illustration.

A natural idea at this point, is to try to combine $f_0$ and $f_1$ (and the corresponding claims that we have about them) into a single $m-1$ variate function on which we can recurse. For example, we can take a random linear combination of the two functions as follows: the verifier chooses random coefficients $c^{(0)},c^{(1)} \in \mathbb {F}$, sends them to the prover. The two parties then recurse on the input $f' = c^{(0)} \cdot f_0 + c^{(1)} \cdot f_1$ wrt the claims $\hat{f}'|_{\mathbf {j}'} = \mathbf {v}'$, with $\mathbf {v}' = c^{(0)} \cdot \mathbf {v}_0+ c^{(1)} \cdot \mathbf {v}_1$.^{Footnote 4}

Note that the input has shrunk by a factor of 2 and it is not too difficult to argue that if f was $\delta $-far from $\mathsf {PVAL}(t,\mathbf {j},\mathbf {v})$ then $f'$ is about $\delta $-far from $\mathsf {PVAL}(t,\mathbf {j}',\mathbf {v}')$. Thus, with very little communication (i.e., $O(t \cdot \log (|\mathbb {F}|)$) we have reduced the input size by half and preserved the distance. We can continue recursing as such until the input reaches a sufficiently small size so that the verifier can solve the problem by itself (or, rather, verifier can employ a “trivial” protocol, with $1/\delta $ query complexity and linear communication in the size of the final input).

The problem with this approach is that while the input size has shrunk by half, as we recurse we will need to emulate each query to $f'$ by using two queries to f. Thus, while the input length has shrunk by half, the query complexity has doubled and essentially no progress has been made. Indeed, if we unwind the recursion, we see that the total query complexity in the proposed protocol is linear in the input length.

Doubling the Distance: A Pipe Dream? For every $b \in \{0,1\}$, let $\delta _b$ be the distance of $f_b$ from $\mathsf {PVAL}(t,\mathbf {j}',\mathbf {v}_b)$ and let $P_b \in \mathsf {PVAL}(t,\mathbf {j}',\mathbf {v}_b)$ such that $\varDelta (f_b,P_b)=\delta _b$ (without getting into the details we remark that $P_0$ and $P_1$ will be unique in our regime of parameters). Note that if f is $\delta $-far from $\mathsf {PVAL}$ then $\delta _0+\delta _1 \ge 2\delta $, since otherwise f is $\delta $-close to the function $P \in \mathsf {PVAL}(t,\mathbf {j},\mathbf {v})$ defined as $P(\sigma ,\mathbf {x}) = (1-\sigma ) \cdot P_0(\mathbf {x}) + \sigma \cdot P_1(\mathbf {x})$.

For every $b \in \{0,1\}$, let $I_b \subseteq \{0,1\}^m$ be the set of $\delta _b \cdot 2^m$ points on which $P_b$ and $f_b$ disagree (we refer to these as the “error pattern”). Suppose momentarily that $I_0$ and $I_1$ have a small intersection (or are even disjoint). In such a case, $f'$ is roughly $\delta _0+\delta _1 \ge 2\delta $ far from $c_0 \cdot P_0 + c_1 \cdot P_1 \in \mathsf {PVAL}(t,\mathbf {j}',\mathbf {v}')$. This leads us to wonder whether $f'$ could actually be $2\delta $ far from $\mathsf {PVAL}(t,\mathbf {j}',\mathbf {v}')$ even when the error patterns have a large intersection (rather than just $\delta $ far as in the analysis above).

Note that if it is indeed the case that $f'$ is $2\delta $ far from the corresponding $\mathsf {PVAL}$ instance then we have improved two parameters: both the distance and the input size, while only paying in the query complexity. If we continue the recursion now (stopping when the input size is of size roughly t) we obtain an $\mathsf {IPP}$ for $\mathsf {PVAL}$ with poly-logarithmic overhead, as we desired.

Unfortunately, the above analysis was centered on the assumption that $I_0$ and $I_1$ have a small intersection, which we cannot justify. As a matter of fact, for all we know, the two sets could very well be identical. In such a case, the distance of $f'$ from $\mathsf {PVAL}$ will indeed be (roughly) $\delta $ and we are back to square one.

See Fig. 2 for an illustration for the possible “error patterns” of $f_0$ and $f_1$ and how they affect the “error patterns” of $f'$.

We pause here for a detour, recalling the approach of [RVW13] (this is not essential for understanding our construction and can be skipped). They observe that if $\delta _0$ and $\delta _1$ are roughly equal, then the verifier can simply recurse on one of them. This roughly maintains the distance, while avoiding doubling the query complexity. On the other hand, if say $\delta _0 \gg \delta _1$, they show that the random linear combination technique described above does increase the distance (intuitively, the row with smaller distance cannot “cancel out” the error pattern of the row with larger distance). We remark that they lose a constant multiplicative factor in this argument, which leads them to consider a decomposition into $\mathsf{polylog}(n)$ many rows, rather than 2. Of course, the verifier does not know whether $\delta _0 \approx \delta _1$ or $\delta _0 \gg \delta _1$. However, they show that the verifier can “cover its bases” by considering a small number of (approximations to the) decompositions of the distance across rows. This results in the creation of $O(\log \log n)$ smaller recursive instances, where the product of the new distance and the new effective query complexity of at least one of these instances is “good” (the definition of “good” allows for losing super-constant multiplicative factors). Over the course of $\varOmega (\log n{/}\log \log n)$ recursive steps, the losses and the ballooning number of recursive instances add up, and result in a roughly $2^{\log n{/}\log \log n} = n^{o(1)}$ overhead in the product between the final query and communication complexities.

Reducing the Intersection Size. A key new ingredient in our protocol is randomly permuting the truth tables of the functions $f_0$ and $f_1$, in order to make the sets $I_0$ and $I_1$ (pseudo-)random, and therefore likely to have a small intersection. This is inspired by the beautiful recent result of Ben Sasson, Kopparty and Saraf [BKS18] on amplifying distances from the Reed-Solomon code. More precisely, the verifier chooses random permutations $\pi _0,\pi _1 : \{0,1\}^m \rightarrow \{0,1\}^m$ (from a suitable family of permutations, to be discussed below). We consider the new functions $f_0 \circ \pi _0$ and $f_1 \circ \pi _1$. The hope is that the entropy induced by these permutations will make the error patterns in $f_0 \circ \pi _0$ and in $f_1 \circ \pi _1$ have a small intersection. Then, rather than recursing on $c_0 \cdot f_0 + c_1 \cdot f_1$, we will aim to recurse on $f' = c_0 \cdot (f_0 \circ \pi _0) + c_1 \cdot (f_1 \circ \pi _1)$.

To make this approach work we have to overcome several difficulties. First, we need to ensure that we can translate the claims that we have about $\hat{f}_0$ and $\hat{f}_1$ into claims about $\widehat{f_0 \circ \pi _0}$ and $\widehat{f_1 \circ \pi _1}$. We do so by choosing $\pi _0$ and $\pi _1$ as random affine maps over $\mathbb {F}^m$, while ensuring that the restriction of these maps to $\{0,1\}^m$ forms a permutation. We argue that this ensures that:

$$\begin{aligned} \widehat{f_0 \circ \pi _0} \equiv \hat{f}_0 \circ \pi _0, \end{aligned}$$

(1)

and similarly for $f_1$. To see that Eq. (1) holds, observe that if $\pi _0$ is an affine function, then both sides of the equation are multilinear polynomials that agree on $\{0,1\}^m$. Therefore they must also agree on $\mathbb {F}^m$.

Equation (1) implies that the claims that we have about the multi-linear extensions of $f_0 \circ \pi _0$ and $f_1 \circ \pi _1$ are simply permutations of the claims about $f_0$ and $f_1$, respectively.

A second difficulty that arises at this point is that the claims that we have about $f_0 \circ \pi _0$ and $f_1 \circ \pi _1$ are not “aligned”. The former claims are about positions $\pi _0^{-1}(\mathbf {j}')$ and the latter about $\pi _1^{-1}(\mathbf {j}')$ (in the multi-linear extensions of $f_0 \circ \pi _0$ and $f_1 \circ \pi _1$, respectively). Since the claims are not aligned, it unclear how to combine them to get t claims about the input $f'$.

As our first step toward resolving this difficulty, we have the prover “complete the picture” by providing the verifier also with the (alleged) values of $\hat{f}_0 \circ \pi _0$ at positions $\pi _1^{-1}(\mathbf {j}')$ and those of $\hat{f}_1 \circ \pi _1$ at positions $\pi _0^{-1}(\mathbf {j}')$.

Note that the prover can cheat to its heart’s desire about these claims, but the point is that we now have a single set $I = \pi _0^{-1}(\mathbf {j}') \cup \pi _1^{-1}(\mathbf {j}')$ so that each function $f_b$ is still $\delta _b$ far from the claims that we have about $f_b |_I$. Since the claims are now properly aligned, we can derive a new sequence of claims about $f'$. More importantly, we prove a technical lemma (building on the result of Ben Sasson et al. [BKS18]), showing that if f is $\delta $-far from $\mathsf {PVAL}(t,\mathbf {j},\mathbf {v})$ then, with high probability, $f'$ is roughly $2\delta $-far from the corresponding $\mathsf {PVAL}$ instance (induced by the prover’s new claims).

To summarize, the approach so far lets us double the distance in each iteration as we desired. Unfortunately, it also raises a new problem: the arity of the new $\mathsf {PVAL}$ instance that we generated has doubled - rather than just having t claims we now have roughly 2t claims (corresponding to the size of the set I). See Fig. 3 for an illustration.

Arity Reduction Step. We resolve this final difficulty by once more employing interaction, and using the prover in order to reduce the 2t claims that we have about $f_0 \circ \pi _0$ and $f_1 \circ \pi _1$ to just t (aligned) claims each, while preserving the distance.

The idea here is to consider a degree O(t) curve $\mathcal {C} : \mathbb {F}\rightarrow \mathbb {F}^{m}$ passing through the set of points I. The prover sends to the verifier the values of $\hat{f}_0 \circ \pi _0 |_{\mathcal C}$ and $\hat{f}_1 \circ \pi _1 |_{\mathcal C}$. The verifier checks that that the provided values lie on a degree O(t) univariate polynomial (since $\hat{f}_b \circ \pi _b \circ \mathcal {C}$ has low degree, for both $b \in \{0,1\}$). The verifier also checks that the values that correspond to points in the set I, are consistent with the claims that it has. The verifier now chooses a set of t random points $\rho = (\rho _1,\dots ,\rho _t)$ on the curve. The new claims about $\hat{f}_0$ and $\hat{f}_1$ are those that correspond to the set of points in $\rho $. In particular, this lets us reduce the number of claims from 2t to t.

We want to argue that this arity-reduction sub-protocol preserves the distance. This is accomplished by taking a union bound over all inputs that are (roughly) $2\delta $-close to $f'$, and showing that for each of them, the probability that it satisfies the new claim is tiny. We conclude that $f'$ is indeed (roughly) $2\delta $-far from the resulting $\mathsf {PVAL}$ instance (a similar idea was used in the proof that $\mathsf {PVAL}$ is complete [RVW13]).

3 Preliminaries

For a string $x \in \varSigma ^n$ and an index $i \in [n]$, we denote by $x_i \in \varSigma $ the $i^\mathrm{th}$ entry in x. If $I \subseteq [n]$ is a set then we denote by $x| _I$ the sequence of entries in x corresponding to coordinates in I.

Let $x,y \in \varSigma ^n$ be two strings of length $n \in \mathbb {N}$ over a (finite) alphabet $\varSigma $. We define the (relative Hamming) distance of x and y as $\varDelta \left( x,y \right) {\mathop {=}\limits ^\mathrm{def}}\left| \left\{ x_i \ne y_i \;:\; i \in [n] \right\} \right| / n$. If $\varDelta \left( x,y \right) \le \varepsilon $, then we say that x is $\varepsilon $-close to y, and otherwise we say that x is $\varepsilon $-far from y. We define the distance of x from a (non-empty) set $S \subseteq \varSigma ^n$ as $\varDelta \left( x,S \right) {\mathop {=}\limits ^\mathrm{def}}\min _{y \in S} \varDelta \left( x,y \right) $. If $\varDelta \left( x,S \right) \le \varepsilon $, then we say that x is $\varepsilon $-close to S and otherwise we say that x is $\varepsilon $-far from S. We extend these definitions from strings to functions by identifying a function with its truth table. For a set S, take its minimum distance to be the minimum, over all distinct vectors $x,y \in S$ of $\varDelta (x,y)$. We use $\varDelta (S)$ to denote the minimum distance of S. Fixing a vector space, for a set S and a vector x, we denote $(x + S) = \{x + y : y \in S\}$. For a scalar c, we denote $(c \cdot S) = \{c \cdot y : y \in S\}$.

3.1 Multivariate Polynomials and Low Degree Extensions

We recall some important facts on multivariate polynomials (see [Sud95] for a far more detailed introduction). A basic fact, captured by the Schwartz-Zippel lemma is that low degree polynomials cannot have too many roots.

Lemma 3.1

(Schwartz-Zippel Lemma). Let $P : \mathbb {F}^m \rightarrow \mathbb {F}$ be a non-zero polynomial of total degree d. Then,

$$\begin{aligned} \Pr _{x \in \mathbb {F}^m} \big [ P(x) = 0 \big ] \le \frac{d}{|\mathbb {F}|}. \end{aligned}$$

An immediate corollary of the Schwartz-Zippel Lemma is that two distinct polynomials $P,Q : \mathbb {F}^m \rightarrow \mathbb {F}$ of total degree d may agree on at most a $\frac{d}{|\mathbb {F}|}$-fraction of their domain $\mathbb {F}^m$.

Throughout this work we consider fields in which operations can be implemented efficiently (i.e., in poly-logarithmic time in the field size). Formally we define such fields as follows.

Definition 3.1

We say that an ensemble of finite fields $\mathbb {F}= (\mathbb {F}_n)_{n \in \mathbb {N}}$ is constructible if elements in $\mathbb {F}_n$ can be represented by $O(\log (|\mathbb {F}_n|))$ bits and field operations (i.e., addition, subtraction, multiplication, inversion and sampling random elements) can all be performed in $\mathsf{polylog}(|\mathbb {F}_n|)$ time given this representation.

A well known fact is that for every $S=S(n)$, there exists a constructible field ensemble of size O(S) and its representation can be found in $\mathsf{polylog}(S)$ time (see, e.g., [Gol08, Appendix G.3] for details).

Let $\mathbb {H}$ be a finite field and $\mathbb {F}\supseteq \mathbb {H}$ be an extension field of $\mathbb {H}$. Fix an integer $m \in \mathbb {N}$. A basic fact is that for every function $\phi :\mathbb {H}^m \rightarrow \mathbb {F}$, there exists a unique extension of $\phi $ into a function $\hat{\phi }:\mathbb {F}^m \rightarrow \mathbb {F}$ (which agrees with $\phi $ on $\mathbb {H}^m$; i.e., $\hat{\phi }|_{\mathbb {H}^m}\equiv \phi $), such that $\hat{\phi }$ is an m-variate polynomial of individual degree at most $|\mathbb {H}|-1$. Moreover, there exists a collection of $|\mathbb {H}|^m$ functions $\{\hat{\tau }_x \}_{x \in \mathbb {H}^m}$ such that each $\hat{\tau }_x : \mathbb {F}^m \rightarrow \mathbb {F}$ is the m-variate polynomial of degree $|\mathbb {H}|-1$ in each variable defined as:

$$\begin{aligned} \hat{\tau }_x (z) {\mathop {=}\limits ^\mathrm{def}}\prod _{i \in [m]} \prod _{h \in \mathbb {H}\backslash \{x_i\}} \frac{z_i-h}{x_i-h}. \end{aligned}$$

and for every function $\phi : \mathbb {H}^m \rightarrow \mathbb {F}$ it holds that

$$ \hat{\phi }(z_1,\ldots ,z_m)=\sum _{x \in \mathbb {H}^m}\hat{\tau }_x(z_1,\ldots ,z_m)\cdot \phi (x). $$

The function $\hat{\phi }$ is called the low degree extension of $\phi $ (with respect to $\mathbb {F}$, $\mathbb {H}$ and m). In the special case in which $\mathbb {H}=\mathbb {GF}(2)$, the function $\hat{\phi }$ (which has individual degree 1) is called the multilinear extension of $\phi $ (with respect to $\mathbb {F}|$ and m).

3.2 A Useful Permutation Family

Let $m \in \mathbb {N}$. For every $a \in \mathbb {GF}(2^m)$, let $f_a:(\mathbb {GF}(2))^m \rightarrow (\mathbb {GF}(2))^m$ be defined as $f_a(x) = a \cdot x$, where we identify elements in $\mathbb {GF}(2^m)$ with vectors in $(\mathbb {GF}(2))^m$ in the natural way. Thus, for every $a \in (\mathbb {GF}(2))^m$, there exists a matrix $M_a \in (\mathbb {GF}(2))^{m \times m}$ such that $f(x) = M_a \cdot x$. Note that if $a \ne 0$ then the matrix $M_a$ is invertible, and its inverse is given by $M_{a^{-1}}$.

Let $\mathbb {F}$ be a finite field that is an extension field of $\mathbb {GF}(2)$. For every $a,b \in (\mathbb {GF}(2))^m$ consider the function $\pi _{a,b}: \mathbb {F}^m \rightarrow \mathbb {F}^m$ defined as: $\pi _{a,b}(x) = M_a \cdot x + b$. Let $\pi _{a,b}|_{(\mathbb {GF}(2))^m}$ denote the restriction of $\pi _{a,b}$ to the domain $(\mathbb {GF}(2))^m$ and let $\varPi _m = \{ \pi _{a,b} \;:\; a,b \in (\mathbb {GF}(2))^m, a \ne 0\}$.

Proposition 3.1

The following holds for every $a,b \in (\mathbb {GF}(2))^m$:

1.
labelitem:permutationspslinear The function $\pi _{a,b}$ is an affine map over $\mathbb {F}$.
2.
If $a \ne 0$ then the function $\pi _{a,b}$ forms a permutation over $\mathbb {F}^m$ and $\pi _{a,b}|_{(\mathbb {GF}(2))^m}$ forms a permutation over $(\mathbb {GF}(2))^m$.
3.
The function family $\{ \pi _{a,b} |_{(\mathbb {GF}(2))^m} \}_{a,b \in (\mathbb {GF}(2))^m}$ is pairwise independent.
4.
If $\mathbb {F}$ and $\mathbb {GF}(2^m)$ are constructible, then given $a,b \in (\mathbb {GF}(2))^m$ and $x \in \mathbb {F}^m$ it is possible to compute $\pi _{a,b}(x)$ in time $\mathsf{poly}(m,\log (|\mathbb {F}|))$.

Proof

Item 1 is evident from the construction. For Item 2, let $a \ne 0$ and take any $x,x' \in \mathbb {F}^m$. Observe that if $M_a \cdot x + b = M_a \cdot x'+b$ then $M_a \cdot (x-x') = \mathbf {0}$. Multiplying both sides on the left by $M_{a^{-1}}$ (a matrix in $\mathbb {GF}(2)^{m \times m} \subseteq \mathbb {F}^{m \times m}$) we get that $x=x'$ . Thus, $\pi _{a,b}$ is a permutation over $\mathbb {F}^m$. Since the image of $\pi _{a,b} |_{(\mathbb {GF}(2))^m}$ lies in $(\mathbb {GF}(2))^m$ this also means that $\pi _{a,b} |_{(\mathbb {GF}(2))^m}$ is a permutation over $(\mathbb {GF}(2))^m$.

For Item 3, let $x_1,x_2,y_1,y_2 \in (\mathbb {GF}(2))^m$ with $x_1 \ne x_2$. Then:

$$\begin{aligned} \Pr _{a,b}[ M_a \cdot x_1+b = y_1 \wedge M_a \cdot x_2 + b = y_2 ]&= \Pr _{a,b}[ a \cdot x_1+b = y_1 \wedge a \cdot x_2 + b = y_2 ] \\&= \Pr _{a,b} \left[ \begin{pmatrix}a\\ b\end{pmatrix} \cdot \begin{pmatrix}x_1 &{}1 \\ x_2 &{}1\end{pmatrix} = \begin{pmatrix}y_1\\ y_2\end{pmatrix} \right] \\&= 2^{-2m}, \end{aligned}$$

where in the first expression the arithmetic is over the field $\mathbb {GF}(2)$ and in the second and third expressions the arithmetic is over $\mathbb {GF}(2^m)$, and the last equality follows from the fact that $\mathsf {det}\begin{pmatrix}x_1 &{}1 \\ x_2 &{}1\end{pmatrix} = x_1-x_2 \ne 0$.

Lastly, for Item 4, observe that $M_a$ can be generated in $\mathsf{poly}(m)$ time by taking the product of a with a basis of $(\mathbb {GF}(2))^m$. Given the full description of $M_a$, the product $M_a \cdot x + b$ can be computed in $\mathsf{poly}(m,\log (|\mathbb {F}|))$ time.

Proposition 3.2

Let $\phi :(\mathbb {GF}(2))^m \rightarrow \mathbb {F}$ and let $\hat{\phi }:\mathbb {F}^m \rightarrow \mathbb {F}$ be its multilinear extension. Let $\psi = \phi \circ (\pi _{a,b}|_{(\mathbb {GF}(2))^m})$ (a function over $(\mathbb {GF}(2))^m$), and let $\hat{\psi }$ be the multilinear extension of $\psi $. Then:

$$\begin{aligned} \forall x \in \mathbb {F}^m,\; (\hat{\phi } \circ \pi _{a,b})(x) = \hat{\psi }(x). \end{aligned}$$

Proof

By Proposition 3.1, the function $\pi _{a,b}$ is an affine map over $\mathbb {F}$. Thus, $(\hat{\phi } \circ \pi _{a,b})$ is multilinear. By definition, $\hat{\psi }$ is also multilinear (since it is a low degree extension). We have that $\hat{\psi }$ and $\hat{\phi } \circ \pi _{a,b}$ are both multilinear, and they agree over $(\mathbb {GF}(2))^m$. By uniqueness of the multilinear extension, they must also agree over $\mathbb {F}^m$.

3.3 Succinct Descriptions

Throughout this work we use $\mathsf {NC}^1$ to refer to the class of logspace uniform Boolean circuits of logarithmic depth and constant fan-in. Namely, $\mathcal {L}\in \mathsf {NC}^1$ if there exists a logspace Turing machine M that on input $1^n$ outputs a full description of a logarithmic depth circuit $C:\{0,1\}^n \rightarrow \{0,1\}$ such that for every $x \in \{0,1\}^n$ it holds that $C(x)=1$ if and only if $x \in \mathcal {L}$.

We next define a notion of succinct representation of circuits. Loosely speaking, a function $f:\{0,1\}^n \rightarrow \{0,1\}$ has a succinct representation if there is a short string $\left\langle f \right\rangle $, of poly-logarithmic length, that describes f. That is, $\left\langle f \right\rangle $ can be expanded to a full description of f. The actual technical definition is slightly more involved and in particular requires that the full description of f be an $\mathsf {NC}_1$ (i.e., logarithmic depth) circuit:

Definition 3.2

(Succinct Description of Functions). We say that a function $f:\{0,1\}^n \rightarrow \{0,1\}$ of size s has a succinct description if there exists a string $\left\langle f \right\rangle $ of length $\mathsf{polylog}(n)$ and a logspace Turing machine M (of constant size, independent of n) such that on input $1^n$, the machine M outputs a full description of an $\mathsf {NC}^1$ circuit C such that for every $x \in \{0,1\}^n$ it holds that $C(\left\langle f \right\rangle ,x)=f(x)$. We refer to $\left\langle f \right\rangle $ as the succinct description of f.

We also define succinct representation for sets $S \subseteq [k]$. Roughly speaking this means that the set can be described by a string of length $\mathsf{polylog}(k)$. The formal definition is somewhat more involved:

Definition 3.3

(Succinct Description of Sets). We say that a set $S \subseteq [k]$ of size s has a succinct description if there exists a string $\left\langle S \right\rangle $ of length $\mathsf{polylog}(k)$ and a logspace Turing machine M such that on input $1^k$, the machine M outputs a full description of a depth $\mathsf{polylog}(k)$ and size $\mathsf{poly}(s,\log {k})$ circuit (of constant fan-in) that on input $\left\langle S \right\rangle $ outputs all the elements of S as a list (of length $s \cdot \log (k)$).

We emphasize that the size of the circuit that M outputs is proportional to the actual size of the set S, rather than the universe size k.

3.4 Interactive Proofs of Proximity

Loosely speaking, $\mathsf {IPP}$s are interactive proofs in which the verifier runs in sub-linear time in the input length, where the soundness requirement is relaxed to rejecting inputs that are far from the language w.h.p. (for inputs that are not in the language, but are close to it, no requirement is made). Actually, we will think of the input of the verifier as being composed of two parts: an explicit input $x \in \{0,1\}^n$ to which the verifier has direct access, and an implicit (longer) input $y \in \{0,1\}^m$ to which the verifier has oracle access. The goal is for the verifier to run in time that is sub-linear in m and to verify that y is far from any $y'$ such that the pair $(x,y')$ are in the language. Since such languages are composed of input pairs, we refer to them as pair languages.

Definition 3.4

(Interactive Proof of Proximity ($\mathsf {IPP}$) [EKR04, RVW13]). An interactive proof of proximity ($\mathsf {IPP}$) for the pair language $\mathcal {L}$ is an interactive protocol with two parties: a (computationally unbounded) prover $\mathcal {P}$ and a computationally bounded verifier $\mathcal {V}$. Both parties get as input $x \in \{0,1\}^n$ and a proximity parameter $\varepsilon >0$. The verifier also gets oracle access to $y \in \{0,1\}^m$ whereas the prover has full access to y. At the end of the interaction, the following two conditions are satisfied:

1.
Completeness: For every pair $(x,y) \in \mathcal {L}$, and proximity parameter $\varepsilon >0$ it holds that
$$\begin{aligned} \Pr \Big [ \big ( \mathcal {P}(y),\mathcal {V}^y \big )(x,|y|,\varepsilon ) = 1 \Big ] = 1. \end{aligned}$$
2.
Soundness: For every $\varepsilon > 0$, $x \in \{0,1\}^n$ and y that is $\varepsilon $-far from the set $\{y' : (x,y') \in \mathcal {L}\}$, and for every computationally unbounded (cheating) prover $\mathcal {P}^*$ it holds that
$$\begin{aligned} \Pr \Big [ \big ( \mathcal {P}^*(y),\mathcal {V}^y \big )(x,|y|,\varepsilon ) = 1 \Big ] \le 1/2. \end{aligned}$$

An $\mathsf {IPP}$ for $\mathcal {L}$ is said to have query complexity $q= q(n,m,\varepsilon )$ if, for every $\varepsilon >0$ and $(x,y) \in \mathcal {L}$, the verifier $\mathcal {V}$ makes at most $q(|x|,|y|,\varepsilon )$ queries to y when interacting with $\mathcal {P}$. The $\mathsf {IPP}$ is said to have communication complexity $\mathsf {cc}=\mathsf {cc}(n,m,\varepsilon )$ if, for every $\varepsilon >0$ and pair $(x,y) \in \mathcal {L}$, the communication between $\mathcal {V}$ and $\mathcal {P}$ consists of at most $\mathsf {cc}(|x|,|y|,\varepsilon )$ bits. If the honest prover’s running time is polynomial in n and m, then we way that the $\mathsf {IPP}$ is doubly-efficient.

The special case of $\mathsf {IPP}$s in which the entire interaction consists of a single message sent from the prover to the verifier is called $\mathsf {MAP}$s (in analogy to the complexity class $\mathsf {MA}$) and was studied in [GR17, GGR15]. We will use the following simple observation:

Proposition 3.3

(See, e.g., [GR17]). Every $\mathcal {L}\in \mathsf {DTIME}(t)$ has an $\mathsf {MAP}$ with respect to proximity parameter $\delta \in (0,1)$ with communication complexity n and query complexity $O(1/\delta )$. The verifier runs in time $t + n + O(\log (n)/\varepsilon )$. The prover runs in time O(n)

Proof

(Proof Sketch). The prover sends to the verifier a full description of the input x (i.e., an n bit string). Given the message $x'$ received from the prover (allegedly equal to the input x), the verifier first checks that $x' \in \mathcal {L}$ (this step requires no queries to x). The verifier further checks that x and $x'$ agree on a random set of $O(1/\delta )$ coordinates.

Completeness is immediate, whereas to see that soundness holds, observe that the prover must send $x' \in \mathcal {L}$, since otherwise the verifier rejects. If x is $\delta $-far from $\mathcal {L}$ then x and $x'$ disagree on a at least $\delta $ fraction of their coordinates and so the verifier accepts with probability at most $(1-\delta )^{O(1/\delta )} = 1/2$.

4 Our Results

Our first main result is an $\mathsf {IPP}$ for any language in $\mathsf {NC}$ with optimal query/communication tradeoff (up to poly-logarithmic factors).

Theorem 4.1

Let $\delta = \delta (n) \in (0,1)$ be a proximity parameter and let $\mathcal {L}$ be a pair language that is computable by logspace-uniform Boolean circuits of depth $D = D(n) \ge \log n$ and size $S = S(n) \ge n$ with fan-in 2 (where n denotes the implicit input and $n_{ exp }$ denotes the explicit input). Then, $\mathcal {L}$ has a public-coin $\mathsf {IPP}$ for $\delta $-proximity with perfect completeness and the following parameters:

Soundness Error: 1/2.
Query complexity: $q = O( 1/\delta )$.
Communication Complexity: $cc = \delta \cdot n \cdot D \cdot \mathsf{poly}\log (S)$.
Round Complexity: $D \cdot \mathsf{polylog}(S)$.
Verifier Running Time: $\delta \cdot n \cdot n_{ exp }\mathsf{poly}(D,\log (S)) + (1/\delta ) \cdot \mathsf{polylog}(n)$.
Prover Running Time: $\mathsf{poly}(S)$.

Furthermore, the verification procedure can be described succinctly as follows. At the end of the interaction either the verifier rejects or in time $\delta \cdot n \cdot \mathsf{poly}(D,\log (S))$ it outputs a succinct description $\left\langle Q \right\rangle $ of a set $Q \subseteq [n]$ of size q and a succinct description $\left\langle \phi \right\rangle $ of a predicate $\phi : \{0,1\}^{q} \rightarrow \{0,1\}$ so that its decision predicate given an input function f is equal to $\phi (f|Q)$.

Our second main result (which relies on Theorem 4.1) is an interactive proof for batch verification of any $\mathsf {UP}$ language, with communication complexity that is optimal up to poly-logarithmic factors.

Theorem 4.2

For every $\mathsf {UP}$ language $\mathcal {L}$ with witness length $m = m(n)$, whose witness relation can be computed in logspace-uniform $\mathsf {NC}$, there exists a public-coin interactive proof (with perfect completeness) for verifying that k instances $x_1,\ldots ,x_k$, each of length $n \le \mathsf{poly}(m)$, are all in $\mathcal {L}$. The complexity of the protocol is as follows:

Communication complexity: $m \cdot \mathsf{polylog}(k,m)$.
Number of rounds: $\mathsf{polylog}(k,m)$.
Verifier runtime: $(n \cdot k + m) \cdot \mathsf{polylog}(k,m)$.
The honest prover, given the k unique witnesses, runs in time $\mathsf{poly}(m,k)$.

Using the Cook-Levin reduction, any $\mathsf {UP}$ language can be reduced to $\mathsf {Unique\text {-}SAT}$ which is a $\mathsf {UP}$ language whose witness relation can be computed in logspace-uniform $\mathsf {NC}$, with only a $\mathsf{poly}(n,m)$ blowup to the witness size. Hence, Theorem 4.2 yields the following corollary.

Corollary 4.1

For every $\mathsf {UP}$ language $\mathcal {L}$ with witness length $m = m(n)$, there exists a public-coin interactive proof (with perfect completeness) for verifying that k instances $x_1,\ldots ,x_k$, each of length $n \le \mathsf{poly}(m)$, are all in $\mathcal {L}$. The complexity of the protocol is as follows:

Communication complexity: $\mathsf{poly}(m,\log (k))$.
Number of rounds: $\mathsf{polylog}(m,k)$.
Verifier runtime: $(n \cdot k) \cdot \mathsf{polylog}(m,k) + \mathsf{poly}(m,\log {k})$.
The honest prover, given the k unique witnesses, runs in time $\mathsf{poly}(m,k)$.

5 The $\mathsf {PVAL}$ Problem

In this section we define the $\mathsf {PVAL}$ problem and state properties related to it that we will need in our proof. Due to lack of space, all proofs in this section are deferred to the full version.

Let $\mathbb {F}$ be a finite field, $\mathbb {H}\subseteq \mathbb {F}$ and $m \in \mathbb {N}$ be an integer.

Definition 5.1

The $\mathsf {PVAL}$ problem is parameterized by an ensemble $(\mathbb {F},\mathbb {H},m)_n$. The explicit input to the problem is $(n,t,\mathbf {j},\mathbf {v})$, where $t \in \mathbb {N}$, $\mathbf {j}=(j_1,\dots ,j_t) \in (\mathbb {F}^m)^t$ and $\mathbf {v} = v_1,\dots ,v_t \in \mathbb {F}^t$. The implicit input is a function $f : \mathbb {H}^m \rightarrow \mathbb {F}$. YES instances of the problems are all functions $f : \mathbb {H}^m \rightarrow \mathbb {F}$ such that for every $i \in [t]$ it holds that $\hat{f}(j_i)=v_i$, where $\hat{f}$ is the low degree extension of f.

Since the low-degree extension is an error correcting code with high distance, for sufficiently large randomly chosen location sets $\mathbf {j}$, the induced $\mathsf {PVAL}$ problem has large minimum distance:

Proposition 5.1

($\mathsf {PVAL}$ on random locations has large distance). Let $\mathbb {H}\subseteq \mathbb {F}$ be finite fields, and let m, d be integers s.t. $|\mathbb {F}| \ge 2m|\mathbb {H}|$. For $t \ge (d \cdot \log (|\mathbb {H}|^m \cdot |\mathbb {F}|) + \kappa )$ it is the case that:

$$\begin{aligned} \Pr _{\mathbf {j} \in (\mathbb {F}^m)^t} \left[ \varDelta (\mathsf {PVAL}(t,\mathbf {j},\mathbf {0})) \le \frac{d}{|\mathbb {H}|^m} \right] < 2^{-\kappa }. \end{aligned}$$

The following key lemma builds on the distance amplification theorem for Reed Solomon codes of Ben-Sasson, Kopparty and Saraf [BKS18].

Lemma 5.1

Fix a finite field $\mathbb {F}$ of characteristic 2 and integers $m,t > 0$. For $\mathbf {j} \in \left( \mathbb {F}^m\right) ^t$, suppose that $\mathsf {PVAL}(t,\mathbf {j},\mathbf {0})$ has size strictly larger than 1 and minimal distance $\lambda $. Let $\mathbf {v}',\mathbf {v}'' \in \mathbb {F}^t$ be vectors s.t. both $\mathsf {PVAL}(t,\mathbf {j},\mathbf {v}')$ and $\mathsf {PVAL}(t,\mathbf {j},\mathbf {v}'')$ are non-empty. Let $f': \{0,1\}^m \rightarrow \mathbb {F}$ be at distance $\delta '$ from $\mathsf {PVAL}(t,\mathbf {j},\mathbf {v}')$, and let $f'': \{0,1\}^m \rightarrow \mathbb {F}$ be at distance $\delta ''$ from $\mathsf {PVAL}(t,\mathbf {j},\mathbf {v}'')$. Consider permutations $\sigma ,\pi \in \varPi $, where $\varPi $ is the useful collection of permutations over $\mathbb {F}^{m}$ defined in Sect. 3.2. For scalars $c',c'' \in \mathbb {F}$, define:

$$\begin{aligned} f \triangleq c' \cdot (f' \circ \sigma ) + c'' \cdot (f'' \circ \pi ), \end{aligned}$$

and let $S_{\sigma ,\pi } \subseteq \mathbb {F}^{2t}$ be the set of pairs of vectors $(\mathbf {u},\mathbf {w})$ s.t. the sets $\mathsf {PVAL}\left( 2t, (\sigma ^{-1}(\mathbf {j}), \pi ^{-1}(\mathbf {j}) ) , ( \mathbf {v}' , \mathbf {u} ) \right) $ and $\mathsf {PVAL}\left( 2t, (\sigma ^{-1}(\mathbf {j}), \pi ^{-1}(\mathbf {j}) ) , (\mathbf {w} , \mathbf {v}'' ) \right) $ are non-empty. For $(\mathbf {u},\mathbf {w}) \in S_{\sigma ,\pi }$, define:

$$\begin{aligned} \delta _{\sigma ,\pi ,c',c'',\mathbf {u},\mathbf {w}} = \varDelta \left( f, \mathsf {PVAL}\left( 2t, (\sigma ^{-1}(\mathbf {j}), \pi ^{-1}(\mathbf {j}) ) , ( c' \cdot ( \mathbf {v}' , \mathbf {u} ) + c'' \cdot (\mathbf {w} , \mathbf {v}'' ) ) \right) \right) . \end{aligned}$$

Then for every $\varepsilon \in [0,1/2]$, taking

$$\begin{aligned} \delta = \max \left( \frac{\delta ' + \delta ''}{2} , \min (\delta ' + \delta '' - \delta ' \delta '' - 2\varepsilon , \lambda /3 - 3\varepsilon ) \right) , \end{aligned}$$

it is the case that:

$$\begin{aligned} \Pr _{\sigma ,\pi \leftarrow \varPi } \left[ \exists (\mathbf {u},\mathbf {w}) \in S_{\sigma ,\pi } \text{ s.t. } \Pr _{c',c'' \leftarrow \mathbb {F}} \left[ \delta _{\sigma ,\pi ,c',c'',\mathbf {u},\mathbf {w}}< \delta \right] > \frac{1}{\varepsilon |\mathbb {F}|} + \frac{1}{|\mathbb {F}|} \right] < \frac{\min (\delta ',\delta '')}{\varepsilon ^2 \cdot 2^m} + \frac{2}{2^m} \end{aligned}$$

(2)

5.1 Interactive Proof for $\mathsf {PVAL}$ Emptiness

Our $\mathsf {PVAL}$ $\mathsf {IPP}$ will also utilize the following (standard) interactive proof for checking whether a given $\mathsf {PVAL}$ instance (specified by the vector sequence $\mathbf {j}$) is empty.

Lemma 5.2

Let $t,m \in \mathbb {N}$ and $\mathbb {F}$ a finite field. There is a public-coin interactive proof for the language $\mathcal {L}= \big \{ \mathbf {j} \in (\mathbb {F}^m)^t : \mathsf {PVAL}(t,\mathbf {j},\mathbf {0}) \ne \emptyset \big \}$ with perfect completeness and the following parameters:

Communication complexity: $\mathsf{poly}(m,\log (t))$.
Round Complexity: $\mathsf{poly}(m,\log (t))$.
Verifier running time: $t \cdot \mathsf{poly}(m,log(|\mathbb {F}|))$.
Prover running time: $\mathsf{poly}(2^m,t)$.

6 Efficient $\mathsf {IPP}$ for $\mathsf {PVAL}$

In this section we show our efficient $\mathsf {IPP}$ protocol for the $\mathsf {PVAL}$ problem.

Theorem 6.1

($\mathsf {IPP}$ for $\mathsf {PVAL}$). Let $t,m \in \mathbb {N}$ such that $m \in [\log (t),\frac{t^{1/5}}{14}]$. Let $\mathbb {F}$ be a constructible finite field ensemble of characteristic 2 such that $|\mathbb {F}| = \varTheta \left( 2^{m} \cdot t^2 \cdot m^2 \right) $. Let $\mathbf {j} = (\mathbf {j}_1,\dots ,\mathbf {j}_t) \in (\mathbb {F}^m)^t$ and $\mathbf {v} = (v_1,\dots ,v_t) \in \mathbb {F}^t$ such that $\varDelta ( \mathsf {PVAL}(t,\mathbf {j},\mathbf {0})) \ge (t/2^m) \cdot \frac{1}{14 m^2}$.

Then, for every proximity parameter $\delta \ge \frac{200 m^3}{2^m}$ the set $\mathsf {PVAL}(t,\mathbf {j},\mathbf {v})$ has a public-coin $\mathsf {IPP}$ with respect to proximity parameter $\delta $, with perfect completeness and the following parameters:

Soundness Error: 1/2.
Query complexity: $q = O \Big ( \max \big (1/\delta , \frac{2^m}{t} \cdot \mathsf{poly}(m) \big ) \Big )$.
Round complexity: $\mathsf{poly}(m)$.
Communication Complexity: $cc = t \cdot \mathsf{poly}(m)$.
Verifier Running Time: $(t+q) \cdot \mathsf{poly}(m)$.
Prover Running Time: $\mathsf{poly}(2^m)$.

Furthermore, if $\delta > (t/2^{m}) \cdot \frac{1}{\mathsf{poly}(m)}$ then, the entire verification procedure can be described succinctly as follows. At the end of the interaction either the verifier rejects or in time $\mathsf{poly}(m)$ it outputs a succinct description $\left\langle Q \right\rangle $ of a set $Q \subseteq [|2^m|]$ of size q and a succinct description $\left\langle \phi \right\rangle $ of a predicate $\phi : \{0,1\}^{q} \rightarrow \{0,1\}$ so that its decision predicate given an input function f is equal to $\phi (f|Q)$.

The rest of this section is devoted to the proof of Theorem 6.1.

The $\mathsf {IPP}$ protocol for $\mathsf {PVAL}$ is recursive. In each step we reduce the dimension m by 1 (which shrinks the input size by half), while simultaneously (roughly) doubling the distance of the problem from the relevant $\mathsf {PVAL}$ instance but also doubling the query complexity.

We denote the starting dimension by $m_0$ whereas the current dimension (within the recursion) is denoted by m (initially we set $m=m_0$. With that notation, the efficient $\mathsf {IPP}$ protocol for $\mathsf {PVAL}$ is presented in Fig. 4. Its completeness, soundness and complexity are analyzed in the subsequent subsections.

6.1 Completeness

We prove that completeness holds by induction on m. The base case (i.e., $m \le \log (t)$) follows from Step 1 in the protocol (while relying on Proposition 3.3). We proceed to analyze the case $m>\log (t)$ (under the inductive hypothesis that the protocol is complete for dimension $m-1$).

Let $\mathbf {j} = (\mathbf {j}_1,\dots ,\mathbf {j}_t) \in (\mathbb {F}^m)^t$ and $\mathbf {v}=(v_1,\dots ,v_t) \in \mathbb {F}^t$. Suppose that $f \in \mathsf {PVAL}(t,\mathbf {j},\mathbf {v})$. As in the protocol, for every $i \in [t]$, decompose $\mathbf {j}_i$ into $\mathbf {j}_i= \left( \chi _i,\mathbf {j}'_i \right) \in \mathbb {F}\times \mathbb {F}^{m-1}$. Let $\mathbf {j}' {\mathop {=}\limits ^\mathrm{def}}\left( \mathbf {j}'_1,\dots ,\mathbf {j}'_t \right) \in (\mathbb {F}^{m-1})^t$.

We show that all the checks made by the verifier in the protocol pass (when interacting with the honest prover):

1.
In Step 5, for every $i \in [t]$:
$$\begin{aligned} (1-\chi _i) \cdot \zeta ^{(b)}_{i} + \chi _i \cdot \zeta ^{(1)}_{i} = (1-\chi _i) \cdot \hat{f}(0,\mathbf {j}'_i) + \chi _i \cdot \hat{f}(1,\mathbf {j}'_i) = \hat{f}(\mathbf {j}_i) = v_i, \end{aligned}$$
as required.
2.
In Step 9, for every $i \in [t]$ and $b \in \{0,1\}$:
$$\begin{aligned} g^{(b)} \left( \lambda ^{(b)}_i \right) = \hat{f} \left( b, \pi ^{(b)} \left( \mathcal {C}\left( \lambda ^{(b)}_i \right) \right) \right) = \hat{f} \left( b, \pi ^{(b)} \left( (\pi ^{(b)})^{-1} \left( \mathbf {j}'_i \right) \right) \right) = \hat{f}(b,\mathbf {j}'_i) = \zeta ^{(b)}_i, \end{aligned}$$
as required.
3.
For Step 11, for every $i \in [t]$, it holds that:
$$\begin{aligned} \hat{f'}(\mathcal {C}(\xi _i)) = c^{(0)} \cdot \hat{f} \big ( 0, \pi ^{(0)} \cdot \mathcal {C}(\xi _i) \big ) + c^{(1)} \cdot \hat{f} \big ( 1, \pi ^{(1)}(\mathcal {C}( \xi _i) ) \big ) = c^{(0)} \cdot g^{(0)}( \xi _i) + c^{(1)} \cdot g^{(1)} ( \xi _i), \end{aligned}$$
where the first equality follows from Proposition 3.2.

Since all the verifier’s checks pass, it accepts, and completeness follows.

6.2 Soundness

We prove, by induction on m, that the soundness error of the protocol is at most $\frac{m}{10 m_0} + \frac{1}{100}$. The base case (i.e., $m\le \log (t)$) is immediate from Step 1 (while relying on Proposition 3.3). We proceed to analyze the case that $m>\log (t)$ (under the inductive hypothesis that the protocol has soundness error at most $\frac{m-1}{10 m_0}$ for dimension $m-1$).

Let $\mathbf {j} = (\mathbf {j}_1,\dots ,\mathbf {j}_t) \in (\mathbb {F}^m)^t$ and $\mathbf {v}=(v_1,\dots ,v_t) \in \mathbb {F}^t$. Let $\delta = \varDelta \big ( f, \mathsf {PVAL}(t,\mathbf {j},\mathbf {v}) \big )$. Fix a cheating prover strategy $\tilde{P}$. Assume without loss of generality that $\tilde{P}$ is deterministic (otherwise fix its best choice of randomness).

We start by defining several important values that will be used in the analysis. Let $\left( \tilde{\zeta }^{(b)}_{i} \right) _{b \in \{0,1\}, i \in [t]}$ be the (fixed) values sent by $\tilde{P}$ as its first message (i.e., in Step 4). We may assume that

$$\begin{aligned} (1-\chi _i) \cdot \tilde{\zeta }^{(0)}_{i} + \chi _i \cdot \tilde{\zeta }^{(1)}_{i} = v_i, \end{aligned}$$

(3)

for every $i \in [t]$, since otherwise the verifier rejects in Step 5.

Define $\mathbf {v}^{(b)} {\mathop {=}\limits ^\mathrm{def}}\left( \tilde{\zeta }^{(b)}_1,\dots ,\tilde{\zeta }^{(b)}_t \right) \in \mathbb {F}^t$. Also, for every $i \in [t]$, decompose $\mathbf {j}_i$ into $\mathbf {j}_i= \left( \chi _i,\mathbf {j}'_i \right) \in \mathbb {F}\times \mathbb {F}^{m-1}$. Let $\mathbf {j}' {\mathop {=}\limits ^\mathrm{def}}\left( \mathbf {j}'_1,\dots ,\mathbf {j}'_t \right) \in (\mathbb {F}^{m-1})^t$. Lastly, for every $b \in \{0,1\}$, let $f^{(b)}(\cdot ) {\mathop {=}\limits ^\mathrm{def}}f(b,\cdot )$ and let $\delta ^{(b)} {\mathop {=}\limits ^\mathrm{def}}\varDelta \left( f^{(b)}, \mathsf {PVAL}\left( t,\mathbf {j}',\mathbf {v}^{(b)} \right) \right) $.

Our goal will be to show that the input $f'$ for the recursive step (i.e., Step 11) has distance roughly $2\delta $ from the corresponding $\mathsf {PVAL}$ instance (i.e., the distance doubles). This is done in two steps: showing that $f'$ has distance roughly $\delta ^{(0)}+\delta ^{(1)}$, and that this quantity is lower bounded by $2\delta $. Since it is simpler, we start with the latter step.

Claim 6.2

$$ \delta ^{(0)}+\delta ^{(1)} \ge 2\delta . $$

Proof

For every $b \in \{0,1\}$, let $P^{(b)} : \{0,1\}^{m-1} \rightarrow \mathbb {F}$ such that $P^{(b)} \in \mathsf {PVAL}\left( t,\mathbf {j}',\mathbf {v}^{(b)} \right) $ and $\varDelta \left( P^{(b)},f^{(b)} \right) = \delta ^{(b)}$. Such a $P^{(b)}$ exists as long as $\mathsf {PVAL}\left( t,\mathbf {j}',\mathbf {v}^{(b)} \right) \ne \emptyset $ and note that otherwise $\delta ^{(b)}$ is infinite and the claim clearly holds).

Consider the function $P:\{0,1\}^m \rightarrow \mathbb {F}$ defined as $P(b,\mathbf {x}) = P^{(b)}(\mathbf {x})$. Observe that $\varDelta (P,f)= \frac{\delta ^{(0)}+\delta ^{(1)}}{2}$. On the other hand, for every $i \in [t]$, $b \in \{0,1\}$ and $(j_1,\dots ,j_m) \in \mathbb {F}^{m}$:

$$\begin{aligned} \hat{P}(j_1,\dots ,j_m)&= (1-j_1) \cdot \hat{P}(0,j_2,\dots ,j_m) + j_1 \cdot \hat{P}(1,j_2,\dots ,j_m) \nonumber \\&= (1-j_1) \cdot \hat{P}^{(0)}(j_2,\dots ,j_m) + j_1 \cdot \hat{P}^{(1)}(j_2,\dots ,j_m), \end{aligned}$$

(4)

where both equalities can be verified by observing that they hold for all $(j_1,\dots ,j_m) \in \{0,1\}^m$, and therefore hold also for all $(j_1,\dots ,j_m) \in \mathbb {F}^m$ (since two multilinear polynomials that agree on the Boolean hypercube agree everywhere).

Thus, for every $i \in [t]$, it holds that

$$\begin{aligned} \hat{P}(\mathbf {j}_i)&= (1-\chi _i) \cdot \hat{P}^{(0)}(\mathbf {j}'_i) + \chi _i \cdot \hat{P}^{(1)}(\mathbf {j}'_i) \\&= (1-\chi _i) \cdot \tilde{\zeta }^{(0)}_i + \chi \cdot \tilde{\zeta }^{(1)}_i \\&=v_i, \end{aligned}$$

where the first equality is by Eq. (4), the second equality follows from the fact that $P^{(b)} \in \mathsf {PVAL}\left( t,\mathbf {j}',\mathbf {v}^{(b)} \right) $ and the third equality from Eq. (3).

We conclude that f is $\left( \frac{\delta ^{(0)}+\delta ^{(1)}}{2} \right) $-close to $\mathsf {PVAL}(t,\mathbf {j},\mathbf {v})$ and so $\frac{\delta ^{(0)}+\delta ^{(1)}}{2} \ge \delta $.

Now, let $\pi ^{(0)},\pi ^{(1)} \leftarrow \varPi $ be the permutations sampled randomly by the verifier in Step 6 and let $\rho _1,\dots ,\rho _t \in \mathbb {F}$ be the random values sampled in Step 7. As in the protocol, let $\mathcal {C}: \mathbb {F}\rightarrow \mathbb {F}^{m-1}$ be the unique degree $3t-1$ curve such that $\mathcal {C}(\lambda ^{(b)}_i) = (\pi ^{(b)})^{-1}(\mathbf {j}'_i)$, for every $i \in [t]$ and $b \in \{0,1\}$, and $\mathcal {C}(\lambda ^{(\bot )}_i) = \rho _i$, for every $i \in [t]$. Let $\tilde{g}^{(0)}$ and $\tilde{g}^{(1)}$ be the degree $O(m \cdot t)$ univariate polynomials sent by $\tilde{P}$ in Step 8. Note that $\mathcal {C}$, $\tilde{g}^{(0)}$ and $\tilde{g}^{(1)}$ are all random variables that depend on $\pi ^{(0)},\pi ^{(1)}$ and $\rho _1,\dots ,\rho _t$.

We may assume without loss of generality that for every choice of $\pi ^{(0)},\pi ^{(1)},\rho _1,\dots ,\rho _t$ made by the verifier it holds that

$$\begin{aligned} \forall i \in [t],\; b \in \{0,1\}:\quad \tilde{g}^{(b)}_i(\lambda ^{(b)}_i)=\tilde{\zeta }^{(b)}_i, \end{aligned}$$

(5)

since otherwise the verifier immediately rejects in Step 9. Thus, we can modify the prover $\tilde{P}$ to always send polynomials satisfying Eq. (5) without decreasing $\tilde{P}$’s success probability.

For every $c^{(0)},c^{(1)} \in \mathbb {F}$, define the function $f'_{\pi ^{(0)},\pi ^{(1)},c^{(0)},c^{(1)}} : \{0,1\}^{m-1} \rightarrow \mathbb {F}$ as $f'_{\pi ^{(0)},\pi ^{(1)},c^{(0)},c^{(1)}}(\mathbf {x}) = c^{(0)} \cdot f(0,\pi ^{(0)}(\mathbf {x})) + c^{(1)} \cdot f( 1,\pi ^{(1)}(\mathbf {x}) )$.

Recall that $\delta ^{(b)} = \varDelta \left( f^{(b)}, \mathsf {PVAL}\left( t,\mathbf {j}',\mathbf {v}^{(b)} \right) \right) $. We now invoke Lemma 5.1 on $f^{(0)}$ and $f^{(1)}$ with $\varepsilon {\mathop {=}\limits ^\mathrm{def}}\delta _{\mathsf {avg}}/m_0$, where $\delta _{\mathsf {avg}} = (\delta ^{(0)}+\delta ^{(1)})/2$. We obtain that:

$$\begin{aligned} \Pr _{\pi ^{(0)},\pi ^{(1)} \leftarrow \varPi } \left[ \exists \mathbf {u},\mathbf {w} \in S_{\pi {(0)},\pi ^{(1)}} \text{ s.t. } \Pr _{c^{(0)},c^{(1)} \in \mathbb {F}} \left[ \delta _{\pi ^{(0)},\pi ^{(1)},c^{(0)},c^{(1)},\mathbf {u},\mathbf {w}} < \delta ^* \right] > \frac{1}{\varepsilon |\mathbb {F}|} + \frac{1}{|\mathbb {F}|}\right] \end{aligned}$$

(6)

is less than $\frac{\min (\delta ^{(0)},\delta ^{(1)}) }{\varepsilon ^2 \cdot 2^{m}} + \frac{2}{2^m}$, where $\delta _{\pi ^{(0)},\pi ^{(1)},c^{(0)},c^{(1)},\mathbf {u},\mathbf {w}}$ is defined as the distance of $f'_{\pi ^{(0)},\pi ^{(1)},c^{(0)},c^{(1)}}$ from

$$\begin{aligned} \mathsf {PVAL}\bigg ( 2t, \Big ( (\pi ^{(0)})^{-1}(\mathbf {j}'), (\pi ^{(1)})^{-1}(\mathbf {j}') \Big ) , \Big ( c^{(0)} \cdot \big ( \mathbf {v}^{(0)} , \mathbf {u} \big ) + c^{(1)} \cdot \big ( \mathbf {w} , \mathbf {v}^{(1)} \big ) \Big ) \bigg ), \end{aligned}$$

and

$$\begin{aligned} \delta ^* {\mathop {=}\limits ^\mathrm{def}}\max \left( \delta _{\mathsf {avg}}, \min \big ( \delta ^{(0)} + \delta ^{(1)} - \delta ^{(0)} \cdot \delta ^{(1)} - 2\varepsilon , \lambda /3 - 3\varepsilon \big ) \right) , \end{aligned}$$

(7)

and $S_{\pi ^{(0)},\pi ^{(1)}} \subseteq \mathbb {F}^{2t}$ is the set of pairs of vectors $(\mathbf {u},\mathbf {w})$ such that the sets $\mathsf {PVAL}( 2t, ((\pi ^{(0)})^{-1}(\mathbf {j}'), (\pi ^{(1)})^{-1}(\mathbf {j}') ) , ( \mathbf {v}^{(0)} , \mathbf {u} ) )$ and $\mathsf {PVAL}( 2t, ((\pi ^{(0)})^{-1}(\mathbf {j}'), (\pi ^{(1)})^{-1}(\mathbf {j}') ), (\mathbf {w} , \mathbf {v}^{(1)}))$ are non-empty.

We have that $\frac{\min (\delta ^{(0)},\delta ^{(1)}) }{\varepsilon ^2 \cdot 2^{m}} + \frac{2}{2^m} \le \frac{\delta _\mathsf {avg}}{\varepsilon ^2 \cdot 2^{m}} + \frac{2}{2^m} = \frac{m_0^2}{\delta _{\mathsf {avg}} \cdot 2^{m}} + \frac{2}{2^m} \le \frac{1}{50 m_0}$ and that $\frac{1}{\varepsilon |\mathbb {F}|} + \frac{1}{|\mathbb {F}|} \le \frac{m_0}{\delta _{\mathsf {avg}} \cdot 2^{m_0}} + \frac{1}{2^{m_0}} \le \frac{m_0}{\delta _{\mathsf {avg}} \cdot 2^{m}} + \frac{1}{2^m} \le \frac{1}{100 m_0}$, where for both we used the fact that $\delta _{\mathsf {avg}} \ge \delta \ge \frac{100 m_0^3}{2^{m}}$ (by Claim 6.2 and our invariant on $\delta $) and our setting of $|\mathbb {F}|$ and m. Thus, Eq. (6) implies that:

$$\begin{aligned} \Pr _{\pi ^{(0)},\pi ^{(1)} \leftarrow \varPi } \left[ \exists \mathbf {u},\mathbf {w} \in S_{\pi {(0)},\pi ^{(1)}} \text{ s.t. } \Pr _{c^{(0)},c^{(1)} \in \mathbb {F}} \left[ \delta _{\pi ^{(0)},\pi ^{(1)},c^{(0)},c^{(1)},\mathbf {u},\mathbf {w}} < \delta ^* \right] > \frac{1}{100 m_0} \right] \end{aligned}$$

(8)

is less than $\frac{1}{50 m_0}$. We proceed to show that $\delta ^*$ is lower bounded by (roughly) $2\delta $.

Proposition 6.1

$\delta ^* \ge \min \big ( 2 \delta \cdot (1-\frac{2}{m_0}), \lambda /100 \big )$.

Proof

Recall that $\delta ^* \ge \delta _{\mathsf {avg}}$ and that $\delta ^* \ge \min (\lambda /3 - 3\varepsilon , \delta ^{(0)} + \delta ^{(1)} - \delta ^{(0)} \cdot \delta ^{(1)} - 2\varepsilon )$ (see Eq. (7)). The proof of the proposition is based on a (somewhat tedious) case analysis.

Suppose first that $\delta ^* \ge \delta ^{(0)} + \delta ^{(1)} - \delta ^{(0)} \cdot \delta ^{(1)} - 2\varepsilon $. In this case the proposition follows from the following claim:

Claim 6.3

$\delta ^{(0)} + \delta ^{(1)} - \delta ^{(0)} \cdot \delta ^{(1)} -2\varepsilon \ge 2 \delta \cdot (1-\frac{2}{m_0})$.

Proof

By the AM-GM inequality it holds that:

$$\begin{aligned} \delta ^{(0)} + \delta ^{(1)} - \delta ^{(0)} \cdot \delta ^{(1)} \ge 2 \delta _\mathsf {avg}- \delta _\mathsf {avg}^2. \end{aligned}$$

We consider two cases. Suppose first that $\delta _\mathsf {avg}\le 2/m_0$. Then,

$$\begin{aligned} 2 \delta _\mathsf {avg}- \delta _\mathsf {avg}^2 - 2 \varepsilon = 2 \delta _{\mathsf {avg}} - \delta _\mathsf {avg}^2 - 2 \delta _\mathsf {avg}/m_0 \ge 2\delta \cdot (1-2/m_0), \end{aligned}$$

where the inequality is based on Claim 6.2 and our presumed upper bound on $\delta _{\mathsf {avg}}$. Thus, we may assume that $\delta _\mathsf {avg}\ge 2/m_0$. Then,

$$\begin{aligned} 2 \delta _\mathsf {avg}-\delta _\mathsf {avg}^2 - 2\varepsilon \ge \delta _{\mathsf {avg}}/2 \ge 1/m_0 \ge 2\delta \cdot (1-2/m_0), \end{aligned}$$

where the first inequality holds for sufficiently large $m_0$ and using the fact that $\delta _{\mathsf {avg}} \le 1$ and the last inequality from the fact that $\delta <\frac{1}{100m_0}$.

Thus, we may assume that $\delta ^* \ge \lambda /3-3\varepsilon $.

Suppose now that $\varepsilon < \lambda /300$. Then, we have that $\delta \ge \lambda /3-3\varepsilon \ge \lambda /100$ and we are done. Thus, we may assume that $\lambda /300 \le \varepsilon = \delta _{\mathsf {avg}}/m_0$. On the other hand, we have that $\delta ^* \ge \delta _{\mathsf {avg}} \ge \frac{m_0}{300} \cdot \lambda \ge \lambda /100$, for sufficiently large $m_0$.

This concludes the proof of Proposition 6.1.

Fix $\pi ^{(0)}$ and $\pi ^{(1)}$ such that the event specified in Eq. (8) does not hold. That is, for every $\mathbf {u},\mathbf {w} \in S_{\pi {(0)},\pi ^{(1)}}$ it holds that

$$ \Pr _{c^{(0)},c^{(1)} \in \mathbb {F}} \left[ \delta _{\pi ^{(0)},\pi ^{(1)},c^{(0)},c^{(1)},\mathbf {u},\mathbf {w}} < \delta ^* \right] \le \frac{1}{100 m_0}. $$

Let $\mathbf {u}=\left( \tilde{g}^{(0)}(\lambda ^{(1)}_1),\dots ,\tilde{g}^{(0)}(\lambda ^{(1)}_t) \right) $ and $\mathbf {w}= \left( \tilde{g}^{(1)}(\lambda ^{(0)}_1),\dots ,\tilde{g}^{(1)}(\lambda ^{(0)}_t) \right) $. Suppose that $\mathbf {u},\mathbf {w} \not \in S_{\pi ^{(0)},\pi ^{(1)}}$ then $\mathsf {PVAL}\Big ( 2t, \big \{ \lambda ^{(b)} \big \}_{b \in \{0,1\}, i \in [t]}, \big \{ \tilde{g}^{(b')}( \lambda ^{(b)}_i ) \big \}_{b \in \{0,1\}, i \in [t]} \Big ) = \emptyset $, for either $b' = 0$ or $b'=1$. In Step 9 the verifier and prover run an interactive proof to check that this is not the case and so the verifier rejects in this case with probability at least $1-\frac{1}{100m_0}$. Thus, we may assume that $\mathbf {u},\mathbf {w} \in S_{\pi ^{(0)},\pi ^{(1)}}$.

In particular, this means that for all but $\frac{1}{100 m_0}$ fraction of $c^{(0)},c^{(1)} \in \mathbb {F}$, it holds that $f'_{\pi ^{(0)},\pi ^{(1)},c^{(0)},c^{(1)}}$ is at distance at least $\delta ^*$ from

$$ \mathsf {PVAL}\bigg ( 2t, \Big ( (\pi ^{(0)})^{-1}(\mathbf {j}'), (\pi ^{(1)})^{-1}(\mathbf {j}') \Big ), \left( \omega _{k}\right) _{k \in [2t]} \bigg ), $$

where $\omega _{b \cdot t + i} = c^{(0)} \cdot \tilde{g}^{(0)}_i(\lambda ^{(b)}_i) + c^{(1)} \cdot \tilde{g}^{(1)}_i(\lambda ^{(b)}_i)$.

Let us fix $c^{(0)}$ and $c^{(1)}$ such that the foregoing statement holds. Let

$$\begin{aligned} \delta ' {\mathop {=}\limits ^\mathrm{def}}\min \big ( 2 \delta \cdot (1-\frac{2}{m_0}), (t/2^m) \cdot \frac{1}{1400 m_0^2} \big ) \end{aligned}$$

(9)

and observe that by Proposition 6.1 (and the invariant lower bound on $\lambda $), it holds that $\delta ' \le \delta ^*$.

Claim 6.4

With all but $\frac{1}{100 m_0}$ probability over the choice of $\xi _1,\dots ,\xi _t \in \mathbb {F}$ it holds that the function $f'_{\pi ^{(0)},\pi ^{(1)},c^{(0)},c^{(1)}}$ is at distance at least $\delta '$ from the set $\mathsf {PVAL}\left( t, \left( \mathcal {C}(\xi _i) \right) _{i \in [t]}, \left( c^{(0)} \cdot \tilde{g}^{(0)}_i( \xi _i) + c^{(1)} \cdot \tilde{g}^{(1)}( \xi _i ) \right) _{i \in [t]} \right) $.

Proof

Fix some $h : \{0,1\}^{m-1} \rightarrow \mathbb {F}$ at relative distance $\le \delta ' \le \delta ^*$ from $f'_{\pi ^{(0)},\pi ^{(1)},c^{(0)},c^{(1)}}$. By our assumption on $c^{(0)}$ and $c^{(1)}$ we have $h \not \in \mathsf {PVAL}\left( 2t, ( (\pi ^{(0)})^{-1}(\mathbf {j}'), (\pi ^{(1)})^{-1}(\mathbf {j}') ) , \left( \omega _k \right) _{k \in [2t]} \right) $, where $\omega _{b \cdot t + i} = c^{(0)} \cdot \tilde{g}^{(0)}_i(\lambda ^{(b)}_i) + c^{(1)} \cdot \tilde{g}^{(1)}_i(\lambda ^{(b)}_i)$. In particular, this means that there exists some $b \in \{0,1\}$ and $i \in [t]$ such that:

$$ \hat{h} \left( \mathcal {C}( \lambda ^{(b)}_i) \right) \ne c^{(0)} \cdot \tilde{g}^{(0)}_i(\lambda ^{(b)}_i) + c^{(1)} \cdot \tilde{g}^{(1)}_i(\lambda ^{(b)}_i). $$

The functions $\hat{h} \circ \mathcal {C}$ and $c^{(0)} \cdot \tilde{g}^{(0)}(\cdot ) + c^{(1)} \cdot \tilde{g}^{(1)}(\cdot )$ are therefore different polynomials of degree $O(m \cdot t)$. Thus, the probability over a random $\xi \in \mathbb {F}^{m-1}$ that $\hat{h}(\mathcal {C}(\xi )) = \tilde{g}^{(0)}(\xi ) + c \cdot \tilde{g}^{(1)}(\xi )$ is at most $O(m \cdot t / |\mathbb {F}|) \le 1/2$. Therefore, the probability that $h \in \mathsf {PVAL}\left( t, \left( \mathcal {C}(\xi _i) \right) _{i \in [t]}, \left( \tilde{g}^{(0)}_i( \xi _i) + c \cdot \tilde{g}^{(1)}( \xi _i ) \right) _{i \in [t]} \right) $ is at most $2^{-t}$.

The number of functions $h : \{0,1\}^{m-1} \rightarrow |\mathbb {F}|$ that are $\delta '$-close to $f'_{\pi ,c}$ is upper bounded by $(2^{m-1} \cdot |\mathbb {F}|)^{\delta ' \cdot 2^{m-1}} \le 2^{\delta ' \cdot 2^m \cdot m \cdot \log (|\mathbb {F}|)}$. Therefore, by a union bound, we have that $f'_{\pi ,c}$ is $\delta '$-far from $\mathsf {PVAL}\Big ( t, \left( \mathcal {C}(\xi _i) \right) _{i \in [t]}, c^{(0)} \cdot \big ( \tilde{g}^{(0)}_i( \xi _i) + c^{(1)} \cdot \tilde{g}^{(1)}( \xi _i ) \big )_{i \in [t]} \Big )$, with all but $2^{\delta ' \cdot 2^m \cdot m \cdot \log (|\mathbb {F}|) - t}$ probability. Since $\delta ' \le (t/2^m) \cdot \frac{1}{1400 m_0^2}$, we have that this probability is upper bounded by $\frac{1}{100 m_0}$.

Assuming that the event stated in Claim 6.4 holds, the protocol is run recursively on input $f'_{\pi ^{(0)},\pi ^{(1)},c^{(0)},c^{(1)}}$ that is at least $\delta '$-far from the relevant $\mathsf {PVAL}$ instance. At this point we would like to argue that by the inductive hypothesis, the verifier rejects with high probability. However, to do so, we still need to argue that the recursive invocation satisfies all the prescribed invariants.

Claim 6.5

$\delta ' \ge \frac{200 m_0^3}{2^{m-1}} \cdot (1-\frac{2}{m_0})^{m_0-(m-1)}$.

Proof

We consider two cases. If $\delta ' = 2 \delta \cdot (1-\frac{2}{m_0})$ then:

$$ \delta ' \ge 2 \left( \frac{200 m_0^3}{2^m} \cdot (1-\frac{2}{m_0})^{m_0-m} \right) \cdot (1-\frac{2}{m_0}) = \frac{200 m_0^3}{2^{m-1}} \cdot (1-\frac{2}{m_0})^{m_0-(m-1)}, $$

as required. Otherwise,

$$ \delta ' = (t/2^m) \cdot \frac{1}{1400 m_0^2} \ge \frac{200 m_0^3}{2^{m-1}} \ge \frac{200 m_0^3}{2^{m-1}} \cdot (1-\frac{2}{m_0})^{m_0-(m-1)}, $$

where the first inequality follows from the fact that $m_0 \le t^{1/5}/14$.

Claim 6.6

With all but $\frac{1}{100 m_0}$ probability over the choice of $\rho _1,\dots ,\rho _t$ and $\xi _1,\dots ,\xi _t$, it holds that $\lambda ' \ge (t/2^{m-1}) \cdot \frac{1}{14 m_0^2}$, where $\lambda ' = \varDelta (\mathsf {PVAL}(t,\mathbf {j}',\mathbf {0}))$.

Proof

Observe that $\xi _1,\dots ,\xi _t \not \in \{ \lambda ^{(b)}_i \}_{b \in \{0,1\}, i \in [t]}$ with probability $1-\frac{2t^2}{|\mathbb {F}|} \ge 1-\frac{1}{200 m_0}$.

Since the curve ${\mathcal C}$ passes through t random points (i.e., $\rho _1,\dots ,\rho _t$), the distribution over points through which the curve $\mathcal {C}$ passes is t-wise independent, other than at the fixed points $\{ \lambda _i^{(b)}\}_{b \in \{0,1\}, i \in [t]}$. Putting the above two facts together, we obtain that with all but $\frac{1}{200 m_0}$ probability, the set of points $\mathbf {j}' = \big ( \mathcal {C}(\xi _1),\dots ,\mathcal {C}(\xi _t) \big )$ is uniformly distributed in $(\mathbb {F}^{m-1})^t$.

Recall that $\lambda ' = \varDelta (\mathsf {PVAL}(t,\mathbf {j}',\mathbf {0}))$. By Proposition 5.1, since $t \ge \frac{t}{14 m_0^2} \cdot \log (2^m \cdot |\mathbb {F}|) + \log (200m_0)$,

$$ \Pr \left[ \lambda ' \le (t/2^{m-1}) \cdot \frac{1}{14 m_0^2} \right] < \frac{1}{200 m_0}, $$

and the claim follows.

Thus, the invariants for the recursive step are satisfied and so the verifier accepts in the recursion with probability at most $\frac{m-1}{10m_0} + 1/100$. Overall, by accounting for all of the bad events in the analysis above, we get that the verifier accepts with probability at most:

$$ \frac{m-1}{10m_0} + 1/100 + 5 \cdot \frac{1}{100 m_0} \le \frac{m}{10m_0} + 1/100 $$

as required.

6.3 Complexity

Communication Complexity. We first analyze the complexity of a single iteration (i.e., excluding the recursion). The verifier only sends to the prover a specification of the permutations $\pi ^{(0)}$ and $\pi ^{(1)}$ (which take 2m bits each), the values $\rho _1,\dots ,\rho _t$, $\xi _1,\dots ,\xi _t \in \mathbb {F}$ and $c^{(0)},c^{(1)} \in \mathbb {F}$. Overall the verifier-to-prover communication is $2m + (2t+1) \cdot \log _2(|\mathbb {F}|)$. The prover in turn sends $( \zeta ^{(b)}_i )_{i \in [t], b \in \{0,1\}}$ and the polynomials $\tilde{g}^{(0)}$ and $\tilde{g}^{(1)}$ (of degree $O(t \cdot m)$). Thus, the total prover to verifier communication is $O( t \cdot m \cdot \log (|\mathbb {F}|))$.

Thus, the overall communication complexity is given by cc(m) where $cc(m) = O(t \cdot m \cdot \log (|\mathbb {F}|)) + cc(m-1)$ if $m > \log (t)$ and $cc(m) = 2^m \cdot \log (|\mathbb {F}|)$ otherwise. Overall we have $cc(m) \le O(m^2 \cdot t \cdot \log (|\mathbb {F}|))$.

Query Complexity. Denote the query complexity by $q(m,\delta )$. Note that if $m \le \log (t)$ then $q(m,\delta ) = O(1/\delta )$ and otherwise $q(m,\delta ) = 2 \cdot q(m-1,\delta ') = 2 q \Big ( m-1, \min \big ( 2 \delta \cdot (1-\frac{2}{m_0}), (t/2^m) \cdot \frac{1}{1400 m_0^2} \big ) \Big )$. The stated query complexity follows from the following claim.

Claim 6.7

There exists a fixed constant c such that for every m and $\delta $ it holds that $q(m,\delta ) \le c \cdot (1-\frac{2}{m_0})^{-m} \cdot \max \Big ( \frac{1}{\delta }, \frac{2800 \cdot 2^{m} \cdot m_0^2}{t} \Big )$.

Proof

We prove by induction on m. The base case $m=\log (t)$ is immediate. Suppose that the claim holds for $m-1$. Then:

$$\begin{aligned} q(m,\delta )&= 2 q \Big ( m-1, \min \big ( 2 \delta \cdot (1-\frac{2}{m_0}), (t/2^{m}) \cdot \frac{1}{1400 m_0^2} \big ) \Big ) \end{aligned}$$

Suppose first that $2 \delta \cdot (1-\frac{2}{m_0}) < (t/2^{m}) \cdot \frac{1}{1400 m_0^2}$. Then,

$$\begin{aligned} q(m,\delta )&= 2 q \big ( m-1, 2 \delta \cdot (1-\frac{2}{m_0}) \big ) \\&\le 2c \cdot (1-\frac{2}{m_0})^{-(m-1)} \cdot \max \Big (\frac{1}{2 \delta \cdot (1-\frac{2}{m_0})}, \frac{2^{m-1} \cdot m_0^2}{t} \Big ) \\&= c \cdot (1-\frac{2}{m_0})^{-m} \cdot \max \Big (\frac{1}{\delta }, \frac{2^{m} \cdot m_0^2}{t} \Big ) \end{aligned}$$

as required. Otherwise, $2 \delta \cdot (1-\frac{2}{m_0}) \ge (t/2^{m}) \cdot \frac{1}{1400 m_0^2}$ and we have that:

$$\begin{aligned} q(m,\delta )&= 2 q \Big ( m-1,(t/2^{m}) \cdot \frac{1}{1400 m_0^2} \Big ) \\&\le 2c \cdot (1-\frac{2}{m_0})^{-(m-1)} \cdot \max \Big ( \frac{1400 \cdot 2^m \cdot m_0^2}{t}, \frac{2800 \cdot 2^{m-1} \cdot m_0^2}{t} \Big ) \\&\le c \cdot (1-\frac{2}{m_0})^{-m} \cdot \frac{2800 \cdot 2^m \cdot m_0^2}{t} \\&\le c \cdot (1-\frac{2}{m_0})^{-m} \cdot \max \Big ( \frac{1}{\delta }, \frac{2800 \cdot 2^{m} \cdot m_0^2}{t} \Big ). \end{aligned}$$

Prover Runtime. In every iteration, the prover only does elementary manipulations of the truth table of f (and never needs to fully materialize the truth table of $\hat{f}$). It also runs the prover of Lemma 5.2. Overall its running time is $\mathsf{poly}(2^m,m_0,\log (|\mathbb {F}|),t) = \mathsf{poly}(2^{m_0})$.

Verifier Runtime and Succinct Description. The queries made by the verifier can be succinctly specified by the permutations $\pi ^{(0)}$ and $\pi ^{(1)}$ used through the recursion as well as the random locations that it queries in the base case. The total number of bits needed to describe the permutations is at most $2(m_0)^2$. The number of bits needed in the base case is equal to the total number of queries divided by $2^{m_0}/t$ (since in each of the $m_0-\log (t)$ iterations the number of queries doubled) and multiplied by $\log (2^m)=m$ (to specify the location). By the above analysis this quantity is therefore upper bounded by $O \Big ( \frac{t \cdot m_0}{2^{m_0}} \cdot \max \big (1/\delta , \frac{2^{m_0}}{t} \cdot \mathsf{poly}(m) \big ) \Big ) = O(\mathsf{poly}(m) + \frac{t \cdot m_0}{2^{m_0}}) \cdot (1/\delta ))$. If $\delta > (t/2^{m_0}) \cdot \frac{1}{\mathsf{poly}(m)}$ this string has $\mathsf{poly}(m)$ length as required.

Given the set of base points we can generate the list of q queries by repeatedly applying the two permutations that we have for each level of the recursion. Since the permutations can be computed in $\mathsf{poly}(m)$ time (see Proposition 3.1), we obtain that a logspace Turing machine can generate a $\mathsf{poly}(m)$ depth circuit that outputs the entire set of q query locations.

As for the succinct description of the verification predicate, observe that all of the verifier’s checks that do not involve the input can be implemented in time $\mathsf{poly}(t,m_0,\log (|\mathbb {F}|)) = \mathsf{poly}(t)$. The testing of the actual input only happens in the case in which the prover sends over the alleged actual input $\tilde{f}_\bot $ (which at the end of the recursion has length $t \cdot \log (|\mathbb {F}|)$). This string $\tilde{f}_\bot $ is part of the description of the verification predicate, together also with all of the $c^{(0)},c^{(1)}$ values generated in the recursion. Using these values it is possible to construct a $q \cdot \mathsf{poly}(m_0,\log (|\mathbb {F}|))$ size depth $\mathsf{poly}(m_0)$ circuit that given the query answers checks their consistency with $\tilde{f}_\bot $.

7 Proving Theorem 4.1 and Theorem 4.2

Theorem 4.1 follows immediately by combining [RVW13, Theorem 1.3] with Theorem 6.1, while setting $t = \delta \cdot n \cdot \mathsf{polylog}(n)$.

In order to prove Theorem 4.2 we utilize an idea from the work of Reingold et al. [RRR18] who used known $\mathsf {IPP}$ protocols to achieve batch verification for $\mathsf {UP}$ languages. We restate a more general form of their reduction below. In the interest of directness, we avoid defining or using Interactive Witness Verification protocols, as they did. Instead, we use $\mathsf {IPP}$s for pair languages:

Theorem 7.1

(From $\mathsf {IPP}$s to $\mathsf {UP}$ batch verification (generalization of [RRR18, Theorem 3.3])). Suppose that for every query parameter $q= q(n) \in \{1,\ldots ,m\}$, and for every pair languages $\mathcal {L}$ that can be computed by log-space uniform polynomial-size circuits with fan-in 2 and depth $D = D(n)$, there exists an interactive proof of proximity where the verifier is public-coin and, on input (x, y), at the end of the interaction either the verifier rejects, or it outputs a succinct description $\left\langle Q \right\rangle $ of a set $Q \subseteq [|y|]$ of size $q$ and succinct description $\left\langle \phi \right\rangle $ of a predicate $\phi : \{0,1\}^{q} \rightarrow \{0,1\}$, and for every input pair (x, y):

Completeness: If $(x,y) \in \mathcal {L}$ then
$$ \Pr \big [ \mathcal {V}\text { does not reject and } \phi (y_Q)=1 \big ]=1. $$
Soundness: If $\mathcal {L}(x) = \emptyset $ (there is no $y'$ s.t. $(x,y') \in \mathcal {L}$), then for every prover $\mathcal {P}^*$:
$$ \Pr \big [ \mathcal {V}\text { does not reject and }\phi (y_{Q})=1 \big ] \le 1/2. $$

Let $\mathsf {cc}= \mathsf {cc}(q,D,n,m)$ be the communication complexity, $r= r(q,D,n,m)$ the number of rounds, $\mathcal {V}\mathsf {time}(q,D,n,m)$ the verifier’s runtime, and assume that the honest prover runs in polynomial time.

Then, for every $\mathsf {UP}$ language $\mathcal {L}$ with witness length $m = m(n)$, whose witness relation can be computed in $\mathsf {NC}$, there exists a public-coin interactive proof (with perfect completeness) for verifying that k instances $x_1,\ldots ,x_k$, each of length n, are all in $\mathcal {L}$. Taking $D' = \mathsf{polylog}(n,k)$ and $m' = k \cdot m$, the complexity of the protocol is as follows:

Communication complexity: $O\left( m + \sum _{i=1}^{\log k} \mathsf {cc}(\frac{k}{2^i}, D', \frac{n'}{2^i}, \frac{m'}{2^i}) \right) .$
Number of rounds: $O \left( \sum _{i=1}^{\log k} r(\frac{k}{2^i},D',\frac{n'}{2^i}, \frac{m'}{2^i}) \right) .$
Verifier runtime: $O \left( m\log n + \sum _{i=1}^{\log k} \mathcal {V}\mathsf {time}(\frac{k}{2^i},D',\frac{n'}{2^i}, \frac{m'}{2^i}) \right) .$
The honest prover, given the k unique witnesses, runs in polynomial time.

Theorem 4.2 now follows from Theorem 7.1 by utilizing the efficient $\mathsf {IPP}$s for $\mathsf {NC}$ given in Theorem 4.1.

Notes

1.
Efficiency of the honest prover (given an $\mathsf {NP}$ witness) has been central in the study of zero-knowledge interactive proofs [GMR89, GMW91]. It has also been central to the study of efficient batch verification in recent works [RRR16, RRR18].
2.
In particular, for simplicity and since it is sufficient for our results we consider a variant of $\mathsf {PVAL}$ with respect to the multi-linear extension rather than a more general low degree extension considered in [RVW13].
3.
Recall that the multilinear extension $\hat{f} : \mathbb {F}^m \rightarrow \mathbb {F}$ of $f : \{0,1\}^m \rightarrow \mathbb {F}$ is the unique multilinear polynomial that agrees with f on $\{0,1\}^m$. See Sect. 3.1 for details.
4.
Intuitively, the reason to use a random linear combination rather than some fixed combination such as $f_0+f_1$ is avoiding (w.h.p) the possibility that the differences of $f_0$ and $f_1$ from their corresponding $\mathsf {PVAL}$ instances (i.e. the 0/1 vectors that can be added to $f_0$ and $f_1$ to reach vectors in $\mathsf {PVAL}$) cancel each other out.

References

Arora, S., Lund, C., Motwani, R., Sudan, M., Szegedy, M.: Proof verification and the hardness of approximation problems. J. ACM 45(3), 501–555 (1998)
Article MathSciNet Google Scholar
Arora, S., Safra, S.: Probabilistic checking of proofs; A new characterization of NP. In: 33rd Annual Symposium on Foundations of Computer Science, Pittsburgh, Pennsylvania, USA, 24–27 October 1992, pp. 2–13 (1992)
Google Scholar
Babai, L., Fortnow, L., Lund, C.: Non-deterministic exponential time has two-prover interactive protocols. Comput. Complex. 1, 3–40 (1991)
Article Google Scholar
Babai, L., Fortnow, L., Levin, L.A., Szegedy, M.: Checking computations in polylogarithmic time. In: Proceedings of the 23rd Annual ACM Symposium on Theory of Computing, New Orleans, Louisiana, USA, 5–8 May 1991, pp. 21–31 (1991)
Google Scholar
Ben-Or, M., et al.: Everything provable is provable in zero-knowledge. In: Goldwasser, S. (ed.) CRYPTO 1988. LNCS, vol. 403, pp. 37–56. Springer, New York (1990). https://doi.org/10.1007/0-387-34799-2_4
Chapter Google Scholar
Ben-Or, M., Goldwasser, S., Kilian, J., Wigderson, A.: Multi-prover interactive proofs: how to remove intractability assumptions. In: Proceedings of the 20th Annual ACM Symposium on Theory of Computing, Chicago, Illinois, USA, 2–4 May 1988, pp. 113–131 (1988)
Google Scholar
Brakerski, Z., Holmgren, J., Kalai, Y.T.: Non-interactive delegation and batch NP verification from standard computational assumptions. In: Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2017, Montreal, QC, Canada, 19–23 June 2017, pp. 474–482 (2017)
Google Scholar
Ben-Sasson, E., Kopparty, S., Saraf, S.: Worst-case to average case reductions for the distance to a code. In: 33rd Computational Complexity Conference, CCC 2018, San Diego, CA, USA, 22–24 June 2018, pp. 24:1–24:23 (2018)
Google Scholar
Berman, I., Rothblum, R.D., Vaikuntanathan, V.: Zero-knowledge proofs of proximity. In: 9th Innovations in Theoretical Computer Science Conference, ITCS 2018, Cambridge, MA, USA, 11–14 January 2018, pp. 19:1–19:20 (2018)
Google Scholar
Chiesa, A., Gur, T.: Proofs of proximity for distribution testing. In: 9th Innovations in Theoretical Computer Science Conference, ITCS 2018, Cambridge, MA, USA, 11–14 January 2018, pp. 53:1–53:14 (2018)
Google Scholar
Ergün, F., Kumar, R., Rubinfeld, R.: Fast approximate probabilistically checkable proofs. Inf. Comput. 189(2), 135–159 (2004)
Article MathSciNet Google Scholar
Feige, U., Goldwasser, S., Lovász, L., Safra, S., Szegedy, M.: Interactive proofs and the hardness of approximating cliques. J. ACM 43(2), 268–292 (1996)
Article MathSciNet Google Scholar
Fischer, E., Goldhirsh, Y., Lachish, O.: Partial tests, universal tests and decomposability. In: ITCS, pp. 483–500 (2014)
Google Scholar
Fortnow, L., Rompel, J., Sipser, M.: On the power of multi-prover interactive protocols. Theor. Comput. Sci. 134(2), 545–557 (1994)
Article MathSciNet Google Scholar
Goldreich, O., Goldwasser, S., Ron, D.: Property testing and its connection to learning and approximation. J. ACM (JACM) 45(4), 653–750 (1998)
Article MathSciNet Google Scholar
Goldreich, O., Gur, T., Rothblum, R.D.: Proofs of proximity for context-free languages and read-once branching programs. In: Halldórsson, M.M., Iwama, K., Kobayashi, N., Speckmann, B. (eds.) ICALP 2015. LNCS, vol. 9134, pp. 666–677. Springer, Heidelberg (2015). https://doi.org/10.1007/978-3-662-47672-7_54
Chapter MATH Google Scholar
Goldreich, O., Håstad, J.: On the complexity of interactive proofs with bounded communication. Inf. Process. Lett. 67(4), 205–214 (1998)
Article MathSciNet Google Scholar
Gur, T., Liu, Y.P., Rothblum, R.D.: An exponential separation between MA and AM proofs of proximity. In: Chatzigiannakis, I., Kaklamanis, C., Marx, D., Sannella, D. (eds.) 45th International Colloquium on Automata, Languages, and Programming, ICALP 2018, Prague, Czech Republic, 9–13 July 2018. LIPIcs, vol. 107, pp. 73:1–73:15. Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2018)
Google Scholar
Goldwasser, S., Micali, S., Rackoff, C.: The knowledge complexity of interactive proof systems. SIAM J. Comput. 18(1), 186–208 (1989)
Article MathSciNet Google Scholar
Goldreich, O., Micali, S., Wigderson, A.: Proofs that yield nothing but their validity for all languages in NP have zero-knowledge proof systems. J. ACM 38(3), 691–729 (1991)
Article MathSciNet Google Scholar
Goldreich, O.: Computational Complexity - A Conceptual Perspective. Cambridge University Press, Cambridge (2008)
Book Google Scholar
Gur, T., Rothblum, R.D.: Non-interactive proofs of proximity. In: Electronic Colloquium on Computational Complexity (ECCC), vol. 20, p. 78 (2013)
Google Scholar
Gur, T., Rothblum, R.D.: A hierarchy theorem for interactive proofs of proximity. In: 8th Innovations in Theoretical Computer Science Conference, ITCS 2017, Berkeley, CA, USA, 9–11 January 2017, pp. 39:1–39:43 (2017)
Google Scholar
Goldwasser, S., Rothblum, G.N., Shafer, J., Yehudayoff, A.: Interactive proofs for verifying machine learning. In: Electronic Colloquium on Computational Complexity (ECCC), vol. 27, p. 58 (2020)
Google Scholar
Goldreich, O., Vadhan, S.P., Wigderson, A.: On interactive proofs with a laconic prover. Comput. Complex. 11(1–2), 1–53 (2002)
Article MathSciNet Google Scholar
Ishai, Y.: Zero-knowledge proofs from information-theoretic proof systems. https://zkproof.org/2020/08/12/information-theoretic-proof-systems/
Kilian, J.: A note on efficient zero-knowledge proofs and arguments (extended abstract). In: STOC, pp. 723–732 (1992)
Google Scholar
Kalai, Y.T., Rothblum, R.D.: Arguments of proximity. In: Gennaro, R., Robshaw, M. (eds.) CRYPTO 2015, Part II. LNCS, vol. 9216, pp. 422–442. Springer, Heidelberg (2015). https://doi.org/10.1007/978-3-662-48000-7_21
Chapter Google Scholar
Lund, C., Fortnow, L., Karloff, H.J., Nisan, N.: Algebraic methods for interactive proof systems. J. ACM 39(4), 859–868 (1992)
Article MathSciNet Google Scholar
Ron-Zewi, N., Rothblum, R.: Local proofs approaching the witness length. In: Electronic Colloquium on Computational Complexity (ECCC), vol. 26, p. 127 (2019)
Google Scholar
Reingold, O., Rothblum, G.N., Rothblum, R.D.: Constant-round interactive proofs for delegating computation. In: Proceedings of the 48th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2016, Cambridge, MA, USA, 18–21 June 2016, pp. 49–62 (2016)
Google Scholar
Reingold, O., Rothblum, G.N., Rothblum, R.D.: Efficient batch verification for UP. In: 33rd Computational Complexity Conference, CCC 2018, San Diego, CA, USA, 22–24 June 2018, pp. 22:1–22:23 (2018)
Google Scholar
Rubinfeld, R., Sudan, M.: Robust characterizations of polynomials with applications to program testing. SIAM J. Comput. 25(2), 252–271 (1996)
Article MathSciNet Google Scholar
Rothblum, G.N., Vadhan, S., Wigderson, A.: Interactive proofs of proximity: delegating computation in sublinear time. In: STOC, pp. 793–802 (2013)
Google Scholar
Shamir, A.: IP = PSPACE. J. ACM 39(4), 869–877 (1992)
Article MathSciNet Google Scholar
Sudan, M. (ed.): Efficient Checking of Polynomials and Proofs and the Hardness of Appoximation Problems. LNCS, vol. 1001. Springer, Heidelberg (1995). https://doi.org/10.1007/3-540-60615-7
Book MATH Google Scholar

Download references

Acknowledgments

We thank Oded Goldreich and Omer Reingold for helpful and illuminating discussions on these topics. We thank the TCC revewiers for their careful reading of the manuscript and useful comments.

Guy Rothblum has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No. 819702). Research also supported by the Israel Science Foundation (grant number 5219/17) and an Amazon Research Award.

Ron Rothblum was supported in part by a Milgrom family grant, by the Israeli Science Foundation (Grants No. 1262/18 and 2137/19), and the Technion Hiroshi Fujiwara cyber security research center and Israel cyber directorate.

Author information

Authors and Affiliations

Weizmann Institute, Rehovot, Israel
Guy N. Rothblum
Technion - Israel Institute of Technology, Haifa, Israel
Ron D. Rothblum

Authors

Guy N. Rothblum
View author publications
You can also search for this author in PubMed Google Scholar
Ron D. Rothblum
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding authors

Correspondence to Guy N. Rothblum or Ron D. Rothblum .

Editor information

Editors and Affiliations

Cornell Tech, New York, NY, USA
Rafael Pass
Institute of Science and Technology Austria, Klosterneuburg, Austria
Krzysztof Pietrzak

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Rothblum, G.N., Rothblum, R.D. (2020). Batch Verification and Proofs of Proximity with Polylog Overhead. In: Pass, R., Pietrzak, K. (eds) Theory of Cryptography. TCC 2020. Lecture Notes in Computer Science(), vol 12551. Springer, Cham. https://doi.org/10.1007/978-3-030-64378-2_5

Download citation

DOI: https://doi.org/10.1007/978-3-030-64378-2_5
Published: 09 December 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-64377-5
Online ISBN: 978-3-030-64378-2
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics

Societies and partnerships

International Association for Cryptologic Research (opens in a new tab)

Abstract

1 Introduction

1.1 Batch Verification

Theorem 1

1.2 Interactive Proofs of Proximity

Theorem 2

1.3 Related Works

1.4 Organization

2 Technical Overview

3 Preliminaries

3.1 Multivariate Polynomials and Low Degree Extensions

Lemma 3.1

Definition 3.1

3.2 A Useful Permutation Family

Proposition 3.1

Proof

Proposition 3.2

Proof

3.3 Succinct Descriptions

Definition 3.2

Definition 3.3

3.4 Interactive Proofs of Proximity

Definition 3.4

Proposition 3.3

Proof

4 Our Results

Theorem 4.1

Theorem 4.2

Corollary 4.1

5 The \(\mathsf {PVAL}\) Problem

Definition 5.1

Proposition 5.1

Lemma 5.1

5.1 Interactive Proof for \(\mathsf {PVAL}\) Emptiness

Lemma 5.2

6 Efficient \(\mathsf {IPP}\) for \(\mathsf {PVAL}\)

Theorem 6.1

6.1 Completeness

6.2 Soundness

Claim 6.2

Proof

Proposition 6.1

Proof

Claim 6.3

Proof

Claim 6.4

Proof

Claim 6.5

Proof

Claim 6.6

Proof

6.3 Complexity

Claim 6.7

Proof

7 Proving Theorem 4.1 and Theorem 4.2

Theorem 7.1

Notes

References

Acknowledgments

Author information

Authors and Affiliations

Corresponding authors

Editor information

Editors and Affiliations

Rights and permissions

Copyright information

About this paper

Cite this paper

Download citation

Share this paper

Publish with us

Societies and partnerships

Search

Navigation