1 Introduction

Cryptography usually adopts a worst-case angle on complexity. For example, in the context of concrete security, a typical theorem shows that an adversary running for at most t steps succeeds with advantage at most \(\varepsilon \). In this paper, we instead study the concrete security of cryptographic schemes and assumptions as a function of the expected running time of the adversary.

Expected-time complexity is a natural measure in its own right – e.g., it is very common in cryptanalysis, as it is often much easier to analyze. But it is also a useful technical tool – indeed, simulators and extractors are often expected time, sometimes inherently so [1]. To use these technical tools, we need assumptions to hold with respect to expected time.

The problem has been studied closely by Katz and Lindell [14], who suggest expected-time adversaries as a natural model, while also pointing out several technical challenges it brings. Either way, the resulting common wisdom is that assumptions which are true with respect to (non-uniform) worst-case polynomial time are true for expected polynomial time, and often more fine-grained statements are possible via Markov’s inequality (see below). However, for concrete security, such generic arguments fail to give tight bounds.

Summary of contributions. This paper makes progress on two fronts.

First, as our main technical contribution, we introduce general tools to give tight concrete security bounds in information-theoretic settings (e.g., in the random-oracle or generic-group models) for expected-time adversaries. Our tools can easily translate many existing proofs from the worst-case to the expected-time regime. We derive for example tight bounds for finding collisions in a random oracle, for the PRF security of random oracles, and for computing discrete logarithms in the generic-group model. We also obtain bounds for the security of key-alternating ciphers against expected-time adversaries.

Second, we study a “Forking Lemma” to prove soundness of multi-round public-coin proofs and arguments (of knowledge) satisfying a generalized notion of special soundness, enabling witness extraction from a suitable tree of accepting interactions. In particular, we follow a blueprint by Bootle et al. [6], which has also been adopted by follow-up works [7, 8, 24]. In contrast to prior works, we provide a concrete analysis of the resulting expected-time witness extraction strategy, and also give a modular treatment of the techniques which may be of independent interest.

We showcase these tools by deriving concrete bounds for the soundness of Bulletproofs [7] in terms of the expected-time hardness of solving the discrete logarithm problem. Instantiating the bound with our generic-group model analysis will in particular illustrate the dependence of soundness on group parameters and on the complexity of the statement to be proved. We are unaware of any such result having been proved, despite the practical appeal of these protocols.

The remainder of this introduction provides a detailed overview of our results.

1.1 Information-Theoretic Bounds for Expected-Time Adversaries

Our first contribution is a framework to prove tight bounds with respect to expected-time adversaries. We focus on information-theoretic analyses, such as those in the random oracle [3] and the generic group [18, 22] models.

Our focus on tight bounds is what makes the problem hard. Indeed, one can usually obtain a non-tight bound using Markov’s inequality. For example, the probability \(\varepsilon (T, N)\) of a T-time adversary finding a collision in a random oracle with N outputs satisfies \(\varepsilon (T, N) \leqslant T^2/2N\), and this bound is tight. If we instead aim to upper bound the probability \(\varepsilon (\mu _T, N)\) of finding a collision for an adversary that runs in expected time \(\mu _T = \mathsf {E}[T]\), Markov’s inequality yields, for every \(T^*> \mu _T\),

$$\begin{aligned} \varepsilon (\mu _T, N) \leqslant \mathsf {Pr}\left[ T > T^* \right] + \frac{(T^*)^2}{2N} \leqslant \frac{\mu _T}{T^*} + \frac{(T^*)^2}{2N} \leqslant 2 \cdot \root 3 \of {\frac{\mu _T^2}{2N}}\;, \end{aligned}$$
(1)

where the right-most inequality is the result of setting \(T^*\) such that \(\frac{\mu _T}{T^*} = \frac{(T^*)^2}{2N}\). Here, we prove the better upper bound

$$\begin{aligned} \varepsilon (\mu _T, N) \leqslant \sqrt{\frac{\mu _T^2}{2N}} \;, \end{aligned}$$
(2)

as a corollary of the techniques we introduce below. This bound is tight: To see this, take an adversary which initially flips a biased coin, which is heads with probability \(\mu _T/\sqrt{N}\). If the coin is tails, it aborts, failing to find a collision. If the coin is heads, it makes \(\sqrt{N}\) queries to find a collision with high probability. Then, this adversary succeeds with probability \(\varOmega (\mu _T/\sqrt{N}) = \varOmega (\sqrt{\mu _T^2/N})\), and its expected run time is \(\mu _T\).
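This tightness example is easy to check empirically. The following Monte Carlo sketch (ours, with illustrative parameters) implements the biased-coin adversary against a random function with N outputs and compares its success probability to bound (2):

```python
import random

# Monte Carlo sketch (ours) of the tightness example: flip a coin with heads
# probability mu_T/sqrt(N); on heads, make sqrt(N) queries to a random
# function with N outputs, looking for a collision among the answers.

def trial(mu_T, N, rng):
    budget = int(N ** 0.5)
    if rng.random() >= mu_T / N ** 0.5:
        return 0, 0                      # tails: abort without querying
    seen = set()
    for i in range(budget):
        y = rng.randrange(N)             # fresh random-oracle answer
        if y in seen:
            return 1, i + 1              # collision found
        seen.add(y)
    return 0, budget

def estimate(mu_T, N, trials=50000, seed=1):
    rng = random.Random(seed)
    wins = qs = 0
    for _ in range(trials):
        w, q = trial(mu_T, N, rng)
        wins, qs = wins + w, qs + q
    # success probability is Omega(mu_T/sqrt(N)) while E[T] <= mu_T
    # (stopping early on success keeps E[T] slightly below mu_T)
    print(f"success ~{wins/trials:.4f}  E[T] ~{qs/trials:.1f}  "
          f"bound (2) = {(mu_T**2 / (2*N)) ** 0.5:.4f}")

estimate(mu_T=20, N=2**16)
```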

Both (1) and (2) show that \(\mu _T \geqslant \varOmega (\sqrt{N})\) must hold to find a collision with probability one. However, exact probability bounds are important in the regime \(\mu _T = o(\sqrt{N})\). For example, say we are asked to find a collision in at least one out of u independent random oracles, and the expected number of queries to each is \(\mu _T\). Then, a hybrid argument bounds the probability by \(u \cdot \varepsilon (\mu _T, N)\), making the difference between a square-root and a cube-root bound on \(\varepsilon (\mu _T, N)\) important.
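To illustrate the gap numerically, the following computation (ours, with hypothetical parameter values) evaluates bounds (1) and (2) and their amplification by a hybrid factor u:

```python
from math import log2

# Illustrative arithmetic (hypothetical numbers): the Markov-based cube-root
# bound (1) vs. the square-root bound (2), and the effect of a hybrid over
# u independent random oracles.
mu_T, N, u = 2.0**40, 2.0**256, 2.0**20

cube   = 2 * (mu_T**2 / (2*N)) ** (1/3)   # bound (1)
square =     (mu_T**2 / (2*N)) ** (1/2)   # bound (2)

print(f"(1): 2^{log2(cube):.1f}   (2): 2^{log2(square):.1f}")
print(f"u*(1): 2^{log2(u*cube):.1f}   u*(2): 2^{log2(u*square):.1f}")
# (1): 2^-58.0   (2): 2^-88.5
# u*(1): 2^-38.0   u*(2): 2^-68.5
```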

A Generic Approach for bad-flag analyses. We aim for a general approach to transform information-theoretic bounds for worst-case query complexity into bounds with respect to expected query complexity. If an existing analysis (with respect to worst-case complexity) follows a certain pattern, then we easily obtain an expected query complexity bound.

More concretely, many security proofs follow the “equivalent-until-bad” format (as formalized by Bellare and Rogaway [4], but equivalent formulations can be derived from the works of Maurer [17] and Shoup [23]). The goal here is to upper bound the advantage of an adversary \(\mathcal {A}\) distinguishing two games \(\mathsf {G}_0\) and \(\mathsf {G}_1\), which behave identically until some bad flag \(\mathsf {bad}\) is set. Then, the distinguishing advantage is upper bounded by the probability of setting \(\mathsf {bad}\) to true – an event we denote as \(\mathsf {BAD}^{\mathcal {A}}\). Typically, \(\mathsf {G}_0\) is the “real world” and \(\mathsf {G}_1\) is the “ideal world”. Now, let \(Q_1\) be the number of queries by an adversary \(\mathcal {A}\) in \(\mathsf {G}_1\), which is a random variable. Then, we say that this game pair satisfies \(\delta \)-boundedness if

$$\begin{aligned} \mathsf {Pr}\left[ \mathsf {BAD}^{\mathcal {A}} \;|\; Q_1 = q \right] \leqslant \delta (q) \end{aligned}$$

for all \(q \geqslant 1\) and adversaries \(\mathcal {A}\). This condition is not without loss of generality, but it can be ensured in all examples we verified.

Our first main theorem (Theorem 1) shows that if \(\delta (q) = \varDelta \cdot q^d/N\), then the probability of setting \(\mathsf {BAD}^{\mathcal {A}}\) (in either of the two games), and hence the advantage of distinguishing \(\mathsf {G}_0\) and \(\mathsf {G}_1\), is upper bounded as

$$\begin{aligned} \mathsf {Pr}\left[ \mathsf {BAD}^{\mathcal {A}} \right] \leqslant 5 \cdot \left( \frac{\varDelta \mathsf {E}[Q_0]^d}{N} \right) ^{1/d} \;, \end{aligned}$$

where (quite) crucially \(Q_0\) is the number of queries of \(\mathcal {A}\) in \(\mathsf {G}_0\). This asymmetry matters in applications – we typically measure complexity in the real world, but \(\delta \)-boundedness only holds in the ideal world.
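For instance, anticipating the collision-resistance example of Sect. 3.3, where \(\delta (q) = 2q^2/N\) (so \(\varDelta = 2\) and \(d = 2\)), the theorem instantiates to

$$\begin{aligned} \mathsf {Pr}\left[ \mathsf {BAD}^{\mathcal {A}} \right] \leqslant 5 \cdot \left( \frac{2 \cdot \mathsf {E}[Q_0]^2}{N} \right) ^{1/2} = 5\sqrt{\frac{2 \cdot \mathsf {E}[Q_0]^2}{N}}\;, \end{aligned}$$

recovering the square-root bound (2) up to a constant factor.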

Proof idea. The key step behind the proof of Theorem 1 is the introduction of an early-terminating adversary \(\mathcal {B}\), which behaves as \(\mathcal {A}\) in attempting to set \(\mathsf {bad}\), but aborts early after \(U=\left\lfloor \root d \of {Nu/\varDelta } \right\rfloor = \varTheta (\root d \of {N/\varDelta })\) queries, where \(u=2^{-d}\). One can then show that (we can think of the following probabilities in \(\mathsf {G}_0\))

$$\begin{aligned} \mathsf {Pr}\left[ \mathsf {BAD}^{\mathcal {A}} \right] \leqslant \mathsf {Pr}\left[ \mathsf {BAD}^{\mathcal {B}} \right] + \mathsf {Pr}\left[ Q_0 > U \right] \;, \end{aligned}$$

because \(\mathsf {Pr}\left[ \mathsf {BAD}^{\mathcal {A}} \wedge Q_0 \leqslant U \right] \leqslant \mathsf {Pr}\left[ \mathsf {BAD}^{\mathcal {B}} \right] \). Markov’s inequality then yields

$$\begin{aligned} \mathsf {Pr}\left[ Q_0 > U \right] \leqslant \frac{\mathsf {E}\left[ Q_0\right] }{U} = \varTheta \left( \root d \of {\varDelta \mathsf {E}\left[ Q_0\right] ^d/N}\right) \;, \end{aligned}$$

which is of the right order.

Therefore, the core of the proof is to show \(\mathsf {Pr}\left[ \mathsf {BAD}^{\mathcal {B}} \right] = O\left( \root d \of {\varDelta \mathsf {E}\left[ Q_0\right] ^d/N}\right) \). This will require using \(\delta \)-boundedness first, but a careful reader may observe that this will only upper bound the probability with respect to \(\mathsf {E}\left[ Q_1\right] \), and not \(\mathsf {E}\left[ Q_0\right] \). The bulk of the proof is then to switch between the two.

Examples. We apply the above framework to a few examples, to show its applicability. We show bounds on the hardness of discrete logarithms in the generic-group model [18, 22], and on the collision-resistance and PRF security of random oracles. In particular, our framework also works for notions which are not indistinguishability based, such as collision-resistance of a random oracle, by introducing a suitable world \(\mathsf {G}_1\) where it is hard to win the game.

The H-Coefficient method. Equivalent-until-bad analyses are not always the simplest way to prove security (despite the fact that in principle every analysis can be cast in this format, as shown in [19]). We also give a variant of the above approach tailored to proving security in a simpler version of the H-coefficient method [9, 20] which considers what is referred to as pointwise-proximity in [12]. This amounts to using the standard H-coefficient method without bad transcripts. (To the best of our knowledge, this simpler version of the method is due to Bernstein [5].) This allows us to obtain expected-time versions of security bounds for the PRF/PRP switching lemma and for key-alternating ciphers, the latter building on top of work by Hoang and Tessaro [12]. We defer details on this to the full version of this paper [13].

1.2 Forking Lemmas and Concrete Soundness

One motivation for studying expected-time adversaries is as a tool to prove bounds for worst-case complexity, rather than as a goal in itself. We expose here one such application in the context of proving soundness bounds for public-coin proofs/arguments (of knowledge). In particular, soundness/proof-of-knowledge proofs for several protocols (like [6,7,8, 24]) rely on generalizations of the Forking Lemma (originally proposed by Pointcheval and Stern [21] for three-round protocols) which adopt expected-time witness extraction strategies. These have only been analyzed in an asymptotic sense, and our goal is to give a concrete-security treatment. We propose here a modular treatment of these techniques, and instantiate our framework to provide concrete bounds on the soundness of Bulletproofs [7], a succinct proof system which has enjoyed wide popularity.

Forking Lemmas. Pointcheval and Stern’s original “Forking Lemma” [21] deals with \(\varSigma \)-protocols that satisfy special soundness – these are three-round protocols, where a transcript takes the form \((a, c, d)\), with c being the verifier’s single random challenge. Here, given common input x, the prover \(\mathsf {P}\) proves knowledge to \(\mathsf {V}\) of a witness w for a relation \(\mathsf {R}\). The proof of knowledge property is proved by giving an extractor \(\mathcal {B}\) which produces a witness for x given (black-box) access to a prover \(\mathsf {P}^*\) – if \(\mathsf {P}^*\) succeeds with probability \(\varepsilon \), then \(\mathcal {B}\) succeeds with probability (roughly) \(\varepsilon ^2\). Concretely, \(\mathcal {B}\) simulates an execution of \(\mathsf {P}^*\) with a random challenge c, which results in a transcript \((a, c, d)\), and then rewinds \(\mathsf {P}^*\) to just before obtaining c, and feeds it a different challenge \(c'\) to obtain a transcript \((a, c', d')\). If both transcripts are accepting, and \(c \ne c'\), a witness can be extracted via special soundness. Bellare and Neven [2] give alternative Forking Lemmas where \(\mathcal {B}\)’s success probability approaches \(\varepsilon \), at the cost of a larger running time.

Expected-time extraction. It is natural to expect that the success probability of \(\mathcal {B}\) above degrades exponentially in the number of required accepting transcripts. Crucially, however, one can make the Forking Lemma tight with respect to probability if we relax \(\mathcal {B}\) to have bounded expected running time. Now, \(\mathcal {B}\) runs \(\mathsf {P}^*\) once with a random challenge c and, if it generates a valid transcript \((a, c, d)\), we rewind \(\mathsf {P}^*\) to before receiving the challenge c, and keep re-running it from there with fresh challenges until we obtain a second valid transcript \((a, c', d')\) for \(c' \ne c\). The expected running time is only twice that of \(\mathsf {P}^*\).
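In Python-style pseudocode, this strategy looks as follows (a sketch of ours, not the paper’s formalism; prover is a hypothetical rewindable interface where prover.first() returns the first message and prover.answer(a, c) returns the response to challenge c from the state reached after a, or None upon rejection):

```python
import random

def extract(prover, challenge_space, rng=random):
    # Run P* once with a fresh challenge.
    a = prover.first()
    c = rng.choice(challenge_space)
    d = prover.answer(a, c)
    if d is None:
        return None                  # first run rejects: stop, as in emulation
    # Rewind to just after `a`; retry fresh challenges until P* accepts again.
    # If eps is P*'s acceptance probability from this state, the loop costs
    # about 1/eps runs but is only reached with probability eps, so the
    # overall expected cost stays around two runs of P*.
    while True:
        c2 = rng.choice(challenge_space)
        d2 = prover.answer(a, c2)
        if d2 is not None:
            break
    if c2 == c:
        return None                  # repeated challenge (probability 1/|C|)
    return (a, c, d), (a, c2, d2)    # special soundness now yields a witness
```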

A general Forking Lemma. An extension of this idea underlies the analysis of recent succinct public-coin multi-round interactive arguments of knowledge [6,7,8, 24], following a workflow introduced first by Bootle et al. (BCCGP) [6] which extracts a witness from a tree of multi-round executions obtained by clever rewinding of \(\mathsf {P}^*\). In particular, since the number of generated accepted interactions is large (i.e., exponential in the number of rounds), the usage of an expected-time strategy is essential to extract with good enough probability.

These works in fact prove the stronger property of witness-extended emulation [11, 16]. This means that with black-box access to a prover \(\mathsf {P}^*\), an expected-time emulator \(\mathsf {E}\) (1) generates a transcript with the same distribution as in an interaction between \(\mathsf {P}^*\) and the verifier \(\mathsf {V}\), and (2) if this transcript is accepting, then a valid witness is produced along with it. In the case of arguments, it is possible that (2) fails, but this would imply breaking an underlying assumption.

The BCCGP framework was refined in follow-up works [7, 8, 24], but these remain largely asymptotic. We give here a clean and modular treatment of the BCCGP blueprint, which makes it amenable to a concrete security treatment. This will in particular require using our tools from the first part of the paper to analyze the probability that we generate a well-formed tree of transcripts from which a witness can be generated. We believe this to be of independent interest.

In the full version of this paper [13], we compare this expected-time forking lemma to one with strict running-time guarantees and confirm that the expected-time approach achieves a clear benefit in terms of tightness of the reduction.

Application to Bulletproofs. Finally, we apply the above framework to obtain a bound on the concrete soundness for public-coin interactive argument systems, and focus on Bulletproofs [7]. We obtain a bound in terms of the expected-time hardness of the discrete logarithm problem, and we combine this with our generic-group analysis to get a bound on the soundness in the generic-group model. Of independent interest, the result relies on a tight reduction of finding non-trivial discrete log relations to the plain discrete log problem – which we give in Lemma 3.

Our bound is in particular on the probability \(\mathsf {Adv}^{\mathsf {sound}}_{\mathsf {PS},G}(\mathsf {P}^*)\) of a cheating prover \(\mathsf {P}^*\) convincing a verifier \(\mathsf {V}\) (from proof system \(\mathsf {PS}\)) on input x generated by a (randomized) instance generator G, and we show that

$$\begin{aligned} \mathsf {Adv}^{\mathsf {sound}}_{\mathsf {PS},G}(\mathsf {P}^*) \leqslant \mathsf {Adv}^{\mathsf {wit}}_{\mathsf {PS},G}(\mathcal {B}) + O\left( \frac{q_{\mathsf {P}^{*}}\cdot LM^3\log _2(M)}{\sqrt{|\mathbb {G}|}}\right) \;, \end{aligned}$$

where \(q_{\mathsf {P}^{*}}\) measures the number of group operations by \(\mathsf {P}^*\), M is the number of multiplication gates for a circuit representing the relation \(\mathsf {R}\), L is a parameter of that circuit (which we assume is small for this discussion, but may be as large as 2M), and \(\mathsf {Adv}^{\mathsf {wit}}_{\mathsf {PS},G}(\mathcal {B})\) is the probability of \(\mathcal {B}\) extracting a witness w for an x sampled by G, where \(\mathcal {B}\) is an extractor whose (expected) running time amounts to roughly \(M^3\) times that of \(\mathsf {P}^*\).

This bound is interesting because it highlights the dependence of the soundness probability on the group size \(|\mathbb {G}|\) and on M. It in fact shows that for typical instantiations, where \(|\mathbb {G}| \approx 2^{256}\), the guaranteed security level is fairly low for modest-sized circuits (say with \(M = 2^{20}\)). It is a good question whether this bound can be made tighter, in particular with respect to its dependence on M.
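To get a feel for these orders of magnitude, the following back-of-the-envelope computation (ours; the parameter values are hypothetical and the hidden constant in the \(O(\cdot )\) is ignored) evaluates the second term of the bound:

```python
from math import log2

# Back-of-the-envelope evaluation of the soundness term
# q * L * M^3 * log2(M) / sqrt(|G|), ignoring the hidden O(.) constant.
group_bits, M, L = 256, 2**20, 2**2   # hypothetical: |G| ~ 2^256, small L

def log2_soundness_term(log2_q):
    return log2_q + log2(L) + 3 * log2(M) + log2(log2(M)) - group_bits / 2

# number of prover group operations at which the bound becomes vacuous (~1):
print(f"bound ~ 1 at log2(q) ~ {-log2_soundness_term(0):.1f}")   # ~ 61.7
```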

We also note that for specific instance generators G our tools may be helpful to estimate \(\mathsf {Adv}^{\mathsf {wit}}_{\mathsf {PS},G}(\mathcal {B})\).

2 Preliminaries

Let \(\mathbb {N}=\{0,1,2,\dots \}\) and \(\mathbb {N}_{>0}=\mathbb {N}\setminus \{0\}\). For \(N\in \mathbb {N}\), let \([N]=\{1,2,\dots ,N\}\). For \(j>k\) we adopt the conventions that \(\prod _{i=j}^k n_i=1\) and \((m_j,m_{j+1},\dots ,m_k)=()\). Equivalence mod p is denoted \(\equiv _p\).

We let \(\mathsf {Perm}(S)\) denote the set of all permutations on set S and \(\mathsf {Fcs}(S,S')\) denote the set of all functions from S to \(S'\). Sampling x uniformly from the set S is denoted \(x\leftarrow _{\$}S\). The notation \(S=S'\sqcup S''\) means that \(S=S'\cup S''\) and \(S'\cap S''=\emptyset \), i.e., \(S'\) and \(S''\) partition S. We let \(\{0,1\}^*\) denote the set of finite-length bitstrings and \(\{0,1\}^\infty \) denote the set of infinite-length bitstrings.

We let \(y\leftarrow \mathcal {A}^{\textsc {O}}(x_1,x_2,\dots ; c)\) denote the execution of \(\mathcal {A}\) on input \(x_1,x_2,\dots \) and coins \(c\in \{0,1\}^\infty \) with access to oracle(s) \(\textsc {O}\), producing output y. When c is chosen uniformly we write \(y\leftarrow _{\$}\mathcal {A}^{\textsc {O}}(x_1,x_2,\dots )\). For a stateful algorithm \(\mathcal {A}\) with state s we use \(y\leftarrow \mathcal {A}^{\textsc {O}}(x_1,x_2,\dots :s;c)\) as shorthand for the expression \((y,s)\leftarrow \mathcal {A}^{\textsc {O}}(x_1,x_2,\dots ,s;c)\). When some of an algorithm’s output is not going to be used we will write \(\cdot \) in place of giving it a variable name.

We use pseudocode games, inspired by the code-based game framework of Bellare and Rogaway [4]. See Fig. 1 for some example games. If \({ \textsf {H}}\) is a game, then \(\mathsf {Pr}[{ \textsf {H}}]\) denotes the probability that it outputs \(\mathsf {true}\). We use \(\wedge \), \(\vee \), \(\Leftrightarrow \), and \(\lnot \) for the logical operators “and”, “or”, “iff”, and “not”.

Running-time conventions. The most commonly used notion for the running time of an algorithm is worst-case. For this, one first fixes a computational model with an associated notion of computational steps. Then an algorithm \(\mathcal {A}\) has worst-case running time t if for all choices of \(x_1,x_2,\dots \) and c it performs at most t computation steps in the execution \(\mathcal {A}^{\textsc {O}}(x_1,x_2,\dots ; c)\), no matter how \(\textsc {O}\) responds to any oracle queries \(\mathcal {A}\) makes.

In this paper we are interested in proving bounds that instead depend on the expected number of computation steps that \(\mathcal {A}\) performs. There may be randomness in how the inputs \(x_1,x_2,\dots \) to \(\mathcal {A}\) and the responses to \(\textsc {O}\) queries are chosen (in addition to the random selection of c).

There is more variation in how expected running time may be defined. We will provide our bounds in terms of the expected running time of adversaries interacting with the “real” world that they expect to interact with. Such a notion of expected runtime is brittle because the expected runtime of the adversary may vary greatly when executing in some other world; however, this notion is the strongest for the purposes of our results because it will guarantee the same bounds for notions of expected running time which restrict the allowed adversaries more. See [10, 15] for interesting discussion of various ways to define expected polynomial time.

For many of the results of this paper, rather than directly measuring the runtime of the adversary we will look at the (worst-case or expected) number of oracle queries that it makes. The number of oracle queries can, of course, be upper bounded by the number of computational steps.

Useful lemmas. We will make use of Markov’s inequality and the Schwartz-Zippel Lemma, which we reproduce here.

Lemma 1

(Markov’s Inequality). Let X be a non-negative random variable and let \(c>0\) be a constant. Then

$$\begin{aligned} \mathsf {Pr}[X > c] \leqslant \mathsf {Pr}[X \geqslant c] \leqslant \mathsf {E}[X]/c. \end{aligned}$$

Lemma 2

(Schwartz-Zippel Lemma). Let \(\mathbb {F}\) be a finite field and let \(p\in \mathbb {F}[x_1,x_2,\dots ,x_n]\) be a non-zero polynomial of total degree \(d \geqslant 0\). Then

$$\begin{aligned} \mathsf {Pr}[p(r_1,\dots ,r_n)=0]\leqslant d/|\mathbb {F}| \end{aligned}$$

where the probability is over the choice of \(r_1,\dots ,r_n\), sampled independently and uniformly at random from \(\mathbb {F}\).
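As a quick empirical illustration (ours), the following snippet estimates the probability that a degree-2 polynomial over \(\mathbb {F}_{101}\) vanishes at a random point and compares it to the \(d/|\mathbb {F}|\) bound:

```python
import random

# Empirical check of Schwartz-Zippel over F_101 for p(x, y) = x*y - 1,
# a non-zero polynomial of total degree d = 2.
F, d, TRIALS = 101, 2, 200_000
rng = random.Random(0)
hits = sum((rng.randrange(F) * rng.randrange(F) - 1) % F == 0
           for _ in range(TRIALS))
print(hits / TRIALS, "<=", d / F)   # observed root rate vs. the d/|F| bound
```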

Fig. 1. Left: Game defining discrete log security of group \(\mathbb {G}\). Middle: Game defining discrete log relation security of group \(\mathbb {G}\). Right: Reduction adversary for Lemma 3.

Discrete Logarithm Assumptions. Let \(\mathbb {G}\) be a cyclic group of prime order p with identity \(1_\mathbb {G}\) and \(\mathbb {G}^*=\mathbb {G}\setminus \{1_{\mathbb {G}}\}\) be its set of generators. Let \((g_0,\dots ,g_n)\in \mathbb {G}^{n+1}\) and \((a_0,\dots ,a_n)\in \mathbb {Z}_p^{n+1}\). If \(\prod _{i=0}^n g_i^{a_i}=1_{\mathbb {G}}\) and at least one of the \(a_i\) is non-zero, this is said to be a non-trivial discrete log relation. It is believed to be hard to find non-trivial discrete log relations in cryptographic groups (when the \(g_i\) are chosen at random). We refer to computing \(\prod _{i=0}^n g_i^{a_i}\) as a multi-exponentiation of size \(n+1\).
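As a toy illustration (ours; a tiny group, not a cryptographic one), the following snippet exhibits a non-trivial discrete log relation in the order-11 subgroup of \(\mathbb {Z}_{23}^*\):

```python
# Toy example of a non-trivial discrete log relation: g = 2 has prime
# order q = 11 modulo p = 23.
p, q, g = 23, 11, 2
g0, g1 = pow(g, 3, p), pow(g, 5, p)        # g0 = g^3, g1 = g^5
a0, a1 = 5, 8                              # 5*3 + 8*5 = 55 = 0 (mod 11)
assert (pow(g0, a0, p) * pow(g1, a1, p)) % p == 1   # non-trivial relation
# Finding it was easy only because the discrete logs 3 and 5 were known;
# for random g_i with unknown logs, this is assumed to be hard.
```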

Discrete log relation security is defined by the game in the middle of Fig. 1. In it, the adversary \(\mathcal {A}\) is given a vector \(\mathbf {g}=({g}_0,\dots ,{g}_n)\) and attempts to find a non-trivial discrete log relation. We define \(\mathsf {Adv}^{\mathsf {dl}\text{- }\mathsf {rel}}_{\mathbb {G},n}(\mathcal {A})=\mathsf {Pr}[{ \textsf {H}}^{\mathsf {dl}\text{- }\mathsf {rel}}_{\mathbb {G},n}(\mathcal {A})]\). Normal discrete log security is defined by the game in the left panel of Fig. 1. In it, the adversary attempts to find the discrete log of \(h\in \mathbb {G}\) with respect to a generator \(g\in \mathbb {G}^*\). We define \(\mathsf {Adv}^{\mathsf {dl}}_{\mathbb {G}}(\mathcal {A})=\mathsf {Pr}[{ \textsf {H}}^{\mathsf {dl}}_{\mathbb {G}}(\mathcal {A})]\).

It is well known that discrete log relation security is asymptotically equivalent to discrete log security. The following lemma makes careful use of self-reducibility techniques to give a concrete bound showing that discrete log relation security is tightly implied by discrete log security.

Lemma 3

Let \(\mathbb {G}\) be a group of prime order p and \(n\geqslant 1\) be an integer. For any \(\mathcal {B}\), define \(\mathcal {C}\) as shown in Fig. 1. Then

$$\begin{aligned} \mathsf {Adv}^{\mathsf {dl}\text{- }\mathsf {rel}}_{\mathbb {G},n}(\mathcal {B}) \leqslant \mathsf {Adv}^{\mathsf {dl}}_{\mathbb {G}}(\mathcal {C}) + 1/p. \end{aligned}$$

The runtime of \(\mathcal {C}\) is that of \(\mathcal {B}\) plus the time to perform \(n+1\) multi-exponentiations of size 2 and some computations in the field \(\mathbb {Z}_p\).

The proof of this lemma is deferred to the full version of the paper [13].
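While the proof is deferred, the reduction in Fig. 1 is based on standard rerandomization; the following Python-style sketch (ours, over a hypothetical group interface, and not necessarily the paper’s exact adversary \(\mathcal {C}\)) conveys the idea and shows where the 1/p loss arises:

```python
import random

def C(p, g, h, mul, gexp, B, n):
    # Discrete log challenge: find x with h = g^x. Embed (g, h) into every
    # g_i with fresh randomizers, so each g_i is uniform in G and B's view
    # is exactly as in its own game.
    r = [random.randrange(p) for _ in range(n + 1)]
    s = [random.randrange(p) for _ in range(n + 1)]
    gs = [mul(gexp(g, r[i]), gexp(h, s[i])) for i in range(n + 1)]  # g^r * h^s
    a = B(gs)                    # non-trivial relation: prod g_i^{a_i} = 1
    if a is None:
        return None
    # 1 = prod g_i^{a_i} = g^{sum a_i*(r_i + s_i*x)}, hence
    # sum a_i*r_i + x * sum a_i*s_i = 0 (mod p); solve for x.
    num = -sum(ai * ri for ai, ri in zip(a, r)) % p
    den = sum(ai * si for ai, si in zip(a, s)) % p
    if den == 0:
        return None  # the s_i are hidden from B, so this has probability 1/p
    return num * pow(den, -1, p) % p   # modular inverse needs Python >= 3.8
```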

3 Bad Flag Analysis for Expected-Time Adversaries

In this section we show how to (somewhat) generically extend the standard techniques for analysis of “bad” flags from worst-case adversaries to expected-time adversaries. Such analysis is a fundamental tool for cryptographic proofs and has been formalized in various works [4, 17, 23]. Our results are tailored for the setting where the analysis of the bad flag is information theoretic (e.g., applications in ideal models), rather than reliant on computational assumptions.

We start by introducing our notation and model for identical-until-bad games in Sect. 3.1. Then in Sect. 3.2 we give the main theorem of this section which shows how to obtain bounds on the probability that an expected time adversary causes a bad flag to be set. Finally, in Sect. 3.3 we walk through some basic applications (collision-resistance and PRF security in the random oracle model and discrete log security in the generic group model) to show the analysis required for expected time adversaries follows from simple modifications of the techniques used for worst-case adversaries.

Fig. 2. Identical-until-bad games defined from game specification \((G,G')\).

3.1 Notation and Experiments for Identical-Until-Bad Games

Identical-until-bad games. Consider Fig. 2 which defines a pair of games \({ \textsf {G}}^{(G,G')}_0\) and \({ \textsf {G}}^{(G,G')}_1\) from a game specification \((G,G')\). Here G and \(G'\) are stateful randomized algorithms. At the beginning of the game, coins \(c_0\), \(c_1\), and \(c_\mathcal {A}\) are sampled uniformly at random. The first two of these are used by G and \(G'\) while the last is used by \(\mathcal {A}\). The counter t is initialized to 0, the flag \(\mathsf {bad}\) is set to \(\mathsf {false}\), and states s and \(s'\) are initialized for use by G and \(G'\).

During the execution of the game, the adversary \(\mathcal {A}\) repeatedly makes queries to the oracle \(\textsc {Orac}\). The variable t counts how many queries \(\mathcal {A}\) makes. As long as \(\mathsf {bad}\) is still \(\mathsf {false}\) (so \(\lnot {\mathsf {bad}}\) is \(\mathsf {true}\)), for each query made by \(\mathcal {A}\) the algorithm \(G'\) will be given this query to determine if \(\mathsf {bad}\) should be set to \(\mathsf {true}\). When \(b=1\), the behavior of \(\textsc {Orac}\) does not depend on whether \(\mathsf {bad}\) is set because the output of the oracle is always determined by running \(G(1,x:s;c_1,c_1)\). When \(b=0\), the output of the oracle is defined in the same way up until the point that \(\mathsf {bad}\) is set to \(\mathsf {true}\). Once that occurs, the output is instead determined by running \(G(0,x:s;c_1,c_0)\). Because these two games are identical except in the behavior of the code \(d\leftarrow b\) which is only executed once \(\mathsf {bad}=\mathsf {true}\), they are “identical-until-bad”.
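The following minimal Python rendering (ours; G, Gp, and the state objects are hypothetical stand-ins for G, \(G'\), s, and \(s'\)) mirrors the \(\textsc {Orac}\) logic just described:

```python
def make_orac(G, Gp, b, c0, c1, s, sp):
    st = {"t": 0, "bad": False, "d": 1}
    def orac(x):
        st["t"] += 1                      # t counts A's queries
        if not st["bad"] and Gp(x, sp, c0, c1):
            st["bad"] = True
            st["d"] = b                   # the only point where b is used
        if st["d"] == 1:
            return G(1, x, s, c1, c1)     # output depends on c1 alone
        return G(0, x, s, c1, c0)         # after bad is set in the b=0 game
    return orac, st
```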

In this section, the goal of the adversary is to cause \(\mathsf {bad}\) to be set to \(\mathsf {true}\). Bounding the probability that \(\mathcal {A}\) succeeds in this can be used to analyze security notions in two different ways. For indistinguishability-based security notions (e.g., PRG or PRF security), the two games \({ \textsf {G}}_b\) would correspond to the two worlds the adversary is attempting to distinguish between. For other security notions (e.g., collision resistance or discrete log security), we think of one of the \({ \textsf {G}}_b\) as corresponding to the game the adversary is trying to win and the other as corresponding to a related “ideal” world in which the adversary’s success probability can easily be bounded. In either case, the fundamental lemma of game playing [4] can be used to bound the advantage of the adversary using a bound on the probability that \(\mathsf {bad}\) is set.

A combined experiment. For our coming analysis it will be useful to relate executions of \({ \textsf {G}}^{(G,G')}_0(\mathcal {A})\) and \({ \textsf {G}}^{(G,G')}_1(\mathcal {A})\) to each other. For this we can think of a single combined experiment in which we sample \(c_0\), \(c_1\), and \(c_{\mathcal {A}}\) once and then run both games separately using these coins.

For \(b\in \{0,1\}\), we let \(Q_b^{\mathcal {A}}\) be a random variable denoting how many oracle queries \(\mathcal {A}\) makes in the execution of \({ \textsf {G}}^{(G,G')}_b(\mathcal {A})\) during this experiment. We let \(\mathsf {BAD}^{\mathcal {A}}_t[b]\) denote the event that \(G'\) sets \(\mathsf {bad}_t\) to \(\mathsf {true}\) in the execution of \({ \textsf {G}}^{(G,G')}_b(\mathcal {A})\). Note that \(\mathsf {BAD}^{\mathcal {A}}_t[0]\) will occur if and only if \(\mathsf {BAD}^{\mathcal {A}}_t[1]\) occurs, because the behaviors of both games are identical up until the first time that \(\mathsf {bad}\) is set and \(G'\) is never again executed once \(\mathsf {bad}\) is \(\mathsf {true}\). Hence we can simplify notation by defining \(\mathsf {BAD}^{\mathcal {A}}_t\) to be identical to the event \(\mathsf {BAD}^{\mathcal {A}}_t[0]\), while keeping in mind that we can equivalently think of this event as occurring in the execution of either game. We additionally define the event that \(\mathsf {bad}\) is ever set \(\mathsf {BAD}^{\mathcal {A}}=\bigvee _{i=1}^\infty \mathsf {BAD}^{\mathcal {A}}_i\), the event that \(\mathsf {bad}\) is set by one of the first j queries the adversary makes \(\mathsf {BAD}^{\mathcal {A}}_{\leqslant j}=\bigvee _{i=1}^j \mathsf {BAD}^{\mathcal {A}}_i\), and the event that \(\mathsf {bad}\) is set after the j-th query the adversary makes \(\mathsf {BAD}^{\mathcal {A}}_{>j}=\bigvee _{i=j+1}^\infty \mathsf {BAD}^{\mathcal {A}}_i\). Clearly, \(\mathsf {Pr}[\mathsf {BAD}^{\mathcal {A}}]=\mathsf {Pr}[\mathsf {BAD}^{\mathcal {A}}_{\leqslant j}]+\mathsf {Pr}[\mathsf {BAD}^{\mathcal {A}}_{>j}].\) Again we can equivalently think of these events as occurring in either game. When the adversary is clear from context we may choose to omit it from the superscript in our notation.

The fact that both games behave identically until \(\mathsf {bad}\) is set \(\mathsf {true}\) allows us to make several nice observations. If \(\mathsf {BAD}\) does not hold, then \(Q_0=Q_1\) must hold. If \(\mathsf {BAD}_t\) holds for some t, then both \(Q_0\) and \(Q_1\) must be at least t. One implication of this is that if \(Q_1=q\) holds for some q, then \(\mathsf {BAD}\) is equivalent to \(\mathsf {BAD}_{\leqslant q}\). Additionally, we can see that \(\mathsf {Pr}[\mathsf {BAD}_{>q}]\leqslant \mathsf {Pr}[Q_b>q]\) must hold.

Defining our events and random variables in this single experiment will later allow us to consider the expectation \(\mathsf {E}[Q_0^d | Q_1 = q]\) for some \(d,q\in \mathbb {N}\). In words, that is the expected value of \(Q_0\) raised to the d-th power, conditioned on \(c_0,c_1,c_\mathcal {A}\) having been chosen so that \(Q_1 = q\) holds. Since \(Q_0\) and \(Q_1\) can only differ if \(\mathsf {BAD}\) occurs, we will be able to use \(\mathsf {Pr}[\mathsf {BAD}|Q_1=q]\) to bound how far \(\mathsf {E}[Q_0^d | Q_1 = q]\) can be from \(\mathsf {E}[Q_1^d | Q_1 = q]=q^d\).

\(\delta \)-boundedness. Existing analysis of identical-until-bad games is done by assuming a worst-case bound \(q_\mathcal {A}\) on the number of oracle queries that \(\mathcal {A}\) makes (in either game). Given such a bound, one shows that \(\mathsf {Pr}[\mathsf {BAD}^{\mathcal {A}}] \leqslant \delta (q_{\mathcal {A}})\) for some function \(\delta \). We will say that a game specification \((G,G')\) is \(\delta \)-bounded if for all \(\mathcal {A}\) and \(q\in \mathbb {N}\) we have that

$$\begin{aligned} \mathsf {Pr}[\mathsf {BAD}^{\mathcal {A}} | Q_1 = q] \leqslant \delta (q). \end{aligned}$$

As observed earlier, if \(Q_1=q\) holds then \(\mathsf {bad}_t\) cannot be set for any \(t>q\). Hence \(\mathsf {Pr}[\mathsf {BAD}^{\mathcal {A}} | Q_1=q] = \mathsf {Pr}[\mathsf {BAD}^{\mathcal {A}}_{\leqslant q} | Q_1=q]\).

We will, in particular, be interested in the case that \(\delta (q)=\varDelta \cdot q^d/N\) for some \(\varDelta ,d,N\geqslant 1\). We think of \(\varDelta \) and d as “small” and of N as “large”. The main result of this section bounds the probability that an adversary sets \(\mathsf {bad}\) by \(O\left( \root d \of {\delta \left( \mathsf {E}[Q_b] \right) }\right) \) for either b if \((G,G')\) is \(\delta \)-bounded for such a \(\delta \).

While \(\delta \)-boundedness may seem to be a strange condition, we show in Sect. 3.3 that the existing techniques for proving results of the form \(\mathsf {Pr}[\mathsf {BAD}^{\mathcal {A}}] \leqslant \delta (q_{\mathcal {A}})\) for \(\mathcal {A}\) making at most \(q_{\mathcal {A}}\) queries can often be easily extended to show the \(\delta \)-boundedness of a game \((G,G')\). The examples we consider are the collision-resistance and PRF security of a random oracle and the security of discrete log in the generic group model. In particular, these examples all possess a common form. First, we note that the output of \(G(1,\dots )\) is independent of \(c_0\). Consequently, the view of \(\mathcal {A}\) when \(b=1\) is independent of \(c_0\) and hence \(Q_1\) is independent of \(c_0\). To analyze \(\mathsf {Pr}[\mathsf {BAD}| Q_1 = q]\) we can then think of \(c_{\mathcal {A}}\) and \(c_1\) as fixed (fixing the transcript of interaction between \(\mathcal {A}\) and its oracle in \({ \textsf {G}}^{(G,G')}_1\)) and argue that for any such length-q interaction the probability of \(\mathsf {BAD}\) is bounded by \(\delta (q)\) over a random choice of \(c_0\).

We note that this general form seems to typically be implicit in the existing analysis of \(\mathsf {bad}\) flags for the statistical problems one comes across in ideal model analysis, but would not extend readily to examples where the probability of the bad flag being set is reduced to the probability of an adversary breaking some computational assumption.

3.2 Expected-Time Bound from \(\delta \)-boundedness

We can now state our result lifting \(\delta \)-boundedness to a bound on the probability that an adversary sets \(\mathsf {bad}\) given only its expected number of oracle queries.

Theorem 1

Let \(\delta (q)=\varDelta \cdot q^d/N\) for \(\varDelta ,d,N\geqslant 1\). Let \((G,G')\) be a \(\delta \)-bounded game specification. If \(N \geqslant \varDelta \cdot 6^d\), then for any \(\mathcal {A}\),

$$\begin{aligned} \mathsf {Pr}[\mathsf {BAD}^\mathcal {A}] \leqslant 5\root d \of {\frac{\varDelta \cdot \mathsf {E}[Q^\mathcal {A}_0]^d}{N}} = 5\root d \of {\delta \left( \mathsf {E}[Q^\mathcal {A}_0] \right) }. \end{aligned}$$

If \(N \geqslant \varDelta \cdot 2^d\), then for any \(\mathcal {A}\),

$$\begin{aligned} \mathsf {Pr}[\mathsf {BAD}^\mathcal {A}] \leqslant 3\root d \of {\frac{\varDelta \cdot \mathsf {E}[Q^\mathcal {A}_1]^d}{N}} = 3\root d \of {\delta \left( \mathsf {E}[Q^\mathcal {A}_1] \right) }. \end{aligned}$$

We provide bounds based on the expected runtime in either of the two games since they are not necessarily the same. Typically, one of the two games will correspond to a “real” world and it will be natural to desire a bound in terms of the expected runtime in that game. The proof for \(Q_0\) is slightly more complex and is given in this section. The proof for \(Q_1\) is simpler and deferred to the full version of this paper [13]. In the full version we show via a simple attack that the d-th root in these bounds is necessary.

Proof

(of Theorem 1). Let \(u=2^{-d}\) and \(U=\left\lfloor \root d \of {Nu/\varDelta } \right\rfloor \). Note that \(\delta (U)\leqslant u\). Now let \(\mathcal {B}\) be an adversary that runs exactly like \(\mathcal {A}\), except that it counts the number of oracle queries made by \(\mathcal {A}\) and halts execution if \(\mathcal {A}\) attempts to make a \((U+1)\)-th query. We start our proof by bounding the probability of \(\mathsf {BAD}^{\mathcal {A}}\) by the probability of \(\mathsf {BAD}^{\mathcal {B}}\) plus an \(O\left( \root d \of {\delta \left( \mathsf {E}[Q^\mathcal {A}_0] \right) }\right) \) term obtained by applying Markov’s inequality. In particular, we perform the calculations

$$\begin{aligned} \mathsf {Pr}[\mathsf {BAD}^\mathcal {A}]&= \mathsf {Pr}[\mathsf {BAD}^{\mathcal {A}}_{\leqslant U}]+\mathsf {Pr}[\mathsf {BAD}^{\mathcal {A}}_{>U}] \end{aligned}$$
(3)
$$\begin{aligned}&= \mathsf {Pr}[\mathsf {BAD}^{\mathcal {B}}_{\leqslant U}]+\mathsf {Pr}[\mathsf {BAD}^{\mathcal {A}}_{>U}] \end{aligned}$$
(4)
$$\begin{aligned}&\leqslant \mathsf {Pr}[\mathsf {BAD}^\mathcal {B}] + \mathsf {Pr}\left[ Q_0^\mathcal {A}> U \right] \end{aligned}$$
(5)
$$\begin{aligned}&\leqslant \mathsf {Pr}[\mathsf {BAD}^\mathcal {B}] + \mathsf {E}[Q_0^\mathcal {A}]/U \end{aligned}$$
(6)
$$\begin{aligned}&\leqslant \mathsf {Pr}[\mathsf {BAD}^\mathcal {B}] + 3\mathsf {E}[Q_0^\mathcal {A}]\root d \of {\varDelta /N}. \end{aligned}$$
(7)

Step 4 follows because for all queries up to the U-th, adversary \(\mathcal {B}\) behaves identically to \(\mathcal {A}\) (and thus \(\mathsf {BAD}^{\mathcal {A}}_i=\mathsf {BAD}^{\mathcal {B}}_i\) for \(i\leqslant U\)). Step 5 follows because \(\mathsf {BAD}^\mathcal {B}_{>U}\) cannot occur (because \(\mathcal {B}\) never makes more than U queries) and \(\mathsf {BAD}^\mathcal {A}_{>U}\) can only occur if \(Q^\mathcal {A}_0\) is greater than U. Step 6 follows from Markov’s inequality. Step 7 follows from the following calculation which uses the assumption that \(N \geqslant \varDelta \cdot 6^d\) and that \(u=2^{-d}\),

$$\begin{aligned} U&=\left\lfloor \root d \of {Nu/\varDelta } \right\rfloor \geqslant \root d \of {Nu/\varDelta } -1 = \root d \of {N/\varDelta } \left( \root d \of {u}- \root d \of {\varDelta /N}\right) \\&\geqslant \root d \of {N/\varDelta } \left( \root d \of {2^{-d}}- \root d \of {\varDelta /(\varDelta \cdot 6^d)}\right) =\root d \of {N/\varDelta } \left( 1/2 - 1/6\right) . \end{aligned}$$

In the rest of the proof we need to establish that \(\mathsf {Pr}[\mathsf {BAD}^\mathcal {B}]\leqslant 2\mathsf {E}[Q_0^\mathcal {A}]\root d \of {\varDelta /N}\). We will in fact show this with \(\mathsf {E}[Q_0^\mathcal {B}]\) in place of \(\mathsf {E}[Q_0^\mathcal {A}]\), which suffices because the former is clearly upper bounded by the latter. We will do this by first bounding \(\mathsf {Pr}[\mathsf {BAD}^\mathcal {B}]\) in terms of \(\mathsf {E}[(Q_1^\mathcal {B})^d]\), then bounding \(\mathsf {E}[(Q_1^\mathcal {B})^d]\) in terms of \(\mathsf {E}[(Q_0^\mathcal {B})^d]\), and then concluding by bounding this in terms of \(\mathsf {E}[Q_0^\mathcal {B}]\). For the first of these steps we expand \(\mathsf {Pr}[\mathsf {BAD}^{\mathcal {B}}]\) by conditioning on all possible values of \(Q_1^{\mathcal {B}}\) and applying our assumption that \((G,G')\) is \(\delta \)-bounded to get

$$\begin{aligned} \mathsf {Pr}[\mathsf {BAD}^{\mathcal {B}}]&= \sum _{q= 1}^U \mathsf {Pr}[\mathsf {BAD}^{\mathcal {B}} | Q^{\mathcal {B}}_1=q] \mathsf {Pr}[Q^{\mathcal {B}}_1=q] \leqslant \sum _{q= 1}^{U} (\varDelta \cdot q^d/N)\mathsf {Pr}[Q^{\mathcal {B}}_1=q]\\&= \varDelta /N \sum _{q= 1}^{U}q^d\mathsf {Pr}[Q^{\mathcal {B}}_1=q] = \varDelta \mathsf {E}[(Q^{\mathcal {B}}_1)^d]/N. \end{aligned}$$

So next we will bound \(\mathsf {E}[(Q^{\mathcal {B}}_1)^d]\) in terms of \(\mathsf {E}[(Q^{\mathcal {B}}_0)^d]\). To start, we will give a lower bound for \(\mathsf {E}[(Q^{\mathcal {B}}_0)^d | Q_1^{\mathcal {B}}=q]\) (when \(q\leqslant U\)) by using our assumption that \((G,G')\) is \(\delta \)-bounded. Let \(R_0\) be a random variable which equals \(Q^{\mathcal {B}}_0\) if \(\mathsf {BAD}^{\mathcal {B}}\) does not occur and equals 0 otherwise. Clearly \(R_0\leqslant Q^{\mathcal {B}}_0\) always. Recall that if \(\mathsf {BAD}^{\mathcal {B}}\) does not occur, then \(Q^{\mathcal {B}}_0=Q^{\mathcal {B}}_1\) (and hence \(R_0=Q^{\mathcal {B}}_1\)) must hold. We obtain

$$\begin{aligned} \mathsf {E}[(Q_0^{\mathcal {B}})^d | Q_1^{\mathcal {B}}=q]&\geqslant \mathsf {E}[R_0^d | Q^{\mathcal {B}}_1=q]\\&= q^d \mathsf {Pr}[\lnot {\mathsf {BAD}}^{\mathcal {B}} | Q^{\mathcal {B}}_1=q] + 0^d \mathsf {Pr}[\mathsf {BAD}^{\mathcal {B}} | Q^{\mathcal {B}}_1=q]\\&= q^d (1-\mathsf {Pr}[\mathsf {BAD}^{\mathcal {B}} | Q^{\mathcal {B}}_1=q])\\&\geqslant q^d (1-\delta (q)) \geqslant q^d(1-u). \end{aligned}$$

The last step used that \(\delta (q)\leqslant \delta (U)\leqslant u\) because \(q\leqslant U\).

Now we proceed by expanding \(\mathsf {E}[(Q_1^{\mathcal {B}})^d]\) by conditioning on the possible value of \(Q_1^{\mathcal {B}}\) and using the above bound to switch \(\mathsf {E}[(Q_0^{\mathcal {B}})^d | Q^{\mathcal {B}}_1=q]\) in for \(q^d\). This gives,

$$\begin{aligned} \mathsf {E}[(Q_1^{\mathcal {B}})^d]&=\sum _{q= 1}^U q^d\cdot \mathsf {Pr}[Q_1^{\mathcal {B}}=q]\\&=\sum _{q=1}^U\mathsf {E}[(Q_0^{\mathcal {B}})^d | Q_1^{\mathcal {B}}=q]\cdot \frac{q^d}{\mathsf {E}[(Q_0^{\mathcal {B}})^d | Q_1^{\mathcal {B}}=q]}\cdot \mathsf {Pr}[Q_1^{\mathcal {B}}=q]\\&\leqslant \sum _{q= 1}^U\mathsf {E}[(Q_0^{\mathcal {B}})^d | Q_1^{\mathcal {B}}=q]\cdot \frac{q^d}{ q^d(1-u)}\cdot \mathsf {Pr}[Q_1^{\mathcal {B}}=q]\\&=(1-u)^{-1}\mathsf {E}[(Q_0^{\mathcal {B}})^d] \end{aligned}$$

Our calculations so far give us that \(\mathsf {Pr}[\mathsf {BAD}^\mathcal {B}]\leqslant (1-u)^{-1}\mathsf {E}[(Q^\mathcal {B}_0)^d]\cdot \varDelta /N\). We need to show that this is bounded by \(2\mathsf {E}[Q_0^\mathcal {B}]\root d \of {\varDelta /N}\). First note that \(Q^\mathcal {B}_0\leqslant U\) always holds by the definition of \(\mathcal {B}\), so

$$\begin{aligned} (1-u)^{-1}\mathsf {E}[(Q^\mathcal {B}_0)^d]\cdot \varDelta /N \leqslant (1-u)^{-1}\mathsf {E}[Q^\mathcal {B}_0]\cdot U^{d-1}\cdot \varDelta /N. \end{aligned}$$

Now since \(U=\left\lfloor \root d \of {Nu/\varDelta } \right\rfloor \), we have \(U^{d-1}\leqslant (Nu/\varDelta )^{(d-1)/d}\) which gives

$$\begin{aligned} (1-u)^{-1}\mathsf {E}[Q^\mathcal {B}_0]\cdot U^{d-1}\cdot \varDelta /N \leqslant (1-u)^{-1} (u^{(d-1)/d})\mathsf {E}[Q^\mathcal {B}_0]\root d \of {\varDelta /N} . \end{aligned}$$

Finally, recall that we set \(u=2^{-d}\) and so

$$\begin{aligned} (1-u)^{-1} (u^{(d-1)/d})&= \frac{2^{-d\cdot (d-1)/d}}{1-2^{-d} } = \frac{2^{1-d}}{1-2^{-d} } \leqslant \frac{2^{1-1}}{1-2^{-1}} =2. \end{aligned}$$

Bounding \(\mathsf {E}[Q^\mathcal {B}_0]\leqslant \mathsf {E}[Q^\mathcal {A}_0]\) and combining with our original bound on \(\mathsf {Pr}[\mathsf {BAD}^{\mathcal {A}}]\) completes the proof. \(\square \)
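As a sanity check on the arithmetic in this proof, the following short script (ours) numerically verifies the two constant computations, namely that \(U \geqslant \frac{1}{3}\root d \of {N/\varDelta }\) when \(N \geqslant \varDelta \cdot 6^d\), and that \((1-u)^{-1}u^{(d-1)/d} \leqslant 2\):

```python
from math import floor

# Numeric check of the two constant computations in the proof of Theorem 1:
# (i)  U = floor((N*u/Delta)^(1/d)) >= (1/3)*(N/Delta)^(1/d) for N >= Delta*6^d,
# (ii) (1-u)^{-1} * u^{(d-1)/d} <= 2, with u = 2^{-d}.
for d in range(1, 12):
    u = 2.0 ** -d
    assert (1 / (1 - u)) * u ** ((d - 1) / d) <= 2
    for N_over_Delta in (6**d, 6**d + 1, 10**d):
        U = floor((N_over_Delta * u) ** (1 / d) + 1e-9)  # eps guards rounding
        assert U >= N_over_Delta ** (1 / d) / 3
print("constants verified for d = 1..11")
```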

3.3 Example Applications of Bad Flag Analysis

In this section we walk through some basic examples to show how a bound of \(\mathsf {Pr}[\mathsf {bad}| Q_1 = q] \leqslant \varDelta \cdot q^d/N\) can be proven using essentially the same techniques as typical bad flag analysis for worst-case runtime, allowing Theorem 1 to be applied. All of our examples follow the basic structure discussed earlier in this section. We write the analysis in terms of two games which are identical-until-bad and parameterized by a bit b. In the \(b=1\) game, the output of its oracles will depend on some coins we identify as \(c_1\), while in the \(b=0\) case the output will depend on both \(c_1\) and independent coins we identify as \(c_0\). Then we think of fixing coins \(c_1\) and the coins used by the adversary, which together fix \(Q_1\) (the number of queries \(\mathcal {A}\) would make in the \(b=1\) case), and argue a bound on the probability that \(\mathsf {bad}\) is set over a random choice of \(c_0\).

We write the necessary games in convenient pseudocode and leave the mapping to a game specification \((G,G')\) to apply Theorem 1 implicit. We will abuse notation and use the name of our pseudocode game to refer to the corresponding game specification.

Collision-resistance of a random oracle. Our first example is the collision resistance of a random oracle. Here an adversary is given access to a random function \(h:\{0,1\}^*\rightarrow [N]\). It wins if it can find \(x\ne y\) for which \(h(x)=h(y)\), i.e., a collision in the random oracle. One way to express this is by the game \({ \textsf {H}}^{\mathsf {cr}}_0\) shown in Fig. 3. The random oracle is represented by the oracle \(\textsc {Ro}\) and the oracle \(\textsc {Fin}\) allows the adversary to submit supposed collisions.

Fig. 3. Game capturing collision-resistance of a random oracle (when \(b=0\)).

Fig. 4. Games capturing PRF security of a random oracle.

In it, we have written \(\textsc {Ro}\) in a somewhat atypical way to allow comparison to \({ \textsf {H}}^{\mathsf {cr}}_1\), with which it is identical-until-bad. The coins used by these games determine a permutation \(\pi \) sampled at the beginning of the game and a value X chosen at random from [N] during each \(\textsc {Ro}\) query. We think of the former as \(c_1\) and the latter as \(c_0\). Ignoring repeat queries, in \({ \textsf {H}}^{\mathsf {cr}}_1\) the output of \(\textsc {Ro}\) is simply \(\pi [1],\pi [2],\dots \) in order. Thus clearly, \(\mathsf {Pr}[{ \textsf {H}}^{\mathsf {cr}}_1(\mathcal {A})]=0\) since there are no collisions in \(\textsc {Ro}\). In \({ \textsf {H}}^{\mathsf {cr}}_0\) the variable X modifies the output of \(\textsc {Ro}\) to provide colliding outputs with the correct distribution.

These games are identical-until-bad, so the fundamental lemma of game playing [4] gives us,

$$\begin{aligned} \mathsf {Pr}[{ \textsf {H}}^{\mathsf {cr}}_0(\mathcal {A})]&\leqslant \mathsf {Pr}[{ \textsf {H}}^{\mathsf {cr}}_0(\mathcal {A}) \text{ sets } \mathsf {bad}] + \mathsf {Pr}[{ \textsf {H}}^{\mathsf {cr}}_1(\mathcal {A})] = \mathsf {Pr}[{ \textsf {H}}^{\mathsf {cr}}_0(\mathcal {A}) \text{ sets } \mathsf {bad}]. \end{aligned}$$

Now think of the adversary’s coins and the choice of \(\pi \) as fixed. This fixes a value of \(Q_1\) and a length-\(Q_1\) transcript of \(\mathcal {A}\)’s queries in \({ \textsf {H}}^{\mathsf {cr}}_1(\mathcal {A})\). If \(\mathcal {A}\) made all of its queries to \(\textsc {Fin}\), then \(\textsc {Ro}\) will have been executed \(2Q_1\) times. On the i-th query to \(\textsc {Ro}\), there is at most an \((i-1)/N\) probability that the choice of X will cause \(\mathsf {bad}\) to be set. A union bound over at most 2q executions of \(\textsc {Ro}\), i.e., \(\sum _{i=1}^{2q}(i-1)/N\), gives

$$\begin{aligned} \mathsf {Pr}[\mathsf {BAD}| Q_1 = q] \leqslant q(2q-1)/N. \end{aligned}$$

Setting \(\delta (q)= 2q^2/N\) we have that \({ \textsf {H}}^{\mathsf {cr}}\) is \(\delta \)-bounded, so Theorem 1 gives

$$\begin{aligned} \mathsf {Pr}[{ \textsf {H}}^{\mathsf {cr}}_0(\mathcal {A})] \leqslant 5\root 2 \of {\frac{2 \cdot \mathsf {E}[Q^\mathcal {A}_0]^2}{N}}. \end{aligned}$$

Pseudorandomness of a random oracle. Now consider using a random oracle with domain \([N]\times \mathcal {D}\) and range \(\mathcal {R}\) as a pseudorandom function. The games for this are shown in Fig. 4. The real world is captured by \(b=0\) (because the output of the random oracle \(\textsc {Ro}\) is made to be consistent with the output of the real-or-random oracle \(\textsc {Ror}\)) and the ideal world by \(b=1\).

The coins of the game are random tables T and F as well as a random key K. We think of the key as \(c_0\) and the tables as \(c_1\). Because we have written the games so that the consistency check occurs in \(\textsc {Ro}\), we can clearly see the output of the oracles in \({ \textsf {H}}^{\mathsf {prf}}_1\) are independent of \(c_0=K\).

These games are identical-until-bad so from the fundamental lemma of game playing we have,

$$\begin{aligned} \mathsf {Pr}[{ \textsf {H}}^{\mathsf {prf}}_0(\mathcal {A})]-\mathsf {Pr}[{ \textsf {H}}^{\mathsf {prf}}_1(\mathcal {A})]&\leqslant \mathsf {Pr}[{ \textsf {H}}^{\mathsf {prf}}_0(\mathcal {A}) \text{ sets } \mathsf {bad}]. \end{aligned}$$

Now we think of \(c_1\) and the coins of \(\mathcal {A}\) as fixed. Over a random choice of K, each \(\textsc {Ro}\) query has a 1/N chance of setting \(\mathsf {bad}\). By a simple union bound we get,

$$\begin{aligned} \mathsf {Pr}[\mathsf {BAD}| Q_1 = q] \leqslant q/N. \end{aligned}$$

Defining \(\delta (q)=q/N\) we have that \({ \textsf {H}}^{\mathsf {prf}}\) is \(\delta \)-bounded, so Theorem 1 gives

$$\begin{aligned} \mathsf {Pr}[{ \textsf {H}}^{\mathsf {prf}}_0(\mathcal {A})]-\mathsf {Pr}[{ \textsf {H}}^{\mathsf {prf}}_1(\mathcal {A})] \leqslant 5\cdot \mathsf {E}[Q^\mathcal {A}_0]/N. \end{aligned}$$

Discrete logarithm security in the generic group model. Next we consider discrete logarithm security in the generic group model for a prime order group \(\mathbb {G}\) with generator g. One way to express this is by the game \({ \textsf {H}}^{\mathsf {dl}}_0\) shown in Fig. 5. In this expression, the adversary is given labels for the group elements it handles based on the time that this group element was generated by the adversary. The more general framing of the generic group model where \(g^x\in \mathbb {G}\) is labeled by \(\sigma (x)\) for a randomly chosen \(\sigma :\mathbb {Z}_{|\mathbb {G}|}\rightarrow \{0,1\}^l\) for some \(l\geqslant \lceil \log {|\mathbb {G}|}\rceil \) can easily be reduced to this version of the game.

At the beginning of the game polynomials \(p_0(\cdot )=0\), \(p_1(\cdot )=1\), and \(p_2(\cdot )=X\) are defined. These are polynomials of the symbolic variable X, defined over \(\mathbb {Z}_{|\mathbb {G}|}\). Then a random x is sampled and the goal of the adversary is to find this x. Throughout the game, a polynomial \(p_i\) represents the group element \(g^{p_i(x)}\). Hence \(p_0\) represents the identity element of the group, \(p_1\) represents the generator g, and \(p_2\) represents \(g^x\). We think of the subscript of a polynomial as the adversary’s label for the corresponding group element. The variable t tracks the highest label the adversary has used so far.

We let \(\mathcal {P}^i\) denote the set of the first i polynomials that have been generated and \(\mathcal {P}^i_x\) be the set of their outputs when evaluated on x. The oracle \(\textsc {Init}\) tells the adversary if x happened to be 0 or 1 by returning the appropriate value of \(\ell \). The oracle \(\textsc {Op}\) allows the adversary to perform multi-exponentiations. It specifies a vector \(\mathbf {j}\) of labels for group elements and a vector \(\mathbf {\alpha }\) of coefficients. The variable t is incremented and its new value serves as the label for the group element \(\prod _i g_{\mathbf {j}[i]}^{\mathbf {\alpha }[i]}\) where \(g_{\mathbf {j}[i]}\) is the group element with label \(\mathbf {j}[i]\), i.e., \(g^{p_{\mathbf {j}[i]}(x)}\). The returned value \(\ell \) is set equal to the prior label of a group element which equals this new group element (if \(\ell =t\), then no prior labels represented the same group element).

Fig. 5. Game capturing discrete logarithm security of a generic group (when \(b=0\)). For \(i\in \mathbb {N}\) and \(x\in \mathbb {Z}_{|\mathbb {G}|}\), we use the notation \(\mathcal {P}^i = \{ p_0,\dots ,p_i\}\subset \mathbb {Z}_{|\mathbb {G}|}[X]\) and \(\mathcal {P}^i_x=\{p(x) : p\in \mathcal {P}^i\}\subset \mathbb {Z}_{|\mathbb {G}|}\).

The only coins of this game are the choice of x, which we think of as \(c_0\). In \({ \textsf {H}}^{\mathsf {dl}}_1\), the adversary is never told when two labels it handles non-trivially represent the same group element, so the view of \(\mathcal {A}\) is independent of \(c_0\), as desired. Because the view of \(\mathcal {A}\) is independent of x when \(b=1\) we have that \(\mathsf {Pr}[{ \textsf {H}}^{\mathsf {dl}}_1(\mathcal {A})]=1/|\mathbb {G}|\).
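Since every polynomial tracked by the game is affine in X, the game admits a compact implementation. The following sketch (ours; it omits the \(\textsc {Init}\) oracle and the final discrete log guess) represents \(p_i = a_i + b_i X\) as a pair \((a_i, b_i)\) and mirrors \(\textsc {Op}\):

```python
import random

class GGMGame:
    def __init__(self, order, b):
        self.n, self.b = order, b
        self.x = random.randrange(order)        # the only coins, i.e. c0
        self.polys = [(0, 0), (1, 0), (0, 1)]   # p0 = 0, p1 = 1, p2 = X
        self.bad = False

    def _label(self, poly):
        t = len(self.polys)                     # next label
        self.polys.append(poly)
        val = (poly[0] + poly[1] * self.x) % self.n
        ell = t
        for i, (a, c) in enumerate(self.polys[:t]):
            same_poly = (a, c) == poly
            same_elem = (a + c * self.x) % self.n == val
            if same_elem and not same_poly:
                self.bad = True      # distinct polynomials collide at x
            if ell == t and (same_poly or (self.b == 0 and same_elem)):
                ell = i              # earliest prior label of an equal element
        return ell

    def op(self, js, alphas):
        # multi-exponentiation prod_i g_{js[i]}^{alphas[i]}, i.e. a linear
        # combination of the corresponding affine polynomials
        a = sum(al * self.polys[j][0] for j, al in zip(js, alphas)) % self.n
        c = sum(al * self.polys[j][1] for j, al in zip(js, alphas)) % self.n
        return self._label((a, c))
```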

From the fundamental lemma of game playing,

$$\begin{aligned} \mathsf {Pr}[{ \textsf {H}}^{\mathsf {dl}}_0(\mathcal {A})]&\leqslant \mathsf {Pr}[{ \textsf {H}}^{\mathsf {dl}}_0(\mathcal {A}) \text{ sets } \mathsf {bad}] + \mathsf {Pr}[{ \textsf {H}}^{\mathsf {dl}}_1(\mathcal {A})] = \mathsf {Pr}[{ \textsf {H}}^{\mathsf {dl}}_0(\mathcal {A}) \text{ sets } \mathsf {bad}] + 1/|\mathbb {G}|. \end{aligned}$$

Now thinking of the coins of \(\mathcal {A}\) as fixed, this fixes a value of \(Q_1\) and a length-\(Q_1\) transcript of queries that would occur in \({ \textsf {H}}^{\mathsf {dl}}_1(\mathcal {A})\). This in turn fixes the set of polynomials \(\mathcal {P}^{Q_1+2}\). The flag \(\mathsf {bad}\) will be set iff any of the polynomials in the set

$$\begin{aligned} \{ p(\cdot )-r(\cdot ) | p\ne r\in \mathcal {P}^{Q_1+2}\} \end{aligned}$$

have the value 0 when evaluated on x. Note these polynomials are non-zero and have degree at most 1. Thus, applying the Schwartz-Zippel lemma and a union bound we get,

$$\begin{aligned} \mathsf {Pr}[\mathsf {BAD}| Q_1 = q] \leqslant \binom{q+3}{2} \cdot (1/|\mathbb {G}|) \leqslant 6q^2/|\mathbb {G}|. \end{aligned}$$

Note the bound trivially holds when \(q=0\), since \(\mathsf {Pr}[\mathsf {bad}| Q_1 = q]=0\), so we have assumed \(q\geqslant 1\) for the second bound. Defining \(\delta (q)=6 q^2/|\mathbb {G}|\) we have that \({ \textsf {H}}^{\mathsf {dl}}\) is \(\delta \)-bounded, so Theorem 1 gives

$$\begin{aligned} \mathsf {Pr}[{ \textsf {H}}^{\mathsf {dl}}_0(\mathcal {A})] \leqslant 5\root 2 \of {\frac{6 \cdot \mathsf {E}[Q^\mathcal {A}_0]^2}{|\mathbb {G}|}} + \frac{1}{|\mathbb {G}|}. \end{aligned}$$

4 Concrete Security for a Forking Lemma

In this section we apply our techniques to obtaining concrete bounds on the soundness of proof systems. Of particular interest to us will be proof systems that can be proven to achieve a security notion known as witness-extended emulation via a very general “Forking Lemma” introduced by Bootle, Cerulli, Chaidos, Groth, and Petit (BCCGP) [6]. Some examples include Bulletproofs [7], Hyrax [24], and Supersonic [8]. Our expected-time techniques arise naturally for these proof systems because witness-extended emulation requires the existence of an expected-time emulator \(\mathsf {E}\) for a proof system which is given oracle access to a cheating prover and produces transcripts with the same distribution as the cheating prover, but additionally provides a witness w for the statement being proven whenever it outputs an accepting transcript.

In this section we use a new way of expressing witness-extended emulation as a special case of a more general notion we call predicate-extended emulation. The more general notion will serve as a clean, modular way to provide a concrete security version of the BCCGP forking lemma. This modularity allows us to hone in on the steps where our expected time analysis can be applied to give concrete bounds and avoid some technical issues with the original BCCGP formulation of the lemma.

In the BCCGP blueprint, the task of witness-extended emulation is divided into two parts: a generic tree-extended emulator which, for any public-coin proof system, produces transcripts with the same distribution as a cheating prover together with a set of accepting transcripts satisfying a certain tree structure; and an extractor for the particular proof system under consideration which can extract a witness from such a tree of transcripts. The original forking lemma of BCCGP technically only applied to extractors that always output a witness given a valid tree with no collisions. However, typical applications of the lemma require that the extractor be allowed to fail when the cheating prover has (implicitly) broken some presumed hard computational problem. Several works subsequent to BCCGP noticed this gap in the formalism [7, 8, 24] and stated slight variants of the BCCGP forking lemma. However, these variants are still unsatisfactory. The variant lemmas in [7, 24] technically only allow extractors which fail in extracting a witness with at most negligible probability for every tree (rather than with negligible probability with respect to some efficiently samplable distribution over trees, as is needed). The more recent variant lemma in [8] is stated in such a way that the rewinding analysis at the core of the BCCGP lemma is omitted from the variant lemma and (technically) must be shown separately anytime it is to be applied to a proof system. None of these issues affect the security of the protocols analyzed in these works: the intended meaning of each of their proofs is clear from context and sound, and the issues are just technical bugs in the formalism of the proofs. However, to accurately capture concrete security it will be important that we have a precise and accurate formalism of this. Our notion of predicate-extended emulation helps to enable this.

In Sect. 4.1, we give the syntax of proof systems and define our security goals of predicate-extended emulation (a generalization of witness-extended emulation) and generator soundness (a generalization of the standard notion of soundness). Then in Sect. 4.2, we provide a sequence of simple lemmas and show how they can be combined to give our concrete security version of the forking lemma. Finally, in Sect. 4.3, we discuss how our forking lemma can easily be applied to provide concrete bounds on the soundness of various existing proof systems. As a concrete example we give the first concrete security bound on the soundness of the Bulletproofs zero-knowledge proof system for arithmetic circuits by Bünz et al. [7].

Fig. 6. Predicates we use. Other predicates \(\mathrm {\varPi }_{\mathsf {bind}}\) and \(\mathrm {\varPi }_{\mathsf {rsa}}\) are only discussed informally.

4.1 Syntax and Security of Proof Systems

Proof System. A proof system \(\mathsf {PS}\) is a tuple \(\mathsf {PS}=(\mathsf {S},\mathsf {R},\mathsf {P},\mathsf {V},\mu )\) specifying a setup algorithm \(\mathsf {S}\), a relation \(\mathsf {R}\), a prover \(\mathsf {P}\), a verifier \(\mathsf {V}\), and \(\mu \in \mathbb {N}\). The setup algorithm outputs public parameters \(\pi \). We say w is a witness for the statement u if \((u,w)\in \mathsf {R}_{\pi }\). The prover (with input \((u,w)\)) and the verifier (with input u) interact via \(2\mu +1\) moves as shown in Fig. 7.

Fig. 7. Interaction between (honest) prover \(\mathsf {P}\) and verifier \(\mathsf {V}\) with public parameters \(\pi \). Here \(tr\) is the transcript and \(d\in \{0,1\}\) is the decision.

Here \(tr\) is the transcript of the interaction and \(d\in \{0,1\}\) is the decision of \(\mathsf {V}\) (with \(d=1\) representing acceptance and \(d=0\) representing rejection). Perfect completeness requires that for all \(\pi \) and \((u,w)\in \mathsf {R}_{\pi }\), the interaction between \(\mathsf {P}_{\pi }(u,w)\) and \(\mathsf {V}_{\pi }(u)\) always results in \(d=1\). If \(\mathsf {PS}\) is public-coin, then the message \(m_{2i-1}\) output by \(\mathsf {V}\) in each round is set equal to its random coins. In this case, we let \(\mathsf {V}_{\pi }(u,tr)\in \{0,1\}\) denote \(\mathsf {V}\)’s decision after an interaction that produced transcript \(tr\). Throughout this section we implicitly assume that any proof system under discussion is public-coin. We sometimes refer to the verifier’s outputs as challenges.
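For intuition, the public-coin interaction pattern can be sketched as follows (hypothetical Python; the prover and the verifier's final decision are modeled as callables with \(\pi \) and u baked in, and 128-bit strings stand in for the verifier's coins).

```python
import secrets

def interact(prover, verifier_decision, mu):
    """Run the 2*mu+1 moves of a public-coin proof: the prover opens with
    m_0, then each round i is a random challenge m_{2i-1} followed by the
    prover's reply m_{2i}; finally the verifier decides."""
    tr = [prover(None)]                    # m_0: the prover's opening move
    for _ in range(mu):
        challenge = secrets.randbits(128)  # public coin: message = coins
        tr.append(challenge)               # m_{2i-1}
        tr.append(prover(challenge))       # m_{2i}
    d = verifier_decision(tr)              # V_pi(u, tr) in {0, 1}
    return tr, d
```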

Predicate-extended emulation. The proof systems we consider were all analyzed with the notion of witness-extended emulation [11, 16]. This requires that for any efficient cheating prover \(\mathsf {P}^*\) there exists an efficient emulator \(\mathsf {E}\) which (given oracle access to \(\mathsf {P}^*\) interacting with \(\mathsf {V}\) and the ability to rewind them) produces transcripts with the same distribution as \(\mathsf {P}^*\) and almost always provides a witness for the statement when the transcript it produces is accepting. We capture witness-extended emulation as a special case of what we refer to as predicate-extended emulation, cast as two separate security properties. The first (emulation security) requires that \(\mathsf {E}\) produce transcripts with the same distribution as \(\mathsf {P}^*\). The second (predicate extension) is parameterized by a predicate \(\mathrm {\varPi }\) and requires that whenever \(\mathsf {E}\) produces an accepting transcript, its auxiliary output satisfies \(\mathrm {\varPi }\). As we will see, this viewpoint allows a clean, modular treatment of how BCCGP and follow-up works [6, 7, 8, 24] analyze witness-extended emulation.

We start by considering game \({ \textsf {H}}^{\mathsf {emu}}\) defined in Fig. 8. It is parameterized by a public-coin proof system \(\mathsf {PS}\), emulator \(\mathsf {E}\), and bit b. The adversary consists of a cheating prover \(\mathsf {P}^*\) and an attacker \(\mathcal {A}\). This game measures \(\mathcal {A}\)’s ability to distinguish between a transcript generated by \(\langle \mathsf {P}^*_{\pi }(u,s),\mathsf {V}_{\pi }(u)\rangle \) and one generated by \(\mathsf {E}\). The emulator \(\mathsf {E}\) is given access to oracles \(\textsc {Next}\) and \(\textsc {Rew}\). The former has \(\mathsf {P}^*\) and \(\mathsf {V}\) perform a round of interaction and returns the messages exchanged. The latter rewinds the interaction to the prior round. We define the advantage function \(\mathsf {Adv}^{\mathsf {emu}}_{\mathsf {PS},\mathsf {E}}(\mathsf {P}^*,\mathcal {A})=\mathsf {Pr}[{ \textsf {H}}^{\mathsf {emu}}_{\mathsf {PS},\mathsf {E},1}(\mathsf {P}^*,\mathcal {A})]-\mathsf {Pr}[{ \textsf {H}}^{\mathsf {emu}}_{\mathsf {PS},\mathsf {E},0}(\mathsf {P}^*,\mathcal {A})]\). For the examples we consider there will be an \(\mathsf {E}\) which (in expectation) performs a small number of oracle queries and does a small amount of local computation such that for any \(\mathsf {P}^*\) and \(\mathcal {A}\) we have \(\mathsf {Adv}^{\mathsf {emu}}_{\mathsf {PS},\mathsf {E}}(\mathsf {P}^*,\mathcal {A})=0\).

Note that achieving perfect emulation on its own is trivial; \(\mathsf {E}\) can just make \(\mu +1\) calls to \(\textsc {Next}\) to obtain a \(tr\) with exactly the correct distribution. Where it gets interesting is that we additionally consider a second, auxiliary output of \(\mathsf {E}\) and insist that it satisfy some predicate \(\mathrm {\varPi }\) whenever \(tr\) is an accepting transcript. The adversary wins whenever \(tr\) is accepting but the predicate is not satisfied. This is captured by the game \({ \textsf {H}}^{\mathsf {predext}}\) shown in Fig. 8. We define \(\mathsf {Adv}^{\mathsf {predext}}_{\mathsf {PS},\mathsf {E},\mathrm {\varPi }}(\mathsf {P}^*,\mathcal {A})=\mathsf {Pr}[{ \textsf {H}}^{\mathsf {predext}}_{\mathsf {PS},\mathsf {E},\mathrm {\varPi }}(\mathsf {P}^*,\mathcal {A})]\). Again this notion is trivial in isolation; \(\mathsf {E}\) can just output rejecting transcripts. Hence, both security notions need to be considered together with respect to the same \(\mathsf {E}\).
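A minimal model of the \(\textsc {Next}\) and \(\textsc {Rew}\) oracles may help make the rewinding mechanics concrete. The sketch below (hypothetical Python; the prover's opening message is elided for brevity) keeps a stack of interaction states so that \(\textsc {Rew}\) can return to the prior round.

```python
class InteractionOracles:
    """Sketch of Next/Rew: Next advances P* vs. V by one round and returns
    the new messages; Rew pops the state stack back to the prior round."""
    def __init__(self, prover_round, sample_challenge):
        self.prover_round = prover_round          # (state, ch) -> (state, reply)
        self.sample_challenge = sample_challenge  # () -> fresh public coins
        self.stack = [(None, [])]                 # (prover state, transcript)

    def next(self):
        state, tr = self.stack[-1]
        ch = self.sample_challenge()
        state, reply = self.prover_round(state, ch)
        self.stack.append((state, tr + [ch, reply]))
        return ch, reply

    def rew(self):
        if len(self.stack) > 1:   # rewind the interaction one round
            self.stack.pop()
```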

The standard notion of witness-extended emulation is captured by the predicate \(\mathrm {\varPi }_{\mathsf {wit}}\) which checks whether \(\mathrm {aux}\) is a witness for u, that is, \(\mathrm {\varPi }_{\mathsf {wit}}(\pi ,u,\mathrm {aux})=\left( (u,\mathrm {aux})\in \mathsf {R}_\pi \right) \). Later we will define some other predicates; all the predicates we use are summarized in Fig. 6. A proof system with a good witness-extended emulator under some computational assumption may be called an argument of knowledge.

Fig. 8. Games defining predicate-extended emulation security of proof system \(\mathsf {PS}\).

Hard predicates. One class of predicates to consider are those which embed some computational problem about the public parameters \(\pi \) that is assumed to be hard to solve. We say that \(\mathrm {\varPi }\) is witness-independent if its output does not depend on its second input u. For example, if \(\mathsf {S}\) outputs a length-n vector of elements from a group \(\mathbb {G}\) (we denote this setup algorithm by \(\mathsf {S}^n_{\mathbb {G}}\)), we can consider the predicate \(\mathrm {\varPi }^{\mathbb {G},n}_{\mathsf {dl}}\) which checks whether \(\mathrm {aux}\) specifies a non-trivial discrete log relation. This predicate is useful for the analysis of a variety of proof systems [6, 7, 24]. Other useful examples include: (i) if \(\mathsf {S}\) outputs parameters for a commitment scheme, the predicate \(\mathrm {\varPi }_{\mathsf {bind}}\) which checks whether \(\mathrm {aux}\) specifies a commitment and two different openings for it [6, 8, 24], and (ii) if \(\mathsf {S}\) outputs a group of unknown order together with an element of that group, the predicate \(\mathrm {\varPi }_{\mathsf {rsa}}\) which checks whether \(\mathrm {aux}\) specifies a non-trivial root of that element [8].
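As an illustration, a non-trivial discrete log relation for generators \(g_1,\dots ,g_n\) is an exponent vector \((a_1,\dots ,a_n)\), not all zero modulo the group order, with \(\prod _i g_i^{a_i}=1\). The sketch below checks \(\mathrm {\varPi }^{\mathbb {G},n}_{\mathsf {dl}}\) for a hypothetical prime-order subgroup of \(\mathbb {Z}^*_q\) (the encoding of \(\pi \) is our own); note that it ignores u, i.e., it is witness-independent.

```python
def pi_dl(pi, u, aux):
    """Pi_dl sketch: pi = (modulus q, group order p, generators gs);
    aux is an exponent vector. The statement u is ignored."""
    q, p, gs = pi
    if aux is None or len(aux) != len(gs) or all(a % p == 0 for a in aux):
        return False  # not a non-trivial relation
    prod = 1
    for g, a in zip(gs, aux):
        prod = prod * pow(g, a % p, q) % q
    return prod == 1  # the product must be the identity element
```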

Whether a witness-independent predicate \(\mathrm {\varPi }\) is hard to satisfy given the output of \(\mathsf {S}\) is captured by the game \({ \textsf {H}}^{\mathsf {pred}}\) shown on the left side of Fig. 9. We define \(\mathsf {Adv}^{\mathsf {pred}}_{\mathsf {S},\mathrm {\varPi }}(\mathcal {A}) =\mathsf {Pr}[{ \textsf {H}}^{\mathsf {pred}}_{\mathsf {S},\mathrm {\varPi }}(\mathcal {A})]\). Note, for example, that if \(\mathsf {S}^n_{\mathbb {G}}\) and \(\mathrm {\varPi }^{\mathbb {G},n}_{\mathsf {dl}}\) are used, then this game is identical to discrete log relation security, i.e., \(\mathsf {Adv}^{\mathsf {pred}}_{\mathsf {S}^n_{\mathbb {G}},\mathrm {\varPi }^{\mathbb {G},n}_{\mathsf {dl}}}(\mathcal {A})=\mathsf {Adv}^{\mathsf {dl}\text{- }\mathsf {rel}}_{\mathbb {G},n}(\mathcal {A})\) for any adversary \(\mathcal {A}\).

Generator soundness. Consider the games shown on the right side of Fig. 9. Both are parameterized by a statement generator G which (given the parameters \(\pi \)) outputs a statement u and some auxiliary information s about the statement. The first game \({ \textsf {H}}^{\mathsf {sound}}\) measures how well a (potentially cheating) prover \(\mathsf {P}^*\) can use s to convince \(\mathsf {V}\) that u is true. The second game \({ \textsf {H}}^{\mathsf {wit}}\) measures how well an adversary \(\mathcal {B}\) can produce a witness for u given s. We define \(\mathsf {Adv}^{\mathsf {sound}}_{\mathsf {PS},G}(\mathsf {P}^*)=\mathsf {Pr}[{ \textsf {H}}^{\mathsf {sound}}_{\mathsf {PS},G}(\mathsf {P}^*)]\) and \(\mathsf {Adv}^{\mathsf {wit}}_{\mathsf {PS},G}(\mathcal {B})=\mathsf {Pr}[{ \textsf {H}}^{\mathsf {wit}}_{\mathsf {PS},G}(\mathcal {B})]\).

Fig. 9. Left: game defining hardness of satisfying predicate \(\mathrm {\varPi }\). Right: games defining soundness of proof system \(\mathsf {PS}\) with respect to statement generator G and the difficulty of finding a witness for statements produced by G.

Note that the standard notion of soundness (that proving false statements is difficult) is captured by considering a G which always outputs false statements. In this case, \(\mathsf {Adv}^{\mathsf {wit}}_{\mathsf {PS},G}(\mathcal {A})=0\) for all \(\mathcal {A}\). In other contexts, it may be assumed that it is computationally difficult to find a witness for statements produced by G.

4.2 Concrete Security Forking Lemma

Now we work toward proving our concrete security version of the BCCGP forking lemma. This lemma provides a general framework for constructing a good witness-extended emulator for a proof system. First, BCCGP showed how to construct a tree-extended emulator \(\mathsf {T}\) which has perfect emulation security and (with high probability) outputs a set of transcripts satisfying a tree-like structure (defined below) whenever it outputs an accepting transcript. Then one constructs, for the particular proof system under consideration, an “extractor” \(\mathsf {X}\) which, given such a tree of transcripts, can always either produce a witness for the statement or break some other computational problem assumed to be difficult. Combining \(\mathsf {T}\) and \(\mathsf {X}\) appropriately gives a good witness-extended emulator.

Before proceeding to our forking lemma, we provide the necessary definitions of a tree-extended emulator and an extractor, and then state some simple lemmas that build toward it.

Transcript Tree. Fix a proof system \(\mathsf {PS}=(\mathsf {S},\mathsf {R},\mathsf {P},\mathsf {V},\mu )\) and let the vector \(\mathbf {n}=(n_1,\dots ,n_\mu )\in \mathbb {N}_{>0}^\mu \) be given. Let \(\pi \) be an output of \(\mathsf {S}\) and u be a statement. For \(h=0,\dots ,\mu \) we will inductively define an \((n_{\mu -h+1},\dots ,n_\mu )\)-tree of transcripts for \((\mathsf {PS},\pi ,u)\). We will often leave some of \((\mathsf {PS},\pi ,u)\) implicit when they are clear from context.

First, when \(h=0\), a ()-tree is specified by a tuple \((m_{2\mu -1},m_{2\mu },\ell )\) where \(m_{2\mu -1},m_{2\mu }\in \{0,1\}^*\) and \(\ell \) is an empty list. Now an \((n_{\mu -h},\dots ,n_\mu )\)-tree (of height \(h+1\)) is specified by a tuple \((m_{2(\mu -h)-3},m_{2(\mu -h)-2},\ell )\) where \(m_{2(\mu -h)-3},m_{2(\mu -h)-2}\in \{0,1\}^*\) and \(\ell \) is a length \(n_{\mu -h}\) list of \((n_{\mu -h+1},\dots ,n_{\mu })\)-trees for \((\mathsf {PS},\pi ,u)\).

When discussing such trees we say their height is h. When \(h<\mu \) we will sometimes refer to it as a partial tree. We use the traditional terminology of nodes, children, parent, root, and leaf. We say the root node is at height h, its children are at height \(h-1\), and so on. The leaf nodes are thus each at height 0. If a node is at height h, then we say it is at depth \(\mu -h\).

Every path from the root to a leaf in a height-h tree gives a sequence \((m_{2(\mu -h)-1},m_{2(\mu -h)},\dots ,m_{2\mu -1},m_{2\mu })\) where \((m_{2(\mu -i)-1},m_{2(\mu -i)})\) is the pair from the node at height i. Now if we fix a transcript prefix \(tr'=(m_{-1},m_0,\dots ,m_{2(\mu -h-1)-1},m_{2(\mu -h-1)})\), then we can think of \(tr'\) and the tree as inducing \(\prod _{i=\mu -h+1}^\mu n_i\) different transcripts \(tr=(m_0,\dots ,m_{2\mu -1},m_{2\mu })\), one for each path. We say that the tree is valid for \(tr'\) if \(\mathsf {V}_{\pi }(u,tr)=1\) for each transcript \(tr\) induced by the tree. Note that \(tr'\) is an empty list when \(h=\mu \), so we can omit reference to \(tr'\) and simply refer to the tree as valid.

Suppose \(\mathsf {V}\)’s coins are drawn from \(S\times \mathbb {Z}_p\) for some set S and \(p\in \mathbb {N}\). We will refer to the second component of its coins as the integer component. Let \(\mathsf {node}\) be a parent node at height \(i>0\). If any two of its children have \(m_{2(\mu -i+1)-1}\) with identical integer components, then we say that \(\mathsf {node}\) has a challenge collision. A tree has a challenge collision if any of its nodes has a challenge collision.

A tree-extended emulator should return trees which are valid and have no challenge collision. We capture this with the predicates \(\mathrm {\varPi }^{\mathbf {n}}_{\mathsf {val}}\) and \(\mathrm {\varPi }^{\mathbf {n}}_{\mathsf {no}\mathsf {col}}\) defined by the following (a code sketch of both checks follows the list):

  • \(\mathrm {\varPi }^{\mathbf {n}}_{\mathsf {val}}(\pi ,u,\mathrm {aux})\) returns \(\mathsf {true}\) iff \(\mathrm {aux}\) is a valid \(\mathbf {n}\)-tree.

  • \(\mathrm {\varPi }^{\mathbf {n}}_{\mathsf {no}\mathsf {col}}(\pi ,u,\mathrm {aux})\) returns \(\mathsf {true}\) iff \(\mathrm {aux}\) is an \(\mathbf {n}\)-tree that does not have a challenge collision.
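To make these definitions concrete, here is a sketch (a hypothetical Python encoding of trees; the helper names are our own) of the induced transcripts, the validity check behind \(\mathrm {\varPi }^{\mathbf {n}}_{\mathsf {val}}\), and the challenge-collision check behind \(\mathrm {\varPi }^{\mathbf {n}}_{\mathsf {no}\mathsf {col}}\).

```python
from dataclasses import dataclass, field

@dataclass
class TreeNode:
    challenge: tuple   # verifier message; index 1 is the integer component
    reply: object      # prover message
    children: list = field(default_factory=list)

def induced_transcripts(node, prefix=()):
    """Yield one induced transcript per root-to-leaf path."""
    tr = prefix + (node.challenge, node.reply)
    if not node.children:
        yield tr
    else:
        for child in node.children:
            yield from induced_transcripts(child, tr)

def is_valid(root, tr_prefix, verify):
    """Pi_val: every transcript induced by the tree must be accepted."""
    return all(verify(tr_prefix + tr) for tr in induced_transcripts(root))

def has_challenge_collision(node):
    """Pi_nocol fails iff some parent has two children whose challenges
    share the same integer component."""
    ints = [child.challenge[1] for child in node.children]
    if len(ints) != len(set(ints)):
        return True
    return any(has_challenge_collision(child) for child in node.children)
```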

Tree-extended Emulator. Let a proof system \(\mathsf {PS}=(\mathsf {S},\mathsf {R},\mathsf {P},\mathsf {V},\mu )\) and \(\mathbf {n}=(n_1,\dots ,n_\mu )\in \mathbb {N}_{>0}^\mu \) be given. Consider the tree-extended emulator \(\mathsf {T}\) given in Fig. 10, which comes from BCCGP. The sub-algorithms \(\mathsf {T}_i\) are given a partial transcript \(tr\). They call \(\textsc {Next}\) to obtain the next messages of a longer partial transcript and attempt to create a partial tree which is valid for it. This is done by repeatedly calling \(\mathsf {T}_{i+1}\) to construct each branch of the tree. Should the first such call fail, \(\mathsf {T}_i\) aborts. Otherwise, it continues calling \(\mathsf {T}_{i+1}\) as many times as necessary to obtain \(n_{i+1}\) branches. The base case of this process is \(\mathsf {T}_{\mu }\), which does not need child branches and instead just checks whether its transcript is accepting, returning \(\bot \) to its calling procedure if not. The following result shows that \(\mathsf {T}\) emulates any cheating prover perfectly and almost always outputs a valid tree whenever it outputs an accepting transcript. The technical core of the lemma is the bound on the expected efficiency of \(\mathsf {T}\).
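The recursive structure just described can be sketched as follows (hypothetical Python, reusing the TreeNode and oracle sketches above; accepting() is an assumed helper reporting whether the current interaction's transcript is accepting). The feature that drives the expected-time analysis is that only a failure of the very first child call aborts \(\mathsf {T}_i\); later failures are simply retried.

```python
def T(oracle, n, i, mu, accepting):
    """Sketch of sub-algorithm T_i. Here n is 1-indexed (n[0] unused) so
    n[i+1] is the paper's n_{i+1}; the top-level call is
    T(oracle, n, 0, mu, accepting). Returns a subtree, or None on failure."""
    ch, reply = oracle.next()             # one more round of P* vs. V
    if i == mu:                           # base case T_mu: a leaf
        return TreeNode(ch, reply) if accepting() else None
    children, first = [], True
    while len(children) < n[i + 1]:       # gather n_{i+1} valid branches
        sub = T(oracle, n, i + 1, mu, accepting)
        oracle.rew()                      # rewind back to round i
        if sub is not None:
            children.append(sub)
        elif first:                       # only the first failure aborts
            return None
        first = False
    return TreeNode(ch, reply, children)
```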

Lemma 4

Let \(\mathsf {PS}=(\mathsf {S},\mathsf {R},\mathsf {P},\mathsf {V},\mu )\) be a public coin proof system. Suppose \(\mathsf {V}\)’s challenges are uniformly drawn from \(S\times \mathbb {Z}_p\) for set S and \(p\in \mathbb {N}\). Let \(\mathbf {n}=(n_1,\dots ,n_\mu )\in \mathbb {N}_{>0}^\mu \) be given. Let \(N=\prod _{i=1}^{\mu } n_i\). Let \(\mathsf {P}^*\) be a cheating prover and \(\mathcal {A}\) be an adversary. Define \(\mathsf {T}\) as shown in Fig. 10. Then the following all hold:

  1. \(\mathsf {Adv}^{\mathsf {emu}}_{\mathsf {PS},\mathsf {T}}(\mathsf {P}^*,\mathcal {A})=0\).

  2. \(\mathsf {Adv}^{\mathsf {predext}}_{\mathsf {PS},\mathsf {T},\mathrm {\varPi }^{\mathbf {n}}_{\mathsf {val}}}(\mathsf {P}^*,\mathcal {A})=0\).

  3. \(\mathsf {Adv}^{\mathsf {predext}}_{\mathsf {PS},\mathsf {T},\mathrm {\varPi }^{\mathbf {n}}_{\mathsf {no}\mathsf {col}}}(\mathsf {P}^*,\mathcal {A})\leqslant 5\mu N/\sqrt{2p}\).

  4. The expected number of times \(\mathsf {T}\) executes \(\mathsf {V}_{\pi }(u,\cdot )\) is N.

  5. The expected number of queries that \(\mathsf {T}\) makes to \(\textsc {Next}\) is less than \(\mu N+1\). Exactly one of these queries is made while \(i=1\) in \(\textsc {Next}\).

For comparison, in the full version of this paper [13] we analyze a natural tree-extended emulator with a small, bounded worst-case runtime. Its ability to produce valid trees is significantly reduced by its need to work within a small worst-case runtime, motivating the need for \(\mathsf {T}\) to be efficient only in expected runtime.

Fig. 10. The BCCGP tree-extended emulator.

Proof

(of Lemma 4). All of the claims except the third follow from BCCGP’s analysis of \(\mathsf {T}\). The advantage \(\mathsf {Adv}^{\mathsf {predext}}_{\mathsf {PS},\mathsf {T},\mathrm {\varPi }^{\mathbf {n}}_{\mathsf {no}\mathsf {col}}}(\mathsf {P}^*,\mathcal {A})\) can be upper-bounded by the probability that the integer component of \(\mathsf {V}\)’s output is repeated across any of \(\mathsf {T}\)’s queries to \(\textsc {Next}\). BCCGP bounded this probability by applying Markov’s inequality to obtain an upper bound on \(\mathsf {T}\)’s running time and then applying the birthday bound to get an \(O(\mu N/\sqrt[3]{p})\) bound. We can instead apply our switching lemma analysis from the full version of this paper [13] (or the techniques from our analysis of the collision resistance of a random oracle in Sect. 3.3) to obtain the stated bound, because \(\mathsf {V}\) samples \(\mu N\) challenges in expectation. \(\square \)

Extractors. Let \(\mathsf {X}\) be an algorithm and \(\mathrm {\varPi }_1,\mathrm {\varPi }_2\) be predicates. We say that \(\mathsf {X}\) is a \((\mathrm {\varPi }_1,\mathrm {\varPi }_2)\)-extractor if \(\mathrm {\varPi }_1(\pi ,u,\mathrm {aux})\Rightarrow \mathrm {\varPi }_2(\pi ,u,\mathsf {X}(\pi ,u,\mathrm {aux}))\) for all \((\pi ,u,\mathrm {aux})\). Let \(\mathsf {T}\) be an emulator. Then we define \(\mathsf {E}^{\dag }[\mathsf {T},\mathsf {X}]\) to be the emulator that, on input \((\pi ,u)\) with oracle access to \(\textsc {Next}\) and \(\textsc {Rew}\), first runs \(\mathsf {T}\) on \((\pi ,u)\) (with the same oracles) to obtain \((tr,\mathrm {aux})\) and then returns \((tr,\mathsf {X}(\pi ,u,\mathrm {aux}))\). The following straightforward lemma relates the security of \(\mathsf {T}\) and \(\mathsf {E}^{\dag }[\mathsf {T},\mathsf {X}]\).
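In code, \(\mathsf {E}^{\dag }[\mathsf {T},\mathsf {X}]\) is just a post-processing wrapper around \(\mathsf {T}\), along the lines of the following sketch (hypothetical interfaces).

```python
def E_dagger(T, X):
    """Build E^dagger[T, X]: run the emulator T, then map its auxiliary
    output through the extractor X, leaving the transcript untouched."""
    def E(pi, u, oracle):          # oracle bundles Next and Rew
        tr, aux = T(pi, u, oracle)
        return tr, X(pi, u, aux)
    return E
```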

Lemma 5

Let \(\mathsf {PS}\) be a proof system, \(\mathsf {T}\) be an emulator, \(\mathrm {\varPi }_1\) and \(\mathrm {\varPi }_2\) be predicates, \(\mathsf {P}^*\) be a cheating prover, and \(\mathcal {A}\) be an adversary. Let \(\mathsf {X}\) be a \((\mathrm {\varPi }_1,\mathrm {\varPi }_2)\)-extractor. Then the following hold:

  • \(\mathsf {Adv}^{\mathsf {emu}}_{\mathsf {PS},\mathsf {E}^{\dag }[\mathsf {T},\mathsf {X}]}(\mathsf {P}^*,\mathcal {A})=\mathsf {Adv}^{\mathsf {emu}}_{\mathsf {PS},\mathsf {T}}(\mathsf {P}^*,\mathcal {A})\)

  • \(\mathsf {Adv}^{\mathsf {predext}}_{\mathsf {PS},\mathsf {E}^{\dag }[\mathsf {T},\mathsf {X}],\mathrm {\varPi }_2}(\mathsf {P}^*,\mathcal {A})\leqslant \mathsf {Adv}^{\mathsf {predext}}_{\mathsf {PS},\mathsf {T},\mathrm {\varPi }_1}(\mathsf {P}^*,\mathcal {A})\)

Fig. 11. Reduction adversary for Theorem 2.

Forking lemma. Finally, we can state and prove our concrete security version of the BCCGP forking lemma. It captures the fact that any protocol with a \((\mathrm {\varPi }^{\mathbf {n}}_{\mathsf {val}}\wedge \mathrm {\varPi }^{\mathbf {n}}_{\mathsf {no}\mathsf {col}},\mathrm {\varPi }_{\mathsf {wit}}\vee \mathrm {\varPi }^{*})\)-extractor has a good witness-extended emulator (assuming \(\mathrm {\varPi }^{*}\) is computationally difficult to satisfy).

Theorem 2

(Forking Lemma). Let \(\mathsf {PS}=(\mathsf {S},\mathsf {R},\mathsf {P},\mathsf {V},\mu )\) be a public coin proof system. Suppose \(\mathsf {V}\)’s challenges are uniformly drawn from \(S\times \mathbb {Z}_p\) for set S and \(p\in \mathbb {N}\). Let \(\mathbf {n}=(n_1,\dots ,n_\mu )\in \mathbb {N}_{>0}^\mu \) be given. Let \(N=\prod _{i=1}^{\mu } n_i\). Let \(\mathsf {P}^*\) be a cheating prover and \(\mathcal {A}\) be an adversary. Define \(\mathsf {T}\) as shown in Fig. 10. Let \(\mathrm {\varPi }^*\) be a witness-independent predicate. Let \(\mathsf {X}\) be a \((\mathrm {\varPi }^{\mathbf {n}}_{\mathsf {val}}\wedge \mathrm {\varPi }^{\mathbf {n}}_{\mathsf {no}\mathsf {col}},\mathrm {\varPi }_{\mathsf {wit}}\vee \mathrm {\varPi }^{*})\)-extractor. Let \(\mathsf {E}=\mathsf {E}^{\dag }[\mathsf {T},\mathsf {X}]\). Let \(\mathcal {B}_{\mathsf {E}}\) be as defined in Fig. 11. Then the following all hold:

  1. \(\mathsf {Adv}^{\mathsf {emu}}_{\mathsf {PS},\mathsf {E}}(\mathsf {P}^*,\mathcal {A})=0\).

  2. \(\mathsf {Adv}^{\mathsf {predext}}_{\mathsf {PS},\mathsf {E},\mathrm {\varPi }_{\mathsf {wit}}}(\mathsf {P}^*,\mathcal {A})\leqslant \mathsf {Adv}^{\mathsf {pred}}_{\mathsf {S},\mathrm {\varPi }^{*}}(\mathcal {B}_{\mathsf {E}})+5\mu N/\sqrt{2p}\).

  3. The expected number of times \(\mathsf {T}\) executes \(\mathsf {V}_{\pi }(u,\cdot )\) (inside of \(\mathsf {E}\)) is N.

  4. The expected number of queries that \(\mathsf {E}\) makes to \(\textsc {Next}\) is less than \(\mu N+1\). Exactly one of these queries is made while \(i=1\) in \(\textsc {Next}\).

  5. The expected runtime of \(\mathcal {B}_{\mathsf {E}}\) is approximately \(T_{\mathcal {A}}+Q_\mathsf {E}\cdot T_{\mathsf {P}^*}+T_{\mathsf {E}}\), where \(T_{x}\) is the worst-case runtime of \(x\in \{\mathcal {A},\mathsf {P}^*,\mathsf {E}\}\) and \(Q_\mathsf {E}<\mu N+1\) is the expected number of queries that \(\mathsf {E}\) makes to \(\textsc {Next}\) in \({ \textsf {H}}^{\mathsf {predext}}_{\mathsf {PS},\mathsf {E},\mathrm {\varPi }^{*}}(\mathsf {P}^*,\mathcal {A})\).

It will be useful to have the following simple lemma for comparing \(\mathsf {Adv}^{\mathsf {predext}}\) with different choices of predicate that are related by logical operators. It can be derived from basic probability calculations.

Lemma 6

Let \(\mathsf {PS}\) be a proof system, \(\mathsf {E}\) be an emulator, \(\mathrm {\varPi }_1\) and \(\mathrm {\varPi }_2\) be predicates, \(\mathsf {P}^*\) be a cheating prover, and \(\mathcal {A}\) be an adversary. Then,

$$\begin{aligned} \mathsf {Adv}^{\mathsf {predext}}_{\mathsf {PS},\mathsf {E},\mathrm {\varPi }_1\vee \mathrm {\varPi }_2}(\mathsf {P}^*,\mathcal {A}) + \mathsf {Adv}^{\mathsf {predext}}_{\mathsf {PS},\mathsf {E},\mathrm {\varPi }_1\wedge \mathrm {\varPi }_2}(\mathsf {P}^*,\mathcal {A}) = \mathsf {Adv}^{\mathsf {predext}}_{\mathsf {PS},\mathsf {E},\mathrm {\varPi }_1}(\mathsf {P}^*,\mathcal {A}) + \mathsf {Adv}^{\mathsf {predext}}_{\mathsf {PS},\mathsf {E},\mathrm {\varPi }_2}(\mathsf {P}^*,\mathcal {A}). \end{aligned}$$

and

$$\begin{aligned} \mathsf {Adv}^{\mathsf {predext}}_{\mathsf {PS},\mathsf {E},\mathrm {\varPi }_1}(\mathsf {P}^*,\mathcal {A}) \leqslant \mathsf {Adv}^{\mathsf {predext}}_{\mathsf {PS},\mathsf {E},\mathrm {\varPi }_1\vee \mathrm {\varPi }_2}(\mathsf {P}^*,\mathcal {A}) + \mathsf {Adv}^{\mathsf {predext}}_{\mathsf {PS},\mathsf {E},\lnot \mathrm {\varPi }_2}(\mathsf {P}^*,\mathcal {A}). \end{aligned}$$

Proof

(of Theorem 2). Applying Lemmas 4 and 5, and observing how \(\mathsf {E}\) is constructed, gives us the first, third, and fourth claims. For the other claims we need to consider the adversary \(\mathcal {B}_{\mathsf {E}}\). Note that it runs \(\mathsf {E}\) just as it would be run in \({ \textsf {H}}^{\mathsf {predext}}_{\mathsf {PS},\mathsf {E},\mathrm {\varPi }^{*}}(\mathsf {P}^*,\mathcal {A})\), so the distribution over \((\pi ,\mathrm {aux})\) in \({ \textsf {H}}^{\mathsf {pred}}_{\mathsf {S},\mathrm {\varPi }^*}(\mathcal {B}_{\mathsf {E}})\) is identical to that in the former game. Furthermore, recall that \(\mathrm {\varPi }^{*}\) is witness-independent, so it ignores its second input. It follows that,

$$\begin{aligned} \mathsf {Adv}^{\mathsf {predext}}_{\mathsf {PS},\mathsf {E},\lnot \mathrm {\varPi }^*}(\mathsf {P}^*,\mathcal {A})&=\mathsf {Pr}[\mathsf {V}_\pi (u,tr)=1\wedge \mathrm {\varPi }^*(\pi ,u,\mathrm {aux})\text { in }{ \textsf {H}}^{\mathsf {predext}}]\\&\leqslant \mathsf {Pr}[\mathrm {\varPi }^*(\pi ,u,\mathrm {aux})\text { in }{ \textsf {H}}^{\mathsf {predext}}]\\&=\mathsf {Pr}[\mathrm {\varPi }^*(\pi ,\varepsilon ,\mathrm {aux})\text { in }{ \textsf {H}}^{\mathsf {pred}}] =\mathsf {Adv}^{\mathsf {pred}}_{\mathsf {S},\mathrm {\varPi }^*}(\mathcal {B}_{\mathsf {E}}). \end{aligned}$$

The claimed runtime of \(\mathcal {B}_{\mathsf {E}}\) is clear from its pseudocode (noting that the view of \(\mathsf {E}\) is distributed identically to its view in \({ \textsf {H}}^{\mathsf {predext}}\), so its expected number of \(\textsc {Next}\) queries is unchanged).

For the second claim, we perform the calculations

$$\begin{aligned} \mathsf {Adv}^{\mathsf {predext}}_{\mathsf {PS},\mathsf {E},\mathrm {\varPi }_{\mathsf {wit}}}(\mathsf {P}^*,\mathcal {A})&\leqslant \mathsf {Adv}^{\mathsf {predext}}_{\mathsf {PS},\mathsf {E},\mathrm {\varPi }_{\mathsf {wit}}\vee \mathrm {\varPi }^{*}}(\mathsf {P}^*,\mathcal {A}) + \mathsf {Adv}^{\mathsf {predext}}_{\mathsf {PS},\mathsf {E},\lnot \mathrm {\varPi }^{*}}(\mathsf {P}^*,\mathcal {A})\\&\leqslant \mathsf {Adv}^{\mathsf {predext}}_{\mathsf {PS},\mathsf {T},\mathrm {\varPi }^{\mathbf {n}}_{\mathsf {val}}\wedge \mathrm {\varPi }^{\mathbf {n}}_{\mathsf {no}\mathsf {col}}}(\mathsf {P}^*,\mathcal {A}) + \mathsf {Adv}^{\mathsf {pred}}_{\mathsf {S},\mathrm {\varPi }^{*}}(\mathcal {B}_{\mathsf {E}})\\&\leqslant \mathsf {Adv}^{\mathsf {predext}}_{\mathsf {PS},\mathsf {T},\mathrm {\varPi }^{\mathbf {n}}_{\mathsf {val}}}(\mathsf {P}^*,\mathcal {A}) + \mathsf {Adv}^{\mathsf {predext}}_{\mathsf {PS},\mathsf {T},\mathrm {\varPi }^{\mathbf {n}}_{\mathsf {no}\mathsf {col}}}(\mathsf {P}^*,\mathcal {A}) + \mathsf {Adv}^{\mathsf {pred}}_{\mathsf {S},\mathrm {\varPi }^{*}}(\mathcal {B}_{\mathsf {E}})\\&\leqslant 5\mu N/\sqrt{2p} + \mathsf {Adv}^{\mathsf {pred}}_{\mathsf {S},\mathrm {\varPi }^{*}}(\mathcal {B}_{\mathsf {E}}). \end{aligned}$$

This sequence of calculations uses (in order) Lemma 6, Lemma 5 together with the bound we just derived, Lemma 6 again (with the fact that advantages are non-negative), and Lemma 4. \(\square \)

4.3 Concrete Bounds on Soundness

Now we discuss how the forking lemma we just derived can be used to provide concrete bounds on soundness. First we make the generic observation that witness-extended emulation implies soundness. Then we discuss how these results, together with our expected-time generic group model bound on discrete log security, give concrete bounds on the soundness of various discrete-log-based proof systems; in particular, we give the first concrete bound on the soundness of the Bulletproofs proof system for arithmetic circuits.

Witness-extended emulation implies soundness. The following theorem observes that finding a witness for u cannot be much more difficult than convincing a verifier that u is true, provided an efficient witness-extended emulator exists.

Theorem 3

Let \(\mathsf {PS}=(\mathsf {S},\mathsf {R},\mathsf {P},\mathsf {V},\mu )\) be a proof system, G be a statement generator, \(\mathsf {E}\) be an emulator, and \(\mathsf {P}^*\) be a cheating prover. Define \(\mathcal {A}\) and \(\mathcal {B}\) as shown in Fig. 12. Then,

$$\begin{aligned} \mathsf {Adv}^{\mathsf {sound}}_{\mathsf {PS},G}(\mathsf {P}^*) \leqslant \mathsf {Adv}^{\mathsf {wit}}_{\mathsf {PS},G}(\mathcal {B}) + \mathsf {Adv}^{\mathsf {emu}}_{\mathsf {PS},\mathsf {E}}(\mathsf {P}^*,\mathcal {A}) + \mathsf {Adv}^{\mathsf {predext}}_{\mathsf {PS},\mathsf {E},\mathrm {\varPi }_{\mathsf {wit}}}(\mathsf {P}^*,\mathcal {A}). \end{aligned}$$

The runtime of \(\mathcal {A}\) is roughly that of G plus that of \(\mathsf {V}\). The runtime of \(\mathcal {B}\) is roughly that of \(\mathsf {E}\) when given oracle access to \(\mathsf {P}^*\) and \(\mathsf {V}\) interacting.

Fig. 12. Adversaries used in Theorem 3.

Proof

(Sketch). The use of \(\mathsf {V}\) in \(\mathcal {A}\) ensures that the probability that \(\mathsf {E}\) outputs an accepting transcript is roughly the same as the probability that \(\mathsf {P}^*\) convinces \(\mathsf {V}\) to accept; the difference between these probabilities is bounded by \(\mathsf {Adv}^{\mathsf {emu}}_{\mathsf {PS},\mathsf {E}}(\mathsf {P}^*,\mathcal {A})\). Then the \(\mathrm {\varPi }_{\mathsf {wit}}\) security of \(\mathsf {E}\) ensures that the probability it outputs a valid witness cannot be much less than the probability it outputs an accepting transcript; the difference between these probabilities is bounded by \(\mathsf {Adv}^{\mathsf {predext}}_{\mathsf {PS},\mathsf {E},\mathrm {\varPi }_{\mathsf {wit}}}(\mathsf {P}^*,\mathcal {A})\). Adversary \(\mathcal {B}\) just runs \(\mathsf {E}\) to obtain a witness, so \(\mathsf {Adv}^{\mathsf {wit}}_{\mathsf {PS},G}(\mathcal {B})\) is the probability that \(\mathsf {E}\) outputs a valid witness. \(\square \)

Discrete log proof systems. A number of the proof systems in [6, 7, 24] were shown to have a \((\mathrm {\varPi }^{\mathbf {n}}_{\mathsf {val}}\wedge \mathrm {\varPi }^{\mathbf {n}}_{\mathsf {no}\mathsf {col}},\mathrm {\varPi }_{\mathsf {wit}}\vee \mathrm {\varPi }^{\mathbb {G},n}_{\mathsf {dl}})\)-extractor \(\mathsf {X}\). For any such proof system \(\mathsf {PS}\), Theorem 3 and Theorem 2 bound the soundness of \(\mathsf {PS}\) by the discrete log relation security of \(\mathbb {G}\) against an expected-time adversary \(\mathcal {B}_{\mathsf {E}^{\dag }[\mathsf {T},\mathsf {X}]}\). Moreover, we can then apply Lemma 3 to tightly bound this adversary’s advantage by the advantage of an expected-time adversary against normal discrete log security. We know how to bound the advantage of such an adversary in the generic group model from Sect. 3.3.

So, to obtain a bound on the soundness of these proof systems in the generic group model, we can simply apply these results to the proof system. To obtain our final concrete security bound we need only read the existing analysis of the proof system and extract the following parameters:

  • p: the size of the set \(\mathsf {V}\) draws the integer component of its challenges from

  • \(|\mathbb {G}|\): the size of the group used

  • \(N=\prod _{i=1}^{\mu }\mathbf {n}_i\): the size of the tree that \(\mathsf {X}\) requires

  • \(n\geqslant 1\): the number of group elements in the discrete log relation instance

  • \(q_{\mathsf {V}}\): the number of multi-exponentiations \(\mathsf {V}\) performs

  • \(q_{\mathsf {X}}\): the number of multi-exponentiations that \(\mathsf {X}\) performs

We say such a proof system \(\mathsf {PS}=(\mathsf {S},\mathsf {R},\mathsf {P},\mathsf {V},\mu )\) and extractor \(\mathsf {X}\) have parameters \((p,|\mathbb {G}|,N,n,q_{\mathsf {V}},q_{\mathsf {X}})\). We obtain the following theorem for such a system, bounding its soundness in the generic group model.

Theorem 4

Let \(\mathsf {PS}=(\mathsf {S},\mathsf {R},\mathsf {P},\mathsf {V},\mu )\) be a proof system and \(\mathsf {X}\) be an extractor that has parameters \((p,|\mathbb {G}|,N,n,q_{\mathsf {V}},q_{\mathsf {X}})\). Let G be a statement generator performing at most \(q_G\) multi-exponentiations and \(\mathsf {P}^{*}\) be a cheating prover that performs at most \(q_{\mathsf {P}^*}\) multi-exponentiations each time it is run. Define \(\mathcal {B}\) as shown in Fig. 12. Then in the generic group model we have,

$$\begin{aligned} \mathsf {Adv}^{\mathsf {sound}}_{\mathsf {PS},G}(\mathsf {P}^*) \leqslant \mathsf {Adv}^{\mathsf {wit}}_{\mathsf {PS},G}(\mathcal {B}) + 5\sqrt{\frac{6 \cdot Q_{\mathcal {C}}^2}{|\mathbb {G}|}} + \frac{2}{|\mathbb {G}|} + \frac{5\mu N}{\sqrt{2p}} \end{aligned}$$

where \(Q_{\mathcal {C}}=q_G+(\mu N+1)q_{\mathsf {P}^{*}}+q_{\mathsf {X}}+Nq_{\mathsf {V}}+n+1\). The runtime of \(\mathcal {B}\) is roughly that of \(\mathsf {E}^{\dag }[\mathsf {T},\mathsf {X}]\) when given oracle access to \(\mathsf {P}^*\) and \(\mathsf {V}\) interacting.

Proof

The result follows by applying Theorem 3, Theorem 2, Lemma 3, and the generic group model bound from Sect. 3.3 as discussed above. \(\square \)
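For numeric exploration, the non-\(\mathsf {Adv}^{\mathsf {wit}}\) terms of the Theorem 4 bound can be evaluated directly from the parameters, as in the following sketch (a hypothetical helper of ours).

```python
import math

def theorem4_terms(p, G, N, n, qV, qX, qG, qP, mu):
    """The non-Adv^wit terms of Theorem 4:
    5*sqrt(6*Q_C^2/|G|) + 2/|G| + 5*mu*N/sqrt(2p)."""
    QC = qG + (mu * N + 1) * qP + qX + N * qV + n + 1
    return 5 * math.sqrt(6 * QC**2 / G) + 2 / G + 5 * mu * N / math.sqrt(2 * p)
```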

Concrete security of Bulletproofs. Finally, we can use the above to obtain a concrete security bound on the soundness of the Bulletproofs proof system for arithmetic circuits of Bünz et al. [7]. To do so we only need to determine the parameters discussed above. Suppose the proof system is used for an arithmetic circuit with M multiplication gates. Using techniques of BCCGP [6], this is represented by a size-M Hadamard product and \(L\leqslant 2M\) linear constraints. Then, per Bünz et al., the proof system has the following parameters:

  • \(p=(|\mathbb {G}|-1)/2\)

  • \(|\mathbb {G}|\) is the size of group \(\mathbb {G}\) in which discrete logs are assumed to be hard

  • \(N=7(L+1)M^3\)

  • \(n=2M+2\)

  • \(q_{\mathsf {V}} =3M+\log _2(M)+4\)

  • \(q_{\mathsf {X}}=0\)

Having proven our discrete log bound in a generic group model that allows multi-exponentiations is helpful here: it means our bound does not depend on the sizes of \(\mathsf {V}\)'s multi-exponentiations.

Corollary 1

Let \(\mathsf {PS}\) be the Bulletproofs proof system for arithmetic circuits defined in Sect. 5.2 of [7], using a group of size \(|\mathbb {G}|\). Let M denote the number of multiplication gates in the circuit and \(L\leqslant 2M\) the number of linear constraints. Let G be a statement generator performing at most \(q_G\) multi-exponentiations and \(\mathsf {P}^{*}\) be a cheating prover that performs at most \(q_{\mathsf {P}^*}\) multi-exponentiations each time it is run. Define \(\mathcal {B}\) as shown in Fig. 12. Assume \(|\mathbb {G}|\geqslant 2\), \(L\geqslant 1\), and \(M\geqslant 16\). Then in the generic group model,

$$\begin{aligned} \mathsf {Adv}^{\mathsf {sound}}_{\mathsf {PS},G}(\mathsf {P}^*) < \mathsf {Adv}^{\mathsf {wit}}_{\mathsf {PS},G}(\mathcal {B}) + \frac{13q_G + 258q_{\mathsf {P}^{*}}\cdot LM^3\log _2(M) + 644\cdot LM^4}{\sqrt{|\mathbb {G}|}}. \end{aligned}$$

The runtime of \(\mathcal {B}\) is roughly that of \(\mathsf {E}^{\dag }[\mathsf {T},\mathsf {X}_B]\) when given oracle access to \(\mathsf {P}^*\) and \(\mathsf {V}\) interacting, where \(\mathsf {X}_B\) is the Bulletproofs extractor.

We expect \(q_{\mathsf {P}^{*}}\) to be the largest of the parameters, so the bound is dominated by the \(O\left( {q_{\mathsf {P}^{*}}\cdot LM^3\log _2(M)}/{\sqrt{|\mathbb {G}|}}\right) \) term.
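To illustrate the magnitudes, the sketch below evaluates this dominant term for one illustrative parameter choice of ours (not values from the paper): a 256-bit group, \(M=2^{16}\), \(L=2M\), and a prover performing \(2^{20}\) multi-exponentiations per run.

```python
import math

G, M, qP = 2**256, 2**16, 2**20   # illustrative parameters only
L = 2 * M
dominant = 258 * qP * L * M**3 * math.log2(M) / math.sqrt(G)
print(f"dominant term = 2^{math.log2(dominant):.1f}")  # about 2^-31
```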

Proof

The bound is obtained by plugging our parameters (and \(\mu =3+\log _2(M)\)) into Theorem 4 and then simplifying the expression using \(|\mathbb {G}|\geqslant 2\), \(L\geqslant 1\), and \(M\geqslant 16\). The (straightforward) details are provided in the full version of this paper [13]. \(\square \)