1 Introduction

Cryptographic reductions support the security of a cryptographic scheme \(\mathsf{S}\) by showing that any attack against \(\mathsf{S}\) can be transformed into an algorithm for solving a problem \(\mathsf{P}\). The tightness of a reduction is in general some measure of how closely the reduction relates the resources of attacks against \(\mathsf{S}\) to the resources of the algorithm for \(\mathsf{P}\). A tighter reduction gives a better algorithm for \(\mathsf{P}\), ruling out a larger class of attacks against \(\mathsf{S}\). Typically one considers resources like runtime, success probability, and sometimes the number of queries (to oracles defined in \(\mathsf{P}\)) of the resultant algorithm when evaluating the tightness of a reduction.

This work revisits how we measure the resources of the algorithm produced by a reduction. We observe that memory usage is an often important but overlooked metric in evaluating cryptographic reductions. Consider typical “tight” reductions from the literature, which start with an attack against a scheme \(\mathsf{S}\) that uses (say) time \(t_S\) to achieve success probability \(\varepsilon _S\), and transform the attack into an algorithm for problem \(\mathsf{P}\) running in time \(t_P \approx t_S\) and succeeding with probability \(\varepsilon _P \approx \varepsilon _S\). We observe that reductions tight in this sense are sometimes highly memory-loose: If the attack against \(\mathsf{S}\) used \(m_S\) bits of working memory, the reduction may produce an algorithm using \(m_P \gg m_S\) bits of memory to solve \(\mathsf{P}\). Depending on \(\mathsf{P}\), this changes the conclusions we can draw about the security of the scheme.

In this paper we investigate memory-efficiency in cryptographic reductions in various settings. We show that some standard decisions in security definitions have a bearing on memory efficiency of possible reductions. We give several simple techniques for improving memory efficiency of certain classes of reductions, and finally turn to a connection between streaming algorithms and memory/time-efficient reductions.

Tightness, memory-tightness, and security. Reductions between a problem \(\mathsf{P}\) and a cryptographic scheme \(\mathsf{S}\) that approximately preserve runtime and success probability are usually called tight (cf. [6, 8, 17]). Tight reductions are preferred because they provide stronger assurance for the security of \(\mathsf{S}\). Specifically, let us call an algorithm running in time t and succeeding with probability \(\varepsilon \) a \((t,\varepsilon )\)-algorithm (for a given problem, or to attack a given scheme). Suppose that a reduction converts a \((t_S,\varepsilon _S)\)-adversary against scheme \(\mathsf{S}\) into a \((t_P,\varepsilon _P)\)-algorithm for \(\mathsf{P}\) where \((t_P,\varepsilon _P)\) are functions of the first two. If it is believed that no \((t_P,\varepsilon _P)\)-algorithm should exist for \(\mathsf{P}\), then one concludes that no \((t_S,\varepsilon _S)\)-adversary can exist against \(\mathsf{S}\).

If a reduction is not tight, then in order to conclude that scheme \(\mathsf{S}\) is secure against \((t_S,\varepsilon _S)\)-adversaries one must adjust the parameters of the instance of \(\mathsf{P}\) on which \(\mathsf{S}\) is built, leading to a less efficient construction. In some extreme cases, obtaining a reasonable security level for a scheme with a non-tight reduction leads to an impractical construction. Addressing this issue has become an active area of research in the last two decades (e.g. [4, 5, 6, 8, 11, 12, 18]).

In this work we keep track of the amount of memory used in reductions. To see when memory usage becomes relevant, let a \((t,m,\varepsilon )\)-algorithm use t time steps, m bits of memory, and succeed with probability \(\varepsilon \). A tight reduction from \(\mathsf{S}\) to \(\mathsf{P}\) transforms \((t_S,m_S,\varepsilon _S)\)-adversaries into \((t_P,m_P,\varepsilon _P)\)-algorithms, where “tight” guarantees \(t_S \approx t_P\) and \(\varepsilon _S \approx \varepsilon _P\), but permits \(m_P \gg m_S\), up to the worst-case \(m_P \approx t_P\).

Now, suppose concretely that we want \(\mathsf{S}\) to be secure against \((2^{256},2^{128},O(1))\)-adversaries, based on very conservative estimates of the resources available to a powerful government. Consider two possible “tight” reductions: One that is additionally “memory-tight” and transforms a \((2^{256},2^{128},O(1))\)-adversary \(\mathsf{A}\) against \(\mathsf{S}\) into a \((2^{256},2^{128},O(1))\)-algorithm \(\mathsf {B}_{\mathrm {mt}}\) for \(\mathsf{P}\), and one that is “memory-loose” and instead only yields a \((2^{256},2^{256},O(1))\)-algorithm \(\mathsf {B}_{\mathrm {nmt}}\) for \(\mathsf{P}\).

The crucial point is that some problems \(\mathsf{P}\) can be solved faster when larger amounts of memory are used. In our example above, it may be that \(\mathsf{P}\) is impossible to solve with \(2^{256}\) time and \(2^{128}\) memory for some specific security parameter \(\lambda \). But with both time and memory up to \(2^{256}\), the best algorithm may be able to solve instances of \(\mathsf{P}\) with security parameter \(\lambda \), and with even larger parameters up to some \(\lambda '> \lambda \). The memory-looseness of the reduction now bites, because to achieve the original security goal for \(\mathsf{S}\) we must use the larger parameter \(\lambda '\) for \(\mathsf{P}\), resulting in a slower instantiation of the scheme. When \(\mathsf{P}\) is a problem involving a symmetric primitive where the “security parameter” cannot be changed, the issue is more difficult to address.

We now address two points in turn: If \(\mathsf{P}\) is easier to solve when large memory is available, what does this mean for memory-tight reductions? And when are reductions “memory-loose”?

Fig. 1. Time/memory trade-off plots for collision-resistance (\(\mathsf {CR}_2\), left), triple collision-resistance (\(\mathsf {CR}_3\), middle) and LPN with dimension 1024 and error rate 1/4 (right). All plots are log-log and the axes on the right plot are not to scale.

Memory-sensitive problems and memory-tightness. Many, but not all, problems \(\mathsf{P}\) relevant to cryptography can be solved more quickly with large memory than with small. In the public-key realm these include factoring, discrete-logarithm in prime fields, Learning Parities with Noise (LPN), Learning With Errors (LWE), approximate Shortest Vector Problem, and Short Integer Solution (SIS). In symmetric-key cryptography such problems include key-recovery against multiple-encryption, finding multi-collisions in hash functions, and computation of memory-hard functions. We refer to problems like these as memory-sensitive. (We refer to Sect. 6 for more discussion.)

On the other hand, problems \(\mathsf{P}\) exist where the best known algorithm also uses small memory: Discrete-logarithm in elliptic curve groups over prime-fields [16], finding (single) collisions in hash functions [23], finding a preimage in hash functions (exhaustive search), and key recovery against block-ciphers (also exhaustive search).

Let us consider some specific examples to illustrate the impact of a memory-loose reduction to a non-memory-sensitive versus a memory-sensitive problem. Let \(\mathsf {CR}_k\) be the problem of finding a k-way collision in a hash function \(\mathsf{H}\) with \(\lambda \) output bits, that is, finding k distinct domain points \(x_1,\ldots ,x_k\) such that \(\mathsf{H}(x_1) = \mathsf{H}(x_2) = \cdots = \mathsf{H}(x_k)\) for some fixed \(k\ge 2\).

First suppose we reduce the security of a scheme \(\mathsf{S}\) to \(\mathsf {CR}_2\), which is standard collision-resistance. The problem \(\mathsf {CR}_2\) is not memory-sensitive, and the best known attack is a \((2^{\lambda /2}, O(1), O(1))\)-algorithm. In the left plot of Fig. 1 we visualize the “feasible” region for \(\mathsf {CR}_2\), where the shaded region is unsolvable. Now we consider two possible reductions. One is a memory-tight reduction which maps an adversary \(\mathsf{A}\) (with some time and memory complexity, with possibly much less memory than time) to an algorithm \(\mathsf {B}_{\mathrm {mt}}\) for \(\mathsf {CR}_2\) with the same time and memory. The other reduction is memory-loose (but time-tight) and maps \(\mathsf{A}\) to an adversary \(\mathsf {B}_{\mathrm {nmt}}\) that uses time and memory approximately equal to the time of \(\mathsf{A}\). We plot the effect of these reductions in the left part of the figure. A memory-tight reduction leaves the point essentially unchanged, while a memory-loose reduction moves the point horizontally to the right. Both reductions will produce an algorithm for \(\mathsf {CR}_2\) in the region not known to be solvable, thus giving a meaningful security statement about \(\mathsf{A}\) that amounts to ruling out the shaded region of adversaries. We do note that there is a possible quantitative difference in the guarantees of the reductions, since it is only harder to produce an algorithm with smaller memory, but this benefit is difficult to measure.

Now suppose instead that we reduce the security of a scheme \(\mathsf{S}\) to \(\mathsf {CR}_3\). The best known attack against \(\mathsf {CR}_3\) is a \((2^{(1-\alpha )\lambda },2^{\alpha \lambda },O(1))\)-algorithm due to Joux and Lucks [20], for any \(\alpha \le 1/3\). We visualize this time-memory trade-off in the middle plot of Fig. 1, and again any adversary with time and memory in the shaded region would be a cryptanalytic advance. We again consider a memory-tight versus a memory-loose reduction. The memory-tight reduction preserves the point for the adversary \(\mathsf{A}\) in the plot and thus rules out \((t_S,m_S,O(1))\) adversaries for any \(t_S,m_S\) in the shaded region. A memory-loose (but time-tight) reduction mapping \(\mathsf{A}\) to \(\mathsf {B}_{\mathrm {nmt}}\) for \(\mathsf {CR}_3\) that blows up memory usage up to time usage will move the point horizontally to the right. We can see that there are drastic consequences when the original adversary \(\mathsf{A}\) lies in the triangular region with time \({>}2\lambda /3\) and memory \({<}\lambda /3\), because the reduction produces an adversary \(\mathsf {B}_{\mathrm {nmt}}\) using resources for which \(\mathsf {CR}_3\) is known to be broken. In summary, the reduction only rules out adversaries \(\mathsf{A}\) below the horizontal line with time \(=2\lambda /3\).
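As a worked instance of the two cases (the exponents 0.8 and 0.1 are chosen here purely for illustration, and constants are ignored), take an adversary \(\mathsf{A}\) running in time \(2^{0.8\lambda }\) with memory \(2^{0.1\lambda }\). The two reductions then yield

$$\begin{aligned} \mathsf {B}_{\mathrm {mt}}&: \ (2^{0.8\lambda },\ 2^{0.1\lambda },\ O(1)), \\ \mathsf {B}_{\mathrm {nmt}}&: \ (2^{0.8\lambda },\ 2^{0.8\lambda },\ O(1)). \end{aligned}$$

With memory \(2^{0.1\lambda }\), i.e. \(\alpha = 0.1\), the trade-off above needs time \(2^{0.9\lambda } > 2^{0.8\lambda }\), so \(\mathsf {B}_{\mathrm {mt}}\) would beat the best known attack and \(\mathsf{A}\) is ruled out. In contrast, the \(\alpha = 1/3\) attack uses time \(2^{2\lambda /3}\) and memory \(2^{\lambda /3}\), both within the budget of \(\mathsf {B}_{\mathrm {nmt}}\), so the existence of \(\mathsf {B}_{\mathrm {nmt}}\) contradicts nothing and the memory-loose reduction gives no guarantee against \(\mathsf{A}\).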

Finally we consider an example instantiation of parameters for the learning parities with noise (LPN) problem, which is memory-sensitive, where a memory-loose reduction would diminish security guarantees. In Sect. 6 we recall this problem and the best attacks, and in the right plot of Fig. 1 the shaded region represents the infeasible region for the problem in dimension 1024 and error rate 1/4. (For simplicity, all hidden constants are ignored in the plot.) In this problem the effect of memory-looseness is more stark. Despite using a large dimension, a memory-loose reduction can only rule out attacks running in time \({<}2^{85}\). A memory-tight reduction, however, gives a much stronger guarantee for adversaries with memory less than \(2^{85}\).

Memory-loose reductions. Reductions are often memory-loose, and small decisions in definitions can lead to memory usage being artificially high. We start with an illustrative example.

Suppose we have a tight security reduction (in the traditional sense) in the random oracle model [7] between a problem \(\mathsf{P}\) and some cryptographic scheme \(\mathsf{S}\). More concretely, suppose a reduction transforms a \((t_S,m_S,\varepsilon _S)\)-adversary \(\mathsf{A}_S\) in the random-oracle model into a \((t_P,m_P,\varepsilon _P)\)-algorithm \(\mathsf{A}_P\) for \(\mathsf{P}\). A typical reduction has \(\mathsf{A}_P\) simulate a security game for \(\mathsf{A}_S\), including the random oracle, usually via a table that stores responses to queries issued by \(\mathsf{A}_S\). Naively removing the table from storage usually is not an option for various reasons: For example, if \(\mathsf{A}_S\) queries the oracle on the same input twice, then it expects to see the same output twice, or perhaps the reduction needs to “program” the random oracle with responses that must be remembered.

Storing a table for the random oracle may dramatically increase memory usage of the algorithm \(\mathsf{A}_P\). If adversary \(\mathsf{A}_S\) makes \(q_H\) queries to the random oracle, then \(\mathsf{A}_P\) will store \(\varOmega (q_H)\) bits of memory, plus the internal memory \(m_S\) of \(\mathsf{A}_S\) during the simulation, which gives

$$ m_P = m_S + \varOmega (q_H). $$

In the worst case, \(\mathsf{A}_S\) could run in constant memory and make one random oracle query per time unit, meaning that \(\mathsf{A}_P\) requires as much memory as its running time. Thus the reduction may be “tight” in the traditional sense with \(t_P \approx t_S, \varepsilon _P \approx \varepsilon _S\), but also have

$$\begin{aligned} m_P = m_S + t_S. \end{aligned} \tag{1}$$

Thus \(\mathsf{A}_P\) may use an enormous amount of memory \(m_P\) even if \(\mathsf{A}_S\) satisfied \(m_S = O(1)\).
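The table-based simulation described above can be sketched as follows. This is a minimal illustration only: the class name `LazyRandomOracle`, the 32-byte output length, and the counter-based adversary are assumptions made for the example, not part of any particular reduction.

```python
import os

class LazyRandomOracle:
    """Lazily sampled random oracle; the table is the memory cost."""

    def __init__(self, out_len: int = 32):
        self.out_len = out_len
        self.table = {}  # one entry per distinct query made so far

    def query(self, x: bytes) -> bytes:
        # Sample a fresh response on first use, then replay it so that
        # repeated queries receive consistent answers.
        if x not in self.table:
            self.table[x] = os.urandom(self.out_len)
        return self.table[x]

def run_adversary(oracle: LazyRandomOracle, q_h: int) -> None:
    # Hypothetical adversary with constant local memory: it keeps only a
    # loop counter and makes one fresh oracle query per step.
    for i in range(q_h):
        oracle.query(i.to_bytes(8, "big"))

ro = LazyRandomOracle()
run_adversary(ro, 1000)
```

After the run, `len(ro.table)` equals the number of distinct queries, so the simulator's memory grows with \(q_H\) even though the adversary itself stores almost nothing.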

This example is only the start. Memory-looseness is sometimes, but not always, easily fixed, and seems to occur because it was not measured in reductions. Below we will furnish examples of other reductions that are (sometimes implicitly) memory-loose. We will also discuss some decisions in definitions and modeling that dramatically affect memory usage but are not usually stressed.

1.1 Our Results

Even though there exists an extensive literature on tightness of cryptographic security reductions (e.g. [5, 8, 11, 12]), memory has, to the best of our knowledge, not been considered in the context of security reductions. In this paper we first identify the problems related to non-memory-tight security reductions. To overcome the problems, we initiate a systematic study on how to make known security reductions memory-tight. Concretely, we provide several techniques to obtain memory-efficient reductions and give examples where they can be applied. Our techniques can be used to make many security reductions memory-tight, but not all of them. Furthermore, we show that this is inherent, i.e., that there exist natural cryptographic problems that do not have a fully tight security reduction. Finally, we examine various memory-sensitive problems such as the learning parity with noise (LPN) problem, the factoring problem, and the discrete logarithm problem over finite fields.

The Random Oracle technique. Recall that a classical simulation of the random oracle using the lazy sampling technique requires the reduction to store \(O(q_H)\) values. The idea is to replace the responses H(x) to a random oracle query x by \(\mathsf {PRF}(k,x)\), where \(\mathsf {PRF}\) is a pseudo-random function and k is its key. The limitation of this technique is that it can only be applied to very restricted cases of a programmable random oracle.
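A minimal sketch of this idea, assuming HMAC-SHA256 as a stand-in for \(\mathsf {PRF}\) (the choice of HMAC and the class name `PRFRandomOracle` are assumptions for illustration): the simulator keeps only the key k and recomputes each response on demand.

```python
import hashlib
import hmac
import os

class PRFRandomOracle:
    """Answers random oracle queries with PRF(k, x) instead of a table."""

    def __init__(self):
        # The key is the only state the simulation keeps.
        self.key = os.urandom(32)

    def query(self, x: bytes) -> bytes:
        # Repeated queries are consistent because the PRF is deterministic.
        return hmac.new(self.key, x, hashlib.sha256).digest()

oracle = PRFRandomOracle()
first = oracle.query(b"hello")
second = oracle.query(b"hello")
```

The state is independent of the number of queries, at the price of one PRF evaluation per query and an additional PRF advantage term in the analysis. Programming responses at chosen points is not supported by this sketch.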

The Rewinding Technique. The idea of the rewinding technique is to use the adversary as a “memory device.” Concretely, whenever the reduction would like to access values previously output by the adversary that it did not store in its memory, it simply rewinds the adversary which is executed with the same random coins and with the same input. This way the reduction’s running time doubles, but (unlike previous applications of the rewinding technique in cryptography, e.g., [22]) the overall success probability does not decrease. The rewinding technique can be applied multiple times providing a trade-off between memory efficiency and running time of the reduction. To exemplify the techniques, we show a memory-tight security reduction to the RSA full-domain hash signature scheme in the appendix.
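The "memory device" idea can be sketched as follows. The adversary here is a hypothetical deterministic generator driven by a seed that plays the role of its random coins; Python's `random.Random` is used only to make the run reproducible and is not a cryptographic source.

```python
import random

def adversary(inp: int, coins: int):
    """Hypothetical adversary: deterministic given its input and coins."""
    rng = random.Random(coins)
    for _ in range(1000):
        yield (inp + rng.getrandbits(32)) % 2**32

def first_pass(inp: int, coins: int) -> int:
    # Consume the adversary's outputs without storing any of them.
    count = 0
    for _ in adversary(inp, coins):
        count += 1
    return count

def recall(inp: int, coins: int, index: int) -> int:
    # Rewind: rerun the adversary from scratch with the same input and
    # coins, and stop at the output the reduction wants to look at.
    for i, value in enumerate(adversary(inp, coins)):
        if i == index:
            return value
    raise IndexError(index)

total = first_pass(7, 42)
value = recall(7, 42, 500)
```

Each call to `recall` costs up to one extra run of the adversary, which is the time/memory trade-off mentioned above.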

A Lower Bound. Some reductions appear (to us at least) to inherently require increased memory. We take a first step towards formalizing this intuition by proving a lower bound on the memory usage of a class of black-box reductions in two scenarios.

First, we revisit a reduction implicitly used to justify the standard unforgeability notion for digital signatures, which reduces a game with several chances to produce a valid forgery to the standard game with only one chance. One can take this as a possible indication that signatures with memory-tight reductions in the more permissive model may be preferred. Second, we prove a similar lower bound on the memory usage of a class of reductions between a “multi-challenge” variant of collision resistance and standard collision resistance.

Interestingly, our lower bound follows from a result on streaming algorithms, which are designed to use small space while working with sequential access to a large stream of data.

Open problems. This work initiates the study of memory-tight reductions in cryptography. We give a number of techniques to obtain such reductions, but many open problems remain. There are likely other reductions in the literature that we have not covered, and to which our techniques do not apply. It is even unclear how one should consider basic definitions, like unforgeability for signatures, since the generic reductions from more complicated (but more realistic) definitions may be tight but not memory-tight.

One reduction we did consider, but could not improve, is the IND-CCA security proof for Hash ElGamal in the random oracle model [1] under the gap Diffie-Hellman assumption. This reduction (and some others that use “gap” assumptions) uses its random oracle table in a way that our techniques cannot address. We conjecture that a memory-tight reduction does not exist in this case, and leave it as an open problem to (dis)prove our conjecture.

2 Complexity Measures

We denote random sampling from a finite set A according to the uniform distribution with \(a \overset{\$}{\leftarrow } A\). By \(\mathrm {Ber}(\alpha )\) we denote the Bernoulli distribution for parameter \(\alpha \), i.e., the distribution of a random variable that takes value 1 with probability \(\alpha \) and value 0 with probability \(1-\alpha \); by \( {\mathbb {P}}_{\ell } \) the set of primes of bit size \(\ell \) and by \(\log \) the logarithm with base 2.

2.1 Computational Model

Computational model. All algorithms in this paper are taken to be RAMs. These programs have access to memory with words of size \(\lambda \), along with a constant number of registers that each hold one word. In this paper \(\lambda \) will always be the security parameter of a construction or a problem under consideration.

We define probabilistic algorithms to be RAMs with a special instruction that fills a distinguished register with random bits (independent of other calls to the special instruction). We note that this instruction does not allow for rewinding of the random bits, so if the algorithm wants to access previously used random bits then it must store them. Running an algorithm \(\mathsf{A}\) means executing a RAM machine with input written in its memory (starting at address 0). If \(\mathsf{A}\) is randomized, we write \(y \overset{\$}{\leftarrow } \mathsf{A}(I)\) to denote the random variable y that is obtained by running \(\mathsf{A}\) on input I (which may consist of a tuple \(I=(I_1, \ldots , I_n)\)). If \(\mathsf{A}\) is deterministic, we write \(\leftarrow \) instead of \(\overset{\$}{\leftarrow }\). We sometimes give an algorithm \(\mathsf{A}\) access to stateful oracles \({\mathsf {O}}_1,{\mathsf {O}}_2,\ldots ,{\mathsf {O}}_n\). Each \({\mathsf {O}}_i\) is defined by a RAM \(M_i\). We also define an associated string \(\mathsf {st}_{\mathsf {O}}\) called the oracle state that is stored in a protected region of the memory of \(\mathsf{A}\) that can only be read by the oracles. Initially \(\mathsf {st}_{\mathsf {O}}\) is defined to be empty. An algorithm \(\mathsf{A}\) calls an oracle \({\mathsf {O}}_i\) via a special instruction, which runs the corresponding RAM on input from a fixed region of memory of \(\mathsf{A}\) along with the oracle state \(\mathsf {st}_{\mathsf {O}}\). The RAM \(M_i\) uses its own protected working memory, and finally its output is written into a fixed region of memory for \(\mathsf{A}\), the updated state is written to \(\mathsf {st}_{\mathsf {O}}\), and control is transferred back to \(\mathsf{A}\).

Games. Most of our security definitions and proofs use code-based games [9]. A game \(\mathsf{G}\) consists of a RAM defining an \(\mathsf {Init}\) oracle, zero or more stateful oracles \({\mathsf {O}}_1,\ldots ,{\mathsf {O}}_n\), and a \(\mathsf {Fin}\) RAM oracle. An adversary \(\mathsf{A}\) is said to play game \(\mathsf{G}\) if its first instruction calls \(\mathsf {Init}\) (handing over its own input) and its last instruction calls \(\mathsf {Fin}\), and in between these calls it only invokes \({\mathsf {O}}_1,\ldots ,{\mathsf {O}}_n\) and performs local computation. We further require that \(\mathsf{A}\) outputs whatever \(\mathsf {Fin}\) outputs.

Executing game \(\mathsf{G}\) with \(\mathsf{A}\) is formally just running \(\mathsf{A}\) with input \(\lambda \), the security parameter. Keeping with convention, we denote the random variable induced by executing \(\mathsf{G}\) with \(\mathsf{A}\) as \(\mathsf{G}^\mathsf{A}\) (where the sample space is the randomness of \(\mathsf{A}\) and the associated oracles). By \(\mathsf{G}^\mathsf{A} \Rightarrow {\texttt {out}}\) we denote the event that \(\mathsf{G}\) executed with \(\mathsf{A}\) outputs \({\texttt {out}}\). In our games we sometimes denote a “Stop” command that takes an argument. When Stop is invoked, its argument is considered the output of the game (and the execution of the adversary is halted). If a game description omits the \(\mathsf {Fin}\) procedure, it means that when \(\mathsf{A}\) calls \(\mathsf {Fin}\) on some input x, \(\mathsf {Fin}\) simply invokes Stop with argument x. By default, integer variables are initialized to 0, set variables to \(\emptyset \), strings to the empty string and arrays to the empty array.

2.2 Complexity Measures

This work is concerned with measuring the resource consumption of an adversary in a way that allows for meaningful conclusions about security. Success probabilities and time are widely used in the cryptographic literature with general agreement on the details, which we recall first. Memory consumption of reductions is however new, so we next discuss the possible options in measuring memory and the implications.

Success Probability. We define the success probability of \(\mathsf{A}\) playing game \(\mathsf{G}\) as \(\mathbf {Succ}(\mathsf{G}^\mathsf{A}) := \Pr [\mathsf{G}^\mathsf{A}\Rightarrow 1]\).

Runtime. Let \(\mathsf{A}\) be an algorithm (RAM) with no oracles. The runtime of \(\mathsf{A}\), denoted \(\mathbf {Time}(\mathsf{A})\), is the worst-case number of computation steps of \(\mathsf{A}\) over all inputs of bit-length \(\lambda \) and all possible random choices. Now let \(\mathsf{G}\) be a game and \(\mathsf{A}\) be an adversary that plays game \(\mathsf{G}\). The runtime of executing \(\mathsf{G}\) with \(\mathsf{A}\) is usually taken to be the number of computation steps of \(\mathsf{A}\) plus the number of computation steps of each RAM used to respond to oracle queries: We denote this as \(\mathbf {TotalTime}(\mathsf{G}^\mathsf{A})\) or \(\mathbf {TotalTime}(\mathsf{A})\). One may prefer not to include the time used by the oracles, and in this case we denote \(\mathbf {LocalTime}(\mathsf{G}^\mathsf{A})\) or \(\mathbf {LocalTime}(\mathsf{A})\) to be the number of steps of \(\mathsf{A}\) only.

Memory. We define the memory consumption of a RAM program \(\mathsf{A}\) without oracles, denoted \(\mathbf {Mem}(\mathsf{A})\), to be the size (in words of length \(\lambda \)) of the code of \(\mathsf{A}\) plus the worst-case number of registers used in memory at any step in computation, over all inputs of bit-length \(\lambda \) and all random choices. Now let \(\mathsf{G}\) be a game and \(\mathsf{A}\) be an adversary that plays game \(\mathsf{G}\). The memory required to execute game \(\mathsf{G}\) with \(\mathsf{A}\) includes the memory needed to input and output to \(\mathsf{A}\), as well as input and output to each oracle, along with the working memory and state of each oracle. We denote this as \(\mathbf {TotalMem}(\mathsf{G}^\mathsf{A})\) or \(\mathbf {TotalMem}(\mathsf{A})\). Alternatively, one may measure only the code and memory consumed by \(\mathsf{A}\), but not its oracles. We denote this measure by \(\mathbf {LocalMem}(\mathsf{A})\).

One advantage of the \(\mathbf {LocalMem}\) measure is that it can avoid small details of security definitions drastically changing the meaning of memory-tightness in reductions.

Sometimes it will be convenient to measure the memory consumption in bits, in which case we use \(\mathbf {Mem}_2(\mathsf{A})\), \(\mathbf {LocalMem}_2(\mathsf{A})\), and \(\mathbf {TotalMem}_2(\mathsf{A})\).

2.3 Case Study I: Unforgeability of Digital Signatures

Let \((\mathsf {Gen},\mathsf {Sign},\mathsf {Ver})\) be a digital signature scheme (see Sect. 5 for the exact syntax of signatures, which is standard). On the left side of Fig. 2 we recall the game \(\mathsf {UFCMA}\) that defines the standard notion of (existential) unforgeability under chosen-message attacks. The advantage of an adversary \(\mathsf{A}\) is defined by \(\mathbf {Adv}(\mathsf {UFCMA}^{\mathsf{A}}) = \mathbf {Succ}(\mathsf {UFCMA}^\mathsf{A})\), and a signature scheme where \(\mathbf {Adv}(\mathsf {UFCMA}^{\mathsf{A}}) \) is “small” for some class of adversaries is usually defined to be “secure”. In order for the definition to be meaningful, the game \(\mathsf {UFCMA}\) checks that the signature \(\sigma ^*\) on \(m^*\) is valid, and also that \(m^*\) was not queried to the signing oracle. In our version of the definition, the signing oracle maintains a set S of messages that were queried, and the game uses S to check if \(m^*\) was queried.

The \(\mathsf {UFCMA}\) game is an example where we prefer \(\mathbf {LocalMem}\) to \(\mathbf {TotalMem}\). Any adversary \(\mathsf{A}\) playing \(\mathsf {UFCMA}\) will always have \(\mathbf {TotalMem}(\mathsf{A}) = \varOmega (q_S)\), where \(q_S\) is the number of signature queries it issues, while it may have \(\mathbf {LocalMem}(\mathsf{A})\) much smaller. Restricting the number of signing queries \(q_S\) is an option but weakens the definition.
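The role of the set S can be sketched as follows. Since Fig. 2 is not reproduced here, the game structure below is an assumption based on the description in the text, and the signature scheme is a toy hash-then-RSA stand-in with insecure parameters chosen only so the example runs.

```python
import hashlib

# Toy RSA parameters (61 * 53); insecure, for illustration only.
N, E, D = 3233, 17, 2753

def h(m: bytes) -> int:
    return int.from_bytes(hashlib.sha256(m).digest(), "big") % N

def sign(m: bytes) -> int:
    return pow(h(m), D, N)

def ver(m: bytes, sig: int) -> bool:
    return pow(sig, E, N) == h(m)

class UFCMAGame:
    def __init__(self):
        self.S = set()  # oracle state: every message queried so far

    def sign_oracle(self, m: bytes) -> int:
        self.S.add(m)
        return sign(m)

    def fin(self, m_star: bytes, sig_star: int) -> int:
        # Win only with a valid signature on a message never queried.
        return int(ver(m_star, sig_star) and m_star not in self.S)

game = UFCMAGame()
for i in range(100):
    game.sign_oracle(b"m-%d" % i)
```

The set `game.S` holds one entry per distinct signing query and is counted by \(\mathbf {TotalMem}\), whereas an adversary that only loops over a counter has small \(\mathbf {LocalMem}\).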

An alternative style of definition for unforgeability is to limit the class of adversaries \(\mathsf{A}\) considered to those that are “well behaved” in that they never submit an \(m^*\) that was previously queried. The game no longer needs to track which messages were queried to the signing oracle in order to be meaningful. This definition is equivalent up to a small increase in (local) running time, but it is not clear if the same is true for memory. To convert any adversary to be well behaved, natural approaches mimic our version of the game, storing a set S and checking the final forgery locally before submitting.

We contend that there is good reason to prefer our definition over the version that only quantifies over well-behaved adversaries. In principle, it is possible that a signature construction is secure against a class of well-behaved adversaries (say, running in a bounded amount of time and memory) but not against general adversaries running with the same time/memory. Counter-intuitively, such a general adversary might produce a forgery without knowing itself if the forgery is fresh and thus wins the game. Since we cannot rule this out, we prefer our stronger definition.

Fig. 2. Games \(\mathsf {UFCMA},\mathsf {mUFCMA}\).

Stronger unforgeability. Games in many crypto-definitions are chosen to be simple and compact but also general. The game \(\mathsf {UFCMA}\) only allows a single attempt at a forgery in order to shorten proofs, but the definition also tightly implies (up to a small increase in runtime) a version of unforgeability where the attacker gets many attempts, which more closely models usages where an attacker will have many chances to produce a forgery.

It is less clear how \(\mathsf {UFCMA}\) relates to more general definitions when memory tightness is taken into account. To make this more concrete, consider the game \(\mathsf {mUFCMA}\) (for “many \(\mathsf {UFCMA}\)”) on the right side of Fig. 2. In this game the adversary has an additional verification oracle. If it ever submits a fresh forgery to this oracle, it wins the game. It is easy to give a tight, but non-memory-tight, reduction converting any \((t,m,\varepsilon )\)-adversary playing \(\mathsf {mUFCMA}\) into a \((t',m',\varepsilon )\)-adversary playing \(\mathsf {UFCMA}\) for \(t' \approx t\) but \(m' \gg m\). Other trade-offs are also possible but achieving tightness in all three parameters seems difficult.
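The tight but memory-loose reduction mentioned above can be sketched as follows. The interface is hypothetical: the text does not specify what the verification oracle returns, so returning the validity bit is an assumption, and the toy hash-then-RSA scheme with insecure parameters serves only to make the example runnable.

```python
import hashlib

# Toy RSA parameters (61 * 53); insecure, for illustration only.
N, E, D = 3233, 17, 2753

def h(m: bytes) -> int:
    return int.from_bytes(hashlib.sha256(m).digest(), "big") % N

def sign(m: bytes) -> int:
    return pow(h(m), D, N)

def ver(m: bytes, sig: int) -> bool:
    return pow(sig, E, N) == h(m)

def reduce_to_ufcma(m_adversary, sign_oracle):
    """Run an mUFCMA adversary and extract a single UFCMA forgery."""
    queried = set()  # local copy of all signed messages: the memory cost
    forgery = []

    def sim_sign(m):
        queried.add(m)
        return sign_oracle(m)

    def sim_verify(m, sig):
        valid = ver(m, sig)
        # Keep the first valid signature on a message never signed.
        if valid and m not in queried and not forgery:
            forgery.append((m, sig))
        return valid

    m_adversary(sim_sign, sim_verify)
    return (forgery[0] if forgery else None), len(queried)

def toy_adversary(sign_q, verify_q):
    # Hypothetical adversary; it cheats with the secret key so that the
    # example has a fresh forgery to find.
    for i in range(50):
        sign_q(b"msg-%d" % i)
    verify_q(b"msg-0", sign(b"msg-0"))  # valid but not fresh
    verify_q(b"fresh", sign(b"fresh"))  # valid and fresh

result, stored = reduce_to_ufcma(toy_adversary, sign)
```

The reduction adds little running time and loses no success probability, but `queried` grows with the number of signing queries, which is exactly the \(m' \gg m\) blow-up.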

For the reasons described in the introduction, a memory-tight reduction from winning \(\mathsf {mUFCMA}\) to winning \(\mathsf {UFCMA}\) is desirable. In Sect. 4, we show that a certain class of black-box reductions for these problems in fact cannot be simultaneously tight in runtime, memory, and success probability. We conclude that signatures with dedicated memory-tight proofs against adversaries in the \(\mathsf {mUFCMA}\) game may provide stronger security assurance, especially when security is reduced to a memory-sensitive problem like RSA.

We remark that the common reduction from multi-challenge to single-challenge \(\mathsf {IND}\text {-}\mathsf {CPA}\)/\(\mathsf {IND}\text {-}\mathsf {CCA}\) security for public-key encryption is memory tight (but not tight in terms of the success probability).

2.4 Case Study II: Collision-Resistance Definitions

Collision-resistance, and multi-collision-resistance of hash functions, is used for security reductions in many contexts. Let \(\mathsf{H}\) be a keyed hash function (with \(\kappa \)-bit keys), with standard syntax. On the left side of Fig. 3 we recall the game \(\mathsf {CR}_t\) used to define t-collision resistance. The game provides no extra oracles, and \(\mathsf{A}\) wins if it can find t domain points that are mapped to the same point by \(\mathsf{H}\).

As we will see in later sections, it is sometimes feasible to make typical reductions to \(\mathsf {CR}_t\) memory-tight. We however now consider using collision-resistance (for \(t=2\)) for domain extension of pseudorandom functions. Let \(\mathsf{F}:\{0,1\}^\kappa \times \{0,1\}^\delta \rightarrow \{0,1\}^\rho \) be a keyed function with input-length \(\delta \) which should have random looking input/output behavior to some class of adversaries (see Sect. 3.1 for a formal definition of PRFs). We can define a new keyed function \(\mathsf{F}^*\) that takes arbitrary-length inputs by

$$\begin{aligned}&\mathsf{F}^*:\{0,1\}^{2\kappa }\times \{0,1\}^*\rightarrow \{0,1\}^\rho , \\&\mathsf{F}^*((k,k_h),\ x) = \mathsf{F}(k,\ \mathsf{H}(k_h, x)). \end{aligned}$$
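The hash-then-PRF composition can be sketched as follows. The concrete primitives are assumptions chosen for illustration: keyed BLAKE2b stands in for \(\mathsf{H}\) and HMAC-SHA256 restricted to 32-byte inputs stands in for \(\mathsf{F}\).

```python
import hashlib
import hmac

def H(k_h: bytes, x: bytes) -> bytes:
    # Stand-in keyed hash (assumption): BLAKE2b with a 32-byte digest.
    return hashlib.blake2b(x, key=k_h, digest_size=32).digest()

def F(k: bytes, block: bytes) -> bytes:
    # Stand-in fixed-input-length PRF (assumption): HMAC-SHA256 on
    # exactly 32 bytes of input.
    if len(block) != 32:
        raise ValueError("F takes exactly 32 bytes of input")
    return hmac.new(k, block, hashlib.sha256).digest()

def F_star(key: tuple, x: bytes) -> bytes:
    # F*((k, k_h), x) = F(k, H(k_h, x)), for inputs of arbitrary length.
    k, k_h = key
    return F(k, H(k_h, x))

key = (b"\x01" * 32, b"\x02" * 32)
short_out = F_star(key, b"")
long_out = F_star(key, b"x" * 10_000)
```

Two inputs that collide under \(\mathsf{H}\) necessarily produce the same output of \(\mathsf{F}^*\), which is why the proof must first bound the probability of such a collision.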

The proof that \(\mathsf{F}^*\) is a PRF is an easy hybrid argument. One first bounds the probability that an adversary submits two inputs that collide in \(\mathsf{H}\). Once this probability is known to be small, the memory-tight reduction to the pseudorandomness of \(\mathsf{F}\) is immediate.

Naive attempts at the reduction to collision-resistance are however not memory-tight. One can run the adversary attacking \(\mathsf{F}^*\) and record its queries, checking for any collisions, but this increases memory usage.

To model what such a proof is trying to do, we formulate a new game for t-collision resistance called \(\mathsf {mCR}_t\) in the right side of Fig. 3. In the game, the adversary has an oracle \(\mathsf {ProcInput}\) that takes a message and adds it to a set S. At the end of the game, the adversary wins if S contains any t inputs that are mapped to the same point. The game implements this check using counters stored in a dictionary.
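A possible shape of this game is sketched below. Since Fig. 3 is not reproduced here, the structure is an assumption based on the description above; the class and method names are hypothetical, and keyed BLAKE2b with a configurable digest size stands in for \(\mathsf{H}\) so that collisions can be forced with a tiny output length.

```python
import hashlib
from collections import Counter

class MCRGame:
    def __init__(self, t: int, key: bytes, digest_size: int = 32):
        self.t = t
        self.key = key
        self.digest_size = digest_size
        self.S = set()           # distinct inputs submitted so far
        self.counts = Counter()  # hash value -> number of inputs in S

    def _hash(self, x: bytes) -> bytes:
        return hashlib.blake2b(
            x, key=self.key, digest_size=self.digest_size
        ).digest()

    def proc_input(self, x: bytes) -> None:
        # Count each distinct input once under its hash value.
        if x not in self.S:
            self.S.add(x)
            self.counts[self._hash(x)] += 1

    def fin(self) -> int:
        # Win if some hash value has at least t distinct preimages in S.
        return int(any(c >= self.t for c in self.counts.values()))

# With a 1-byte digest, 257 distinct inputs must contain a 2-collision.
weak = MCRGame(t=2, key=b"k" * 16, digest_size=1)
for i in range(257):
    weak.proc_input(i.to_bytes(2, "big"))
won = weak.fin()
```

Both `S` and `counts` grow with the number of oracle calls, so the game itself, rather than the adversary, carries the memory cost of detecting the collision.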

Fig. 3. Games \(\mathsf {CR}_t,\mathsf {mCR}_t\).

Returning to the proof for \(\mathsf{F}^*\), one can easily construct an adversary to play \(\mathsf {mCR}_2\) using any PRF adversary. The resulting reduction will be memory-tight. Thus it would be desirable to have a memory-tight reduction from \(\mathsf {mCR}_2\) to \(\mathsf {CR}_2\) to complete the proof. This however seems difficult or even impossible, and in Sect. 4 we show that a class of black-box reductions cannot be memory-tight. As discussed in the introduction, t-collision-resistance is not memory sensitive for \(t=2\), and thus the meaning of a memory-tight reduction is somewhat diminished (i.e. it does not justify more aggressive parameter settings). For \(t>2\) the effect of memory-tightness is more significant.

3 Techniques to Obtain Memory Efficiency

In this section we describe four techniques to obtain memory-efficient reductions. In Sect. 5 we show how to apply those techniques to memory-tightly prove the security of the RSA Full Domain Hash signature scheme [7]. Using this example we also point to technical challenges that may arise when applying multiple techniques in the same proof.

3.1 Pseudorandom Functions

First, we formally define pseudorandom functions. They are the main tool used in this section to make reductions memory efficient.

Definition 1

Let \(\kappa \), \(\delta \) and \(\rho \) be integers. Further let \( \mathsf{F}:\{0,1\}^\kappa \times \{0,1\}^\delta \rightarrow \{0,1\}^\rho \) be a deterministic algorithm and let \(\mathsf{A}\) be an adversary that is given access to an oracle and outputs a single bit. The PRF advantage of \(\mathsf{A}\) is defined as \( \mathbf {Adv}(\mathsf {PRF}^\mathsf{A}) := |\mathbf {Succ}(\mathsf {Real}^\mathsf{A})-\mathbf {Succ}(\mathsf {Random}^\mathsf{A})|, \) where \(\mathsf {Real}\) and \(\mathsf {Random}\) are the games depicted in Fig. 4.

Fig. 4. Games defining PRF and \(\alpha \)-PRF advantage.

If the range of \(\mathsf{F}\) is just a single bit \(\{0,1\}\), we define the \(\alpha \)-PRF advantage with bias \(0 \le \alpha \le 1\) of \(\mathsf{A}\) as \( \mathbf {Adv}(\mathsf {PRF}_\alpha ^\mathsf{A}) := |\mathbf {Succ}(\mathsf {Real}^\mathsf{A})-\mathbf {Succ}(\mathsf {Random}_\alpha ^\mathsf{A})|, \) where \(\mathsf {Real}\) and \(\mathsf {Random}_\alpha \) are the games in Fig. 4.

Note that a \(2^{-\rho }\)-PRF can be easily constructed from a standard PRF with range \(\{0,1\}^{\rho }\) by mapping \(1^\rho \) to 1 and all other values to 0. A \(1/q\)-PRF for arbitrary q can be constructed in a similar way from a standard PRF with sufficiently large image size \(\rho \).
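One possible way to realize such biased bits is sketched below. It assumes HMAC-SHA256 as the underlying PRF with \(\rho = 256\), and it uses a threshold comparison for the \(1/q\) case; the threshold rule is our own illustrative choice for "in a similar way" and gives bias \(\lfloor 2^\rho /q \rfloor / 2^\rho \), which is close to but not exactly \(1/q\) in general.

```python
import hashlib
import hmac

RHO = 256  # output length of the underlying PRF in bits (assumption)

def prf_int(k: bytes, x: bytes) -> int:
    # Underlying PRF (assumption: HMAC-SHA256), output read as an integer.
    return int.from_bytes(hmac.new(k, x, hashlib.sha256).digest(), "big")

def rare_bit(k: bytes, x: bytes) -> int:
    # 2^{-rho}-PRF: map the all-ones string 1^rho to 1, everything else to 0.
    return 1 if prf_int(k, x) == (1 << RHO) - 1 else 0

def biased_bit(k: bytes, x: bytes, q: int) -> int:
    # Approximate 1/q-PRF: output 1 iff the PRF value falls in the lowest
    # floor(2^rho / q) values of the range.
    threshold = (1 << RHO) // q
    return 1 if prf_int(k, x) < threshold else 0
```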

3.2 Generating (Pseudo)random Coins

Our first technique is the simplest, where we observe random coins used by adversaries can be replaced with pseudorandom coins, and that this substitution will save memory in certain reductions.

Consider a security game \(\mathsf{G}\) and an adversary \(\mathsf{A}\). Both are probabilistic processes and therefore require randomness. When considering memory efficiency, the details of how random coins are stored can come to dominate memory usage. Specifically, some reductions run an adversary multiple times with the same random tape, which must be stored in between runs. One way to do this is to sample all randomness required in game \(\mathsf {G^A}\) (including the randomness used by \(\mathsf{A}\)) in advance. More formally, let \( L \le 2^\lambda \) be an upper bound on the number of executions of the instruction that fills a register with random bits in \(\mathsf{G}^\mathsf{A}\). Then the sampling of random coins can be replaced by filling and storing L registers (memory units) with random bits at the beginning of \(\mathsf {Init}\) and, in the rest of the game, replacing the ith call to the instruction with a procedure \(\mathsf {Coins}\) returning the contents of the ith register. This is formalized in game \(\mathsf {{G}^{{}}_{{0}}}\) of Fig. 5.

The game can be simulated in a memory-efficient way by replacing the random bits used by \(\mathsf{G}\) and \(\mathsf{A}\) with pseudorandom bits generated by a PRF \(\mathsf{F}:\{0,1\}^\kappa \times \{0,1\}^\delta \rightarrow \{0,1\}^\lambda \), as described in Game \(\mathsf {G}_1\) of Fig. 5. In this variant the game sets up the counter i in the usual way. Then a PRF key k is sampled from a key space \(\{0,1\}^\kappa \) and calls to \(\mathsf {Coins}\) are simulated by returning the pseudorandom bits \(\mathsf{F}(k,i)\). We now compare the two ways of executing the game in terms of success probability, running time, and memory consumption.

Fig. 5. Generating (pseudo)random coins in a memory-efficient way. By \(r_i\) we denote the \( i^{\mathrm {th}} \) block of \(\lambda \) bits of the string r.
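The two ways of serving \(\mathsf {Coins}\) can be contrasted in a short sketch. The class names are hypothetical, HMAC-SHA256 is an assumed stand-in for \(\mathsf{F}\), and \(\lambda \) is fixed to 128 bits for illustration. The first class stores L registers up front; the second keeps only a key and a counter.

```python
import hashlib
import hmac
import secrets

LAMBDA_BYTES = 16  # lambda = 128 bits per register (assumption)

class StoredCoins:
    """G_0-style coins: sample and store all L registers in advance."""
    def __init__(self, L: int):
        self.registers = [secrets.token_bytes(LAMBDA_BYTES) for _ in range(L)]
        self.i = 0

    def coins(self) -> bytes:
        r = self.registers[self.i]
        self.i += 1
        return r

class PRFCoins:
    """G_1-style coins: store only a PRF key and a counter i."""
    def __init__(self, key=None):
        self.key = key if key is not None else secrets.token_bytes(32)
        self.i = 0

    def coins(self) -> bytes:
        # The i-th block of coins is F(k, i), truncated to lambda bits.
        msg = self.i.to_bytes(8, "big")
        self.i += 1
        return hmac.new(self.key, msg, hashlib.sha256).digest()[:LAMBDA_BYTES]

    def rewind(self) -> None:
        # Replaying the same random tape only requires resetting the counter.
        self.i = 0
```

Resetting the counter reproduces the same tape, which is the property needed when an adversary is run several times on identical coins.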

Success Probability. By a simple reduction to the security of the PRF, there exists an adversary \(\mathsf{B}\) with \(\mathbf {LocalTime}(\mathsf{B})= \mathbf {LocalTime}(\mathsf{A})\), \(\mathbf {LocalMem}(\mathsf{B}) = \mathbf {LocalMem}(\mathsf{A})+1 \) such that

$$ \left| \mathbf {Succ}(\mathsf {{G}^{{A}}_{{0}}}) - \mathbf {Succ}(\mathsf {{G}^{{A}}_{{1}}})\right| \le \mathbf {Adv}(\mathsf {PRF}^\mathsf{B}) $$

(see Definition 1). Observe that \(\mathsf{B}\) perfectly simulates the \(\mathsf {Coins}\) oracle as follows. For \(\mathsf{A}\)’s \( i^{\mathrm {th}} \) query to \(\mathsf {Coins}\), it queries \(\mathsf {O_F}\) of the \(\mathsf {PRF}\) games on i and relays its response back to \(\mathsf{A}\). To do this, it needs to store a counter of \(\log L\) bits. All other procedures are simulated as specified in \(\mathsf{G}_1\).

Running Time. Game \(\mathsf {G}_1\) needs to evaluate the PRF (via algorithm \(\mathsf{F}\)) L times, hence we have \(\mathbf {TotalTime}(\mathsf {G_1^A}) \le \mathbf {TotalTime}(\mathsf {G_0^A})+L \cdot \mathbf {Time}(\mathsf{F})\).

Memory. Both games have to store a counter i of size \(\log L \le \lambda \) bits, which equals one memory unit. But while game \(\mathsf {G_0}\) needs memory for storing L strings, the memory-efficient game \(\mathsf {G_1}\) only needs additional memory \(\mathbf {Mem}(\mathsf{F})\). Note that the PRF key is included in the memory of \(\mathsf{F}\). So overall, we have

$$\begin{aligned} \mathbf {TotalMem}(\mathsf {G_0^A})&= \mathbf {LocalMem}(\mathsf{A}) + 1 + L , \\ \mathbf {TotalMem}(\mathsf {G_1^A})&= \mathbf {LocalMem}(\mathsf{A}) + 1 + \mathbf {Mem}(\mathsf{F}). \end{aligned}$$

Note that when applying this (and the following) techniques in a larger environment, special care has to be taken to keep the entire game consistent with the components changed by the technique. In particular, all intermediate reductions in a sequence of games have to be memory efficient to yield an overall memory-efficient reduction.

3.3 Random Oracles

Suppose a security game \(\mathsf{G}\) is defined in the random oracle model, that is one of the game’s procedures models a random oracle

$$\begin{aligned} H:\{0,1\}^\delta \rightarrow \{0,1\}^\lambda . \end{aligned}$$

The standard way of implementing this is via a technique called lazy sampling [9], meaning that when an adversary \(\mathsf{A}\) queries H on some value x, the game has to check if H(x) is already defined, and if not, it samples H(x) from some distribution and stores the value in a list, see \(\mathsf {{G}^{{}}_{{0}}}\) in Fig. 6. This means that in the worst case, it needs to store as many strings as the number of adversarial queries.

However, there are several settings where the random oracle can be implemented by a PRF \(\mathsf{F}:\{0,1\}^\kappa \times \{0,1\}^\delta \rightarrow \{0,1\}^\lambda \) as described in \(\mathsf {{G}^{{}}_{{1}}}\) of Fig. 6, thus making \(\mathsf{G}\) more memory-efficient. Among these settings are the non-programmable random oracle model and certain random oracles, where only values obtained or computed during the \(\mathsf {Init}\) procedure are used to program them.

Fig. 6. The Random Oracle technique to simulate \(\mathsf {RO}\) in a memory-efficient way. Here \(x_i\) denotes the \(i^{\mathrm {th}}\) query to \(\mathsf {RO}\). Note that the queries \(x_1, \ldots , x_q\) are not necessarily distinct.
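The difference between lazy sampling and the PRF-based simulation can be illustrated as follows. This is a sketch with hypothetical class names, again assuming HMAC-SHA256 as \(\mathsf{F}\) and \(\lambda = 128\); it covers only the non-programmable case.

```python
import hashlib
import hmac
import secrets

LAMBDA_BYTES = 16  # lambda = 128 bits of oracle output (assumption)

class LazyRO:
    """G_0-style oracle: lazy sampling with a table that grows per new query."""
    def __init__(self):
        self.table = {}

    def query(self, x: bytes) -> bytes:
        if x not in self.table:
            self.table[x] = secrets.token_bytes(LAMBDA_BYTES)
        return self.table[x]

class PRFBackedRO:
    """G_1-style oracle: answer with F(k, x); only the key is stored."""
    def __init__(self):
        self.key = secrets.token_bytes(32)

    def query(self, x: bytes) -> bytes:
        return hmac.new(self.key, x, hashlib.sha256).digest()[:LAMBDA_BYTES]
```

Both oracles answer repeated queries consistently, but only `LazyRO` pays memory proportional to the number of distinct queries.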

In the following paragraph we analyze how success probability, running time and memory consumption change if we apply this technique.

Success Probability. There exists an adversary \(\mathsf{B}\) with \(\mathbf {LocalTime}(\mathsf{A}) = \mathbf {LocalTime}(\mathsf{B})\) and \(\mathbf {LocalMem}(\mathsf{A}) = \mathbf {LocalMem}(\mathsf{B})\) such that

$$ \left| \mathbf {Succ}(\mathsf {{G}^{{A}}_{{0}}}) - \mathbf {Succ}(\mathsf {{G}^{{A}}_{{1}}})\right| \le \mathbf {Adv}(\mathsf {PRF}^\mathsf{B}). $$

\(\mathsf{B}\) perfectly simulates the \(\mathsf {RO}\) by relaying all of \(\mathsf{A}\)’s queries to \(\mathsf {O_F}\) of the PRF games and forwarding the responses back to \(\mathsf{A}\). All other procedures are simulated as specified in \(\mathsf{G}_1 \). When \(\mathsf{B}\) is run with respect to game \(\mathsf {Random}\) of Definition 1, it provides \(\mathsf{A}\) with a perfect simulation of \(\mathsf{G}_0\); when it is run with respect to game \(\mathsf {Real}\), it provides a perfect simulation of \(\mathsf {G_1}\).

Running Time. Let \(q_H\) be the number of random oracle queries posed by the adversary. Then game \(\mathsf {G_1}\) needs to evaluate the PRF \(q_H\) times, hence we have \(\mathbf {TotalTime}(\mathsf {G_1^A}) \le \mathbf {TotalTime}(\mathsf {G_0^A})+q_H \cdot \mathbf {Time}(\mathsf{F})\).

Memory. Game \(\mathsf {{G}^{{}}_{{0}}}\) needs to store an array H of size at least \(q_H\cdot \lambda \) bits (\(=q_H\) memory units), while the memory-efficient game only needs memory to execute the PRF via algorithm \(\mathsf{F}\). So overall, we have

$$\begin{aligned} \mathbf {TotalMem}(\mathsf {G_0^A})&\ge \mathbf {LocalMem}(\mathsf{A}) + q_H, \\ \mathbf {TotalMem}(\mathsf {G_1^A})&= \mathbf {LocalMem}(\mathsf{A}) + \mathbf {Mem}(\mathsf{F}). \end{aligned}$$

3.4 Random Oracle Index Guessing Technique

This technique is used when random oracle queries are answered in two different ways, e.g. in a reduction where challenge values, like a discrete logarithm challenge \(X=g^x\), are embedded in the programmable random oracle. Usually this is done by guessing some index \(i^*\) between 1 and \(q_H\) in the beginning, where \(q_H\) is the number of random oracle queries posed by the adversary. During the simulation, the challenge value is then embedded in the reduction’s response to the \(i^*\) th random oracle query.

To do this, the game needs to keep a list of all queries and responses. Independently of the way the game answers all the other queries except for the \(i^*\) th one, simply keeping a counter is not sufficient, since an adversary posing the same query all the time would then receive two different responses and the random oracle thus wouldn’t be well defined anymore. An example of such a game using the index guessing technique is game \(\mathsf {{G}^{{}}_{{0}}}\) of Fig. 7, where two deterministic procedures \(\mathsf{P}_0\) and \(\mathsf{P}_1\) are used to program H depending on \(i^*\).

To make games of this kind memory-efficient, one can use a \(1/q_H\)-PRF (see Definition 1) \(\mathsf{F}:\{0,1\}^\kappa \times \{0,1\}^\delta \rightarrow \{0,1\}\), associating to each value of the domain of the random oracle a bit 0 with probability \(1-1/q_H\) or 1 with probability \(1/q_H\) and then programming the random oracle accordingly as described in game \(\mathsf {{G}^{{}}_{{1}}}\) of Fig. 7. This method of using a biased bit goes back to Coron [14].

Fig. 7. The random oracle index guessing technique. By \(x_i\) we denote the \(i^{\mathrm {th}}\) query to \(\mathsf {RO}\). \(\mathsf{F}\) is a \(1/q_H\)-PRF. Note that the queries to \(\mathsf {RO}\) are not necessarily distinct.
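A minimal sketch of the biased-bit programming follows. The figure is not reproduced here, so the assignment of \(\mathsf{P}_1\) to bit 1 and \(\mathsf{P}_0\) to bit 0 is our assumption, as are the helper names and the use of HMAC-SHA256 with a threshold test as the \(1/q_H\)-PRF.

```python
import hashlib
import hmac

def biased_bit(key: bytes, x: bytes, q_h: int) -> int:
    # Approximate 1/q_H-PRF (assumption): 1 iff the HMAC value lies in the
    # lowest floor(2^256 / q_H) values of its range.
    value = int.from_bytes(hmac.new(key, x, hashlib.sha256).digest(), "big")
    return 1 if value < (1 << 256) // q_h else 0

def make_programmed_ro(key: bytes, q_h: int, p0, p1):
    # The oracle keeps no table: the bit for x is recomputed on every query,
    # so repeated queries are always answered by the same procedure.
    def ro(x: bytes):
        return p1(x) if biased_bit(key, x, q_h) == 1 else p0(x)
    return ro

# Toy deterministic procedures (hypothetical placeholders for P_0 and P_1).
def p0(x: bytes):
    return ("ordinary", hashlib.sha256(b"p0" + x).hexdigest())

def p1(x: bytes):
    return ("challenge-embedded", hashlib.sha256(b"p1" + x).hexdigest())
```

Because the bit depends only on the key and the query, an adversary repeating a query cannot observe two different responses, which is the problem a plain counter would have.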

We now compare the two games in terms of success probability, running time and memory efficiency.

Success Probability. Let \(\mathsf{A}\) be an adversary that is executed in \(\mathsf {G}_0\). We define an intermediate game \(\mathsf {G_0'}\), as depicted in Fig. 8, in which the index guessing is replaced by tossing a biased coin for each query.

Fig. 8. Intermediate game for the transition to memory-efficient index guessing.

These games are identical if \(c[x_{i^*}]=1\) and \(c[x_i]=0\) for all \(i\ne i^*\). Hence,

$$ \mathbf {Succ}(\mathsf {(G_0')^A}) \ge e^{-1}\cdot \mathbf {Succ}(\mathsf {{G}^{{A}}_{{0}}}). $$

Now it is easy to construct an adversary \(\mathsf{B}\) against \(\mathsf{F}\) with \(\mathbf {LocalTime}(\mathsf{B}) = \mathbf {LocalTime}(\mathsf{A})\) and \(\mathbf {LocalMem}(\mathsf{B}) = \mathbf {LocalMem}(\mathsf{A})\) that provides \(\mathsf{A}\) with a perfect simulation of \(\mathsf {{G}^{{}}_{{0'}}}\) when interacting with game \(\mathsf {Random}_\alpha \) of Fig. 4 or respectively with a perfect simulation of \(\mathsf {{G}^{{}}_{{1}}}\) when interacting with \(\mathsf {Real}\). Hence \(\left| \mathbf {Succ}(\mathsf {(G_0')^A})-\mathbf {Succ}(\mathsf {{G}^{{A}}_{{1}}})\right| \le \mathbf {Adv}(\mathsf {PRF}_{1/q_H}^\mathsf{B})\). So overall, we have

$$ \mathbf {Succ}(\mathsf {{G}^{{A}}_{{1}}})\ge e^{-1}\cdot \mathbf {Succ}(\mathsf {{G}^{{A}}_{{0}}}) - \mathbf {Adv}(\mathsf {PRF}_{1/q_H}^\mathsf{B}). $$
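As a hedged sanity check on the factor \(e^{-1}\) (a sketch under our own reading, not a derivation taken from the source): assume the biased bits are independent with \(\Pr [c[x]=1]=1/q_H\), that \(q_H \ge 2\), and that at most \(q_H-1\) distinct queries other than \(x_{i^*}\) must receive the bit 0. The probability that all of them do is then at least

$$\begin{aligned} \left( 1-\frac{1}{q_H}\right) ^{q_H-1}&= \exp \left( (q_H-1)\ln \left( 1-\frac{1}{q_H}\right) \right) \\&\ge \exp \left( -(q_H-1)\cdot \frac{1/q_H}{1-1/q_H}\right) = e^{-1}, \end{aligned}$$

where the inequality uses \(\ln (1-x)\ge -x/(1-x)\) for \(0\le x<1\) with \(x=1/q_H\). For instance, \(q_H=2\) gives \(1/2\) and \(q_H=10\) gives \(0.9^9\approx 0.387\), both above \(e^{-1}\approx 0.368\).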

Running Time. Game \(\mathsf {G_1}\) needs to evaluate the \(1/q_H\)-PRF \(q_H\) times, hence we have \(\mathbf {TotalTime}(\mathsf {G_1^A}) = \mathbf {TotalTime}(\mathsf {G_0^A})+q_H \cdot \mathbf {Time}(\mathsf{F})\).

Memory. The standard game needs to store an array of size at least \(q_H\cdot \lambda \) bits and the integer \(i^*\), while the memory-efficient game only needs additional memory \(\mathbf {Mem}(\mathsf{F})\). So overall, we have

$$\begin{aligned}&\mathbf {TotalMem}(\mathsf {G_0^A}) \ge \mathbf {LocalMem}(\mathsf{A}) + q_H + 1 , \\&\mathbf {TotalMem}(\mathsf {G_1^A}) = \mathbf {LocalMem}(\mathsf{A})+ \mathbf {Mem}(\mathsf{F}). \end{aligned}$$

Note that for simplicity we ignored the memory consumption and running time for procedures \(\mathsf{P}_0\) and \(\mathsf{P}_1\).

3.5 Single Rewinding Technique

This technique can be used for games containing a procedure \(\mathsf {Query}\), which can be called by an adversary \(\mathsf{A}\) up to q times on inputs \( x_1,\dots ,x_q \). When \(\mathsf{A}\) terminates, it queries \(\mathsf {Fin}\) on a value \(x^*\). Procedure \(\mathsf {Fin}\) then checks whether there exists \(i\in \{1,\dots ,q\}\) such that \(\mathsf{R}(x_i,x^*)=1\), where \(\mathsf{R}\) is an efficiently computable relation specific to the game. If so, it invokes Stop with 1. If no such i exists it invokes Stop with 0. Note that we do not specify how queries to \(\mathsf {Query}\) are answered since it is not relevant here. To be able to check whether there exists an i such that \(\mathsf{R}(x_i,x^*)=1 \), the game usually stores the values \( x_1,\dots ,x_q \) as described in \(\mathsf {{G}^{{}}_{{0}}}\) in Fig. 9.

However, it is possible to make the game memory efficient as described in \(\mathsf {{G}^{{}}_{{1}}}\) of Fig. 9. In this variant the game no longer stores all the \( x_i \)’s. Instead, it only stores the adversarial input \( x^* \) to \( \mathsf {Fin}\) and then rewinds \(\mathsf{A}\) to the start, i.e., it runs it a second time providing it with the exact same input and random coins, and responding to queries to \(\mathsf {Query}\) with the same values as in the first run. This means that from the adversary’s view, the second run is an exact replication of the first one. Whenever \(\mathsf{A}\) calls \( \mathsf {Query} \) on a value \( x_i \), the game checks whether \( \mathsf{R}(x_i,x^*)=1 \) and, if so, invokes Stop with 1. Note that it is necessary to store the random coins given to \(\mathsf{A}\) as well as random coins potentially used to answer queries to \(\mathsf {Query}\) to be able to rewind. This can be done memory-efficiently with the technique of Sect. 3.2.

Fig. 9. The single rewinding technique.
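The contrast between storing all queries and rewinding can be sketched as below. The modelling choices are assumptions for illustration: the adversary is a function of a seed (its random tape) and a `query` callback, \(\mathsf {Query}\) always answers `None`, the relation is plain equality, and the second game sets a flag where the text invokes Stop with 1 immediately.

```python
import random

def relation(x_i, x_star) -> bool:
    # Toy relation R(x_i, x*) (assumption): equality.
    return x_i == x_star

def game_g0(adversary, seed) -> int:
    # Store every query x_1, ..., x_q, then check R against x* at the end.
    xs = []
    x_star = adversary(seed, lambda x: xs.append(x))
    return int(any(relation(x, x_star) for x in xs))

def game_g1(adversary, seed) -> int:
    # First run: ignore the queries and keep only x*.
    x_star = adversary(seed, lambda x: None)
    hit = False

    def query(x):
        nonlocal hit
        if relation(x, x_star):
            hit = True  # the game in the text would stop with 1 here

    # Second run on the same seed: the adversary repeats the same queries.
    adversary(seed, query)
    return int(hit)

def toy_adversary(seed, query):
    rng = random.Random(seed)  # fixed coins give identical runs
    for _ in range(5):
        query(rng.randrange(100))
    return rng.randrange(100)
```

The second game holds one query and \(x^*\) at a time, at the price of running the adversary twice.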

Success Probability. Since after rewinding, \( \mathsf {{G}^{{}}_{{1}}}\) provides \(\mathsf{A}\) with the exact same input as in the first execution, all values \( x_i \) are the same in both executions of \(\mathsf{A}\), so

$$ \mathbf {Succ}(\mathsf {{G}^{\mathsf{A}}_{{0}}})=\mathbf {Succ}(\mathsf {{G}^{\mathsf{A}}_{{1}}}). $$

Running Time. \( \mathsf{G}_0\) runs \(\mathsf{A}\) once, while \( \mathsf{G}_1 \) runs \(\mathsf{A}\) twice. Both games invoke the relation algorithm \(\mathsf{R}\) a total number of q times, so overall we obtain

$$ \mathbf {TotalTime}(\mathsf {{G}^{\mathsf{A}}_{{1}}})\le 2\cdot \mathbf {TotalTime}(\mathsf {{G}^{\mathsf{A}}_{{0}}}). $$

Memory. \( \mathsf {{G}^{\mathsf{A}}_{{0}}} \) stores all values \( x_1,\dots ,x_q,x^* \) while \( \mathsf {{G}^{\mathsf{A}}_{{1}}} \) only stores \( x^* \) and one of the \(x_i, 1\le i \le q\) at a time. Assuming each of the values \( x_1,\dots ,x_q,x^* \) takes one memory unit, we obtain

$$\begin{aligned} \mathbf {TotalMem}(\mathsf {{G}^{\mathsf{A}}_{{0}}})&=\mathbf {LocalMem}(\mathsf{A})+\mathbf {Mem}(\mathsf{R})+q+1,\\ \mathbf {TotalMem}(\mathsf {{G}^{\mathsf{A}}_{{1}}})&=\mathbf {LocalMem}(\mathsf{A})+\mathbf {Mem}(\mathsf{R})+2. \end{aligned}$$

We remark that the single rewinding technique can be extended to a multiple-rewinding technique, in which the reduction runs the adversary m times (on the same random coins and with the same input). For example, in Theorem 4 we consider a reduction between t-multi-collision-resistance and t-collision-resistance that rewinds the adversary several times.

4 Streaming Algorithms and Memory-Efficiency

In this section we prove two lower bounds on the memory usage of black-box reductions between certain problems. The first shows that any reduction from \(\mathsf {mUFCMA}\) to \(\mathsf {UFCMA}\) must either use more memory, run the adversary many times, or obey some tradeoff between the two options. The second gives a similar result for \(\mathsf {mCR}_t\) to \(\mathsf {CR}_t\) reductions. We start by recalling results from the data-stream model of computation which will provide the principal tools for our lower bounds.

In this section we also deal with bit-memory (\(\mathbf {Mem}_2\)) which measures the number of bits used, rather than \(\mathbf {Mem}\) which measures the number of \(\lambda \)-bit words used.

4.1 The Data Stream Model

The data stream model is typically used to reason about algorithmic challenges where a very large input can only be accessed in discrete pieces in a given order, possibly over multiple passes. For instance, data from a high-rate network connection may often be too large to store and thus only accessed in sequence.

Streaming formalization. We adopt the following notation for a streaming problem: An input is a vector \(\mathbf {y}\in U^n\) of dimension n over some finite universe U. We say that the number of elements in the stream is n. An algorithm \(\mathsf{B}\) accesses \(\mathbf {y}\) via a stateful oracle \({\mathsf {O}_\mathbf {y}}\) that works as follows: On the first call it saves an initial state \(i \leftarrow 0\) and returns \(\mathbf {y}[0]\). On future calls, \({\mathsf {O}_\mathbf {y}}\) sets \(i \leftarrow (i + 1 \mod n)\), and returns \(\mathbf {y}[i]\). The oracle models accessing a stream of data, one entry at a time. When the counter i is set to 0 (either at the start or by wrapping modulo n), the algorithm \(\mathsf{B}\) is said to be initiating a pass on the data. The number of passes during a computation \(\mathsf{B}^{\mathsf {O}_\mathbf {y}}\) is thus defined as \(p=\left\lceil q/n \right\rceil \), where q is the number of queries issued by \(\mathsf{B}\) to its oracle.
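The stateful oracle \({\mathsf {O}_\mathbf {y}}\) and the pass count can be sketched directly from this description. The class and method names are hypothetical.

```python
class StreamOracle:
    """Sketch of O_y: returns y[0] on the first call, then advances
    i <- (i + 1) mod n on each later call and returns y[i]."""

    def __init__(self, y):
        self.y = list(y)
        self.n = len(self.y)
        self.i = None      # no state before the first call
        self.queries = 0   # q, the number of oracle queries so far

    def next(self):
        self.i = 0 if self.i is None else (self.i + 1) % self.n
        self.queries += 1
        return self.y[self.i]

    def passes(self) -> int:
        # p = ceil(q / n), computed with integer arithmetic.
        return (self.queries + self.n - 1) // self.n
```

With a stream of length 3, a fourth query wraps around to the first element and starts a second pass.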

A streaming lower bound. Below we will use a well-known result lower bounding the trade-off between the number of passes and memory required to determine the most frequent element in a stream. We will also use a lower bound on a related problem that can be proven by the same techniques.

For a vector \(\mathbf {y}\in U^n\), define \({F_\infty }(\mathbf {y})\) as

$$ {F_\infty }(\mathbf {y}) = \max _{s\in U} |\{i : \mathbf {y}[i] = s\}|. $$

That is, \({F_\infty }(\mathbf {y})\) is the number of appearances of the most frequent value in \(\mathbf {y}\). Our results will use the following modified version of \({F_\infty }\), denoted \({F_{\infty ,t}}\) that only checks if the most frequent value appears t times or not:

$$ {F_{\infty ,t}}(\mathbf {y}) = {\left\{ \begin{array}{ll} 1 &{}\text {if}\, {F_\infty }(\mathbf {y}) \ge t\\ 0 &{}\text {otherwise} \end{array}\right. } $$

We also define the function \(G(\mathbf {y})\) as follows. It divides its input into two equal-length halves \(\mathbf {y}= \mathbf {y}_1 \Vert \mathbf {y}_2\), each in \(U^{n/2}\). We let

$$ G(\mathbf {y}_1\Vert \mathbf {y}_2) = {\left\{ \begin{array}{ll} 1 &{}\text {if}\, \exists j\ \forall i: \mathbf {y}_2[j] \ne \mathbf {y}_1[i] \\ 0 &{}\text {otherwise} \end{array}\right. }. $$

In words, G outputs 1 whenever \(\mathbf {y}_2\) contains an entry that is not in \(\mathbf {y}_1\).
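For concreteness, the three functions can be evaluated by the straightforward reference implementations below. They use memory linear in the stream length, so they only illustrate the definitions and are not streaming algorithms; the function names are ours.

```python
from collections import Counter

def f_inf(y) -> int:
    # Number of appearances of the most frequent value in y.
    return max(Counter(y).values())

def f_inf_t(y, t: int) -> int:
    # 1 if the most frequent value appears at least t times, else 0.
    return 1 if f_inf(y) >= t else 0

def g(y) -> int:
    # Split y into two equal halves y1 || y2 and return 1 iff some entry
    # of y2 does not occur anywhere in y1.
    half = len(y) // 2
    y1, y2 = y[:half], y[half:]
    seen = set(y1)
    return 1 if any(v not in seen for v in y2) else 0
```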

Theorem 1

(Corollary of [21, 24]). Let t be a constant and \(\mathsf{B}\) be a randomized algorithm such that for all \(\mathbf {y}\in U^n\),

$$ \Pr [\mathsf{B}^{\mathsf {O}_\mathbf {y}}(|U|,n) = {F_{\infty ,t}}(\mathbf {y})] \ge c, $$

where \(1/2 < c \le 1\) is a constant. Then \(\mathbf {LocalMem}_2(\mathsf{B}) = \varOmega (\min \{n/p,|U|/p\})\), where p is the number of passes \(\mathsf{B}\) makes in the worst case. The same statement holds if \({F_{\infty ,t}}\) is replaced with G.

This theorem is actually a simple corollary of a celebrated result on the communication complexity of the disjointness problem, which has several other applications. See also the lecture notes by Roughgarden [25] that give an accessible theorem statement and discussion after Theorem 4.11 of that document.

The standard version of this theorem only states that computing \({F_\infty }\) requires the stated space, so we sketch how to obtain our easy corollary. The full proof is omitted from this version due to the page limit. The proof for \({F_\infty }\) works by showing that any p-pass streaming algorithm with local memory m can be used to construct a p-round two-party protocol to compute whether sets \(S_1,S_2\) held by the parties are disjoint. One then proves a communication lower bound on any protocol to test for disjointness.

A simple modification of this argument shows that computing G also gives such a protocol: It easily allows two parties to compute if \(S_2\setminus S_1\) is empty, which is equivalent to computing if \(\overline{S_1}\) and \(S_2\) are disjoint. Thus one can reduce disjointness to this problem by having the first party take the complement of its set.

The modification for \({F_{\infty ,t}}\) is slightly more subtle. The essential idea is that one party can copy its set \(t-1\) times when feeding it to the streaming algorithm. Then if the parties’ sets are not disjoint, we will have \({F_{\infty ,t}}\) equal to 1 and 0 otherwise. Since t is a constant this affects the lower bound by only a constant factor.
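The copying idea can be checked on small sets. This is a sketch assuming \(t \ge 2\) and that each party holds a set without repeated elements; the function names are hypothetical and the reference check for \({F_{\infty ,t}}\) is deliberately not memory-efficient.

```python
from collections import Counter

def stream_for_f_inf_t(s1, s2, t: int):
    # The first party feeds its set t - 1 times, the second party once.
    # An element in both sets then appears exactly t times, while any
    # other element appears at most t - 1 times.
    return list(s1) * (t - 1) + list(s2)

def f_inf_t(y, t: int) -> int:
    # Reference check: does the most frequent value appear at least t times?
    return 1 if max(Counter(y).values()) >= t else 0

def intersects(s1, s2, t: int) -> bool:
    # F_{inf,t} on the combined stream is 1 iff the sets are not disjoint.
    return f_inf_t(stream_for_f_inf_t(s1, s2, t), t) == 1
```

The stream grows by a factor of at most \(t-1\), which is why a constant t costs only a constant factor in the bound.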

4.2 \(\mathsf {mUFCMA}\)-to-\(\mathsf {UFCMA}\) Lower Bound

Black-box reductions for \(\mathsf {mUFCMA}\) to \(\mathsf {UFCMA}\). Let \(\mathsf{R}\) be an algorithm playing the \(\mathsf {UFCMA}\) game. Recall that \(\mathsf{R}\) receives input \( pk \) and has access to an oracle \(\mathsf {ProcSign}\), and stops the game by querying \(\mathsf {Fin}(m^*,\sigma ^*)\). Below for an adversary \(\mathsf{A}\) playing \(\mathsf {mUFCMA}\), we write \(\mathsf{R}^\mathsf{A}\) to mean that \(\mathsf{R}\) has additionally “oracle access to \(\mathsf{A}\)”, which means an oracle \(\mathsf {NxtQ}_\mathsf{A}\) that returns the “next query” of \(\mathsf{A}\) after accepting a response to the previous query from \(\mathsf{R}\). When \(\mathsf{A}\) halts (i.e. \(\mathsf {NxtQ}_\mathsf{A}\) returns a query to \(\mathsf {Fin}\)), the oracle resets itself to start again with the same random tape and input \( pk \).

Definition 2

A restricted black-box reduction from \(\mathsf {mUFCMA}\) to \(\mathsf {UFCMA}\) for signature scheme \((\mathsf {Gen}, \mathsf {Sign}, \mathsf {Ver})\) is an oracle algorithm \(\mathsf{R}\), playing \(\mathsf {UFCMA}\), that respects the following restrictions for any \(\mathsf{A}\):

  1. \(\mathsf{R}^\mathsf{A}\) starts by forwarding its initial input (consisting of the security parameter and public key) to \(\mathsf {NxtQ}_\mathsf{A}\).

  2. When the oracle \(\mathsf {NxtQ}_\mathsf{A}\) emits a query for \(\mathsf {ProcSign}(m)\), \(\mathsf{R}\) forwards m to its own signing oracle \(\mathsf {ProcSign}\) and returns the result to \(\mathsf {NxtQ}_\mathsf{A}\), possibly after some computation.

  3. When \(\mathsf {NxtQ}_\mathsf{A}\) emits a query for \(\mathsf {ProcVer}(m^*,\sigma ^*)\), \(\mathsf{R}\) performs some computation then returns an empty response to \(\mathsf {NxtQ}_\mathsf{A}\).

  4. When \(\mathsf{R}\) queries \(\mathsf {Fin}(m^*,\sigma ^*)\), the value \((m^*,\sigma ^*)\) will be amongst the values that \(\mathsf {NxtQ}_\mathsf{A}\) returned as a query to \(\mathsf {ProcVer}\).

Finally we say that \(\mathsf{R}\) is advantage-preserving if there exists an absolute constant \(1/2 < c \le 1\) such that for all adversaries \(\mathsf{A}\) and all random tapes r for \(\mathsf{A}\),

$$\begin{aligned} \mathbf {Succ}(\mathsf {UFCMA}^{\mathsf{R}^\mathsf{A}} \mid r) \ge c\cdot \mathbf {Succ}(\mathsf {mUFCMA}^\mathsf{A} \mid r), \end{aligned}$$
(2)

where \(\mathbf {Succ}(\cdot \mid r)\) is exactly \(\mathbf {Succ}(\cdot )\) conditioned on the tape of \(\mathsf{A}\) being fixed to r.

These restrictions force \(\mathsf{R}\) to behave in a combinatorial manner that is amenable to a connection to streaming lower bounds. The final condition, requiring \(\mathsf{R}\) to preserve the advantage of \(\mathsf{A}\) for all random tapes, is especially restrictive. At the end of the section we discuss directions for considering more general \(\mathsf{R}\).

Theorem 2

Let \((\mathsf {Gen},\mathsf {Sign},\mathsf {Ver})\) be any signature scheme with message length \(\delta = \lambda \). Let \(\mathsf{R}\) be a restricted black-box reduction from \(\mathsf {mUFCMA}\) to \(\mathsf {UFCMA}\) that is advantage-preserving, and let p be the number of times \(\mathsf{R}\) runs \(\mathsf{A}\). Then for any \(q = q(\lambda )\) (see Footnote 1) there exists an adversary \(\mathsf{A}^*\) making q signing queries, and using memory \(\mathbf {LocalMem}_2(\mathsf{A}^*) = O(\mathbf {LocalMem}_2(\mathsf {Ver}))\), such that \(\mathbf {LocalMem}_2(\mathsf{R}^{\mathsf{A}^*})=\)

$$ \varOmega (\min \{\frac{q}{p+1},\frac{2^\lambda }{p+1}\}) - O(\log q) - \max \{\mathbf {LocalMem}_2(\mathsf {Gen}),\mathbf {LocalMem}_2(\mathsf {Ver})\}.$$

Proof

Let \(\mathsf{R}\) be a restricted black-box reduction for \((\mathsf {Gen}, \mathsf {Sign}, \mathsf {Ver})\) that is advantage-preserving for some \(c> 1/2\). We proceed by fixing an adversary \(\mathsf{A}^*\) and using \(\mathsf{R}^{\mathsf{A}^*}\) to construct a streaming algorithm \(\mathsf{B}\), making \(p+1\) passes on its stream, such that

$$\begin{aligned} \Pr [\mathsf{B}^{\mathsf {O}_\mathbf {y}}(2^\delta ,n) = G(\mathbf {y})] \ge c \end{aligned}$$
(3)

for all n and all \(\mathbf {y}\in (\{0,1\}^\lambda )^n\). We will apply the streaming lower bound on computing G (Theorem 1) to \(\mathsf{B}\), and then relate the memory used by \(\mathsf{B}\) to that of \(\mathsf{R}^{\mathsf{A}^*}\) to obtain the theorem.

We start by fixing the adversary \(\mathsf{A}^*\). It takes as input the security parameter \(\lambda \) and public key \( pk \). Then \(\mathsf{A}^*\) selects q random messages \(m_1,\ldots ,m_q\), and queries them to \(\mathsf {ProcSign}\), and ignores the outputs. Next \(\mathsf{A}^*\) selects q more random messages \(m'_1,\ldots ,m'_q\), and for each \(m'_j\) it forges a signature \(\sigma '_j\) by brute force and queries \((m'_j,\sigma '_j)\) to \(\mathsf {ProcVer}\). After the verification queries, it halts.

We record two facts about \(\mathsf{A}^*\). Let \(\mathbf {y}\in (\{0,1\}^\lambda )^{2q}\) be the vector consisting of all of its queried messages, in order (the first q to \(\mathsf {ProcSign}\), and the second q to \(\mathsf {ProcVer}\) along with signatures). First, if \(G(\mathbf {y}) = 0\), then \(\mathbf {Succ}(\mathsf {mUFCMA}^{\mathsf{A}^*} \mid \mathbf {y}) = 0\) because \(\mathsf{A}^*\) will not issue any queries with a fresh forgery. If however \(G(\mathbf {y}) = 1\), then \(\mathbf {Succ}(\mathsf {mUFCMA}^{\mathsf{A}^*}\mid \mathbf {y}) = 1\) because \(\mathsf{A}^*\) will issue at least one fresh forgery to the verification oracle.

Algorithm \(\mathsf{B}^{{\mathsf {O}_\mathbf {y}}}\) will run \(\mathsf{R}^{\mathsf{A}^*}\), which expects input \( pk \), oracles for \(\mathsf {ProcSign}\), \(\mathsf {Fin}\) (for the \(\mathsf {UFCMA}\) game) and oracle \(\mathsf {NxtQ}_{\mathsf{A}^*}\) for an adversary. \(\mathsf{B}^{{\mathsf {O}_\mathbf {y}}}\) works as follows, on input \((2^\lambda ,n:=2q)\):

  • \(\mathsf{B}\) starts by initializing a \(\log n\)-bit counter \(i\leftarrow 0\), running \(\mathsf {Gen}\) to obtain a key pair \(( pk , sk )\), and running \(\mathsf{R}\) on input \( pk \).

  • \(\mathsf{B}\) responds to the oracle query \(\mathsf {ProcSign}(m)\) from \(\mathsf{R}\) by returning \(\mathsf {Sign}( sk ,m)\).

  • When \(\mathsf{R}\) queries \(\mathsf {NxtQ}_{\mathsf{A}^*}\), \(\mathsf{B}\) ignores the input and responds as follows:

    • If \(i< n/2\), then \(\mathsf{B}\) queries \({\mathsf {O}_\mathbf {y}}\), which returns \(\mathbf {y}_1[i]\), and has \(\mathsf {NxtQ}_{\mathsf{A}^*}\) return \(\mathsf {ProcSign}(\mathbf {y}_1[i])\) as the next query.

    • If \(i \ge n/2\), it queries \({\mathsf {O}_\mathbf {y}}\) to get \(\mathbf {y}_2[j]\) (where \(j=i-n/2\)). Then \(\mathsf{B}\) computes a valid signature \(\sigma _j\) by brute force, and increments i modulo n. It then has \(\mathsf {NxtQ}_{\mathsf{A}^*}\) return \(\mathsf {ProcVer}(\mathbf {y}_2[j],\sigma _j)\) as the next query.

  • When \(\mathsf{R}\) queries \(\mathsf {Fin}(m^*,\sigma ^*)\), \(\mathsf{B}\) performs another pass on its stream and checks if \(m^*\) appears anywhere in \(\mathbf {y}_1\). If it does, then it outputs 0 and otherwise it outputs 1.

We now verify (3). If \(G(\mathbf {y}) = 0\) then \(\mathsf{B}^{{\mathsf {O}_\mathbf {y}}}\) will output 0 with probability 1. This is because of our restrictions on \(\mathsf{R}\), which restrict it to outputting a value \(m^*\) that was queried by \(\mathsf{A}^*\) to \(\mathsf {ProcVer}\); when \(G(\mathbf {y}) = 0\), every such value also appears in \(\mathbf {y}_1\). On the other hand, if \(G(\mathbf {y})= 1\) then \(\mathsf{B}^{{\mathsf {O}_\mathbf {y}}}\) will output 1 with probability at least c. This is because \(\mathsf{A}^*\) will have success probability 1 when such a \(\mathbf {y}\) is fixed, so by (2) \(\mathsf{R}^{\mathsf{A}^*}\) has success probability at least c, and \(\mathsf{B}\) outputs 1 whenever \(\mathsf{R}\) succeeds in the simulated \(\mathsf {UFCMA}\) game.

It is clear that \(\mathsf{B}\) makes \(p+1\) passes on its stream, where p is the number of times \(\mathsf{R}^{\mathsf{A}^*}\) runs \(\mathsf{A}^*\). Applying Theorem 1 to \(\mathsf{B}\) we have

$$ \mathbf {LocalMem}_2(\mathsf{B}) = \varOmega (\min \{n/(p+1), 2^{\lambda }/(p+1)\}). $$

On the other hand, by the construction of \(\mathsf{B}\) we have that \(\mathbf {LocalMem}_2(\mathsf{B}) \)

$$ =O(\mathbf {LocalMem}_2(\mathsf{R}^{\mathsf{A}^*})) + O(\log q) + \max \{\mathbf {LocalMem}_2(\mathsf {Gen}),\mathbf {LocalMem}_2(\mathsf {Ver})\}. $$

Combining the two bounds on \(\mathbf {LocalMem}_2(\mathsf{B})\), and noting that \(q = \varTheta (n)\), gives the theorem.   \(\square \)

4.3 \(\mathsf {mCR}_t\)-to-\(\mathsf {CR}_t\) Lower Bound

Black-box reductions for \(\mathsf {mCR}_t\) to \(\mathsf {CR}_t\). Similar to the case with signatures, we formalize a class of reductions from \(\mathsf {mCR}_t\) to \(\mathsf {CR}_t\) for a hash function \(\mathsf{H}\). Let \(\mathsf{R}\) be an oracle algorithm \(\mathsf{R}^{\mathsf{A}}\) that plays the \(\mathsf {CR}_t\) game (with the only oracle being \(\mathsf {Fin}\)), and additionally has access to an oracle \(\mathsf {NxtQ}_{\mathsf{A}}\) that returns the next query of some adversary playing the game \(\mathsf {mCR}_t\). The only oracles in \(\mathsf {mCR}_t\) are \(\mathsf {ProcInput}\) and \(\mathsf {Fin}\), so \(\mathsf {NxtQ}_{\mathsf{A}}\) either returns a domain point m or halts \(\mathsf{A}\). As before, the oracle resets itself after the last query by \(\mathsf{A}\), with the same input and random tape.

Definition 3

A restricted black-box reduction from \(\mathsf {mCR}_t\) to \(\mathsf {CR}_t\) for a hash function \(\mathsf{H}\) is an oracle algorithm \(\mathsf{R}\), playing \(\mathsf {CR}_t\), that respects the following restrictions for any \(\mathsf{A}\):

  1. \(\mathsf{R}^\mathsf{A}\) starts by forwarding its initial input (consisting of the security parameter and hashing key) to \(\mathsf {NxtQ}_\mathsf{A}\).

  2. When \(\mathsf{R}\) queries \(\mathsf {Fin}(m_1,\ldots ,m_t)\), the values \(m_1,\ldots ,m_t\) will be amongst the values that \(\mathsf {NxtQ}_\mathsf{A}\) returned as a query to \(\mathsf {ProcInput}\).

Finally we say that \(\mathsf{R}\) is advantage-preserving if there exists an absolute constant \(1/2 < c \le 1\) such that for all adversaries \(\mathsf{A}\) and all random tapes r for \(\mathsf{A}\),

$$\begin{aligned} \mathbf {Succ}(\mathsf {CR}_t^{\mathsf{R}^\mathsf{A}} \mid r) \ge c\cdot \mathbf {Succ}(\mathsf {mCR}_t^\mathsf{A} \mid r), \end{aligned}$$
(4)

where \(\mathbf {Succ}(\cdot \mid r)\) is exactly \(\mathbf {Succ}(\cdot )\) conditioned on the tape of \(\mathsf{A}\) being fixed to r.

Theorem 3

Let \(\mathsf{H}\) be the function (with empty hash key) that truncates the last \(\lambda \) bits of its input. Let \(\mathsf{R}\) be a restricted black-box reduction from \(\mathsf {mCR}_t\) to \(\mathsf {CR}_t\) that is advantage-preserving and let p be the number of times \(\mathsf{R}\) runs \(\mathsf{A}\). Then for any \(q = q(\lambda ) \le 2^\lambda \) there exists an adversary \(\mathsf{A}^*\) making q queries to \(\mathsf {ProcInput}\), and using memory \(\mathbf {LocalMem}_2(\mathsf{A}^*) = O(\lambda )\), such that

$$\begin{aligned}&\mathbf {LocalMem}_2(\mathsf{R}^{\mathsf{A}^*}) = \varOmega (\min \{q/p,2^\lambda /p\}). \end{aligned}$$

Proof

We proceed similarly to the proof of Theorem 2, but we now construct a streaming algorithm \(\mathsf{B}^{{\mathsf {O}_\mathbf {y}}}\) for \({F_{\infty ,t}}\) instead of G. Let \(\mathsf{R}\) be a restricted black-box reduction for \(\mathsf{H}\) that is advantage-preserving for some \(c > 1/2\). We will fix an adversary \(\mathsf{A}^*\) and use \(\mathsf{R}^{\mathsf{A}^*}\) to construct a streaming algorithm \(\mathsf{B}\), making p passes on its stream, such that

$$\begin{aligned} \Pr [\mathsf{B}^{\mathsf {O}_\mathbf {y}}(2^\delta ,n) = {F_{\infty ,t}}(\mathbf {y})] \ge c \end{aligned}$$
(5)

for all n and all \(\mathbf {y}\in (\{0,1\}^\lambda )^n\).

The adversary \(\mathsf{A}^*\) works as follows: On input \(\lambda \) (and empty hash key), it chooses q random messages \(m_1,\ldots ,m_q\) and queries \(m_i\Vert i\) to its \(\mathsf {ProcInput}\) oracle, where i is encoded in \(\lambda \) bits. It then queries \(\mathsf {Fin}\) and halts.

Let \(\mathbf {y}\in (\{0,1\}^\lambda )^{q}\) be the vector consisting of all of the messages queried to \(\mathsf {ProcInput}\). If \({F_{\infty ,t}}(\mathbf {y}) = 0\), then \(\mathbf {Succ}(\mathsf {mCR}_t^{\mathsf{A}^*}|\mathbf {y}) = 0\) because there will be no t-collision in the queries of \(\mathsf{A}^*\). If however \({F_{\infty ,t}}(\mathbf {y}) = 1\), then \(\mathbf {Succ}(\mathsf {mCR}_t^{\mathsf{A}^*}|\mathbf {y}) = 1\) because there will be a t-collision among the queries of \(\mathsf{A}^*\), as the hash function \(\mathsf{H}\) is defined to truncate the final \(\lambda \) bits of its inputs, which consist of the counter value.
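The correspondence between t-collisions under the truncating hash and the frequency function can be checked on a toy instance. The following Python sketch assumes that \({F_{\infty ,t}}(\mathbf {y}) = 1\) exactly when some value occurs at least t times in \(\mathbf {y}\), and it models \(\lambda \)-bit strings as fixed-length byte strings. The names `truncating_hash`, `f_inf_t`, `adversary_queries`, and `LAMBDA_BYTES` are illustrative and do not appear in the paper.

```python
from collections import Counter

LAMBDA_BYTES = 2  # toy stand-in for lambda bits (assumption: byte-aligned)

def truncating_hash(x: bytes) -> bytes:
    """H drops the last lambda bits (here: LAMBDA_BYTES bytes) of its input."""
    return x[:-LAMBDA_BYTES]

def f_inf_t(y, t):
    """Assumed definition: 1 iff some value occurs at least t times in y."""
    return 1 if y and max(Counter(y).values()) >= t else 0

def adversary_queries(y):
    """Queries of A*: message y[i] followed by its index i as a counter."""
    return [m + i.to_bytes(LAMBDA_BYTES, "big") for i, m in enumerate(y)]

def has_t_collision(queries, t):
    """True iff t distinct queries share a hash value under truncating_hash."""
    counts = Counter(truncating_hash(q) for q in set(queries))
    return max(counts.values(), default=0) >= t

y = [b"ab", b"cd", b"ab", b"ef", b"ab"]
queries = adversary_queries(y)
```

Because the counter makes all queries distinct, a t-collision among the queries exists precisely when some message value repeats at least t times.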

The streaming algorithm \(\mathsf{B}^{{\mathsf {O}_\mathbf {y}}}(2^\lambda ,q)\) works as follows. It initializes a counter i to 0 and runs \(\mathsf{R}\). When \(\mathsf{R}\) requests an input from \(\mathsf {NxtQ}_{\mathsf{A}^*}\), \(\mathsf{B}^{{\mathsf {O}_\mathbf {y}}}\) queries its oracle for \(\mathbf {y}[i]\) and returns \(\mathbf {y}[i]\Vert i\) to \(\mathsf{R}\). When \(\mathsf{R}\) halts by calling \(\mathsf {Fin}(m_1,\ldots ,m_t)\), \(\mathsf{B}^{{\mathsf {O}_\mathbf {y}}}\) simply checks if the messages are all of the form \(y\Vert i\) for a fixed y and different values of i. If so, it outputs 1 and otherwise it outputs 0.

It is easy to verify that \(\mathsf{B}\) satisfies (5) and that it makes p passes on its input stream. Therefore by Theorem 1 we have

$$ \mathbf {LocalMem}_2(\mathsf{B}) = \varOmega (\min \{q/p, 2^{\lambda }/p\}). $$

By construction we also have

$$ \mathbf {LocalMem}_2(\mathsf{B}) = O(\mathbf {LocalMem}_2(\mathsf{R}^{\mathsf{A}^*})). $$

Combining these inequalities gives the theorem.    \(\square \)

Sharpness of the bounds. We observe that when one is not concerned with memory-tightness, it is trivial to reduce t-multi-collision-resistance to t-collision-resistance by simply storing all inputs to \(\mathsf {ProcInput}\) and checking for collisions. This reduction is, however, not memory-tight if the \(\mathsf {mCR}_t\) adversary uses small memory but produces a large number of domain points (i.e. q is large). Memory-tightness can be achieved via rewinding O(q) times, but this increases the runtime of the reduction.

Theorem 4

Let \(\mathsf{H}:\{0,1\}^\kappa \times \{0,1\}^\lambda \rightarrow \{0,1\}^\lambda \) be a hash function and let t be a fixed natural number. Then for all adversaries \(\mathsf{A}\) in the \(\mathsf {mCR}_t\) game with parameter \( \lambda \) making q queries to \(\mathsf {ProcInput}\) and for all natural numbers \(1\le c,p,m\le q<2^\lambda \) such that \(c\cdot p \cdot m=q\) there exists an adversary \(\mathsf{B}\) in the \(\mathsf {CR}_t\) game such that

$$\begin{aligned} \mathbf {Succ}(\mathsf {CR}_t^\mathsf{B})&\ge \frac{1}{2c}\cdot \mathbf {Succ}(\mathsf {mCR}_t^\mathsf{A}), \\ \mathbf {LocalTime}(\mathsf{B})&\le (2p+1) \cdot \mathbf {LocalTime}(\mathsf{A}) + (mp(q+1)+q)\cdot \mathbf {Time}(\mathsf{H})\\ \mathbf {LocalMem}(\mathsf{B})&= \mathbf {LocalMem}(\mathsf{A})+ \mathbf {Mem}(\mathsf{H}) +3m+t +3. \end{aligned}$$

If we choose \(c=1\) and \(m=q/p\), this theorem proves that the lower bound from Theorem 3 is sharp.

Proof

By assumption \(m = q/(cp)\). Let \(\mathsf{A}\) be an adversary in the \(\mathsf {mCR}_t\) game. For simplicity we assume that \(\mathsf{A}\) is deterministic; otherwise we can apply the PRF coin fixing technique from Sect. 3.2.

Fig. 10. Adversary \(\mathsf{B}\) in the \(\mathsf {CR}_t\) game. By \(\mathsf{A}(j)\) we denote the j-th out of q inputs of \(\mathsf{A}\) to \(\mathsf {ProcInput}\).

Consider adversary \(\mathsf{B}\) as defined in Fig. 10. First, \(\mathsf{B}\) stores the hash values of m out of the q inputs of \(\mathsf{A}\) to \(\mathsf {ProcInput}\). Note that \(\mathsf{A}\) only needs to be run once to perform these operations in line 05, as the indices \(i_1\) to \(i_m\) can be sorted. Then it rewinds \(\mathsf{A}\) to the start and checks for collisions of the stored hash values with all of the hash values of \(\mathsf{A}\)’s inputs to \(\mathsf {ProcInput}\). Assume that at least t of \(\mathsf{A}\)’s inputs have the same hash value. Then in each execution of the loop starting in line 01 \(\mathsf{B}\) succeeds in finding the colliding messages if it stored the corresponding hash value. The probability of this event is bounded from below by \( m/q=1/cp \). The loop is repeated p times with freshly sampled \( i_1,\dots ,i_m \). Thus

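$$ \Pr [\mathsf{B} \text{ finds a } t\text{-collision}] \ge 1-\left( 1-\frac{1}{cp}\right) ^{p} \ge 1-e^{-1/c} \ge \frac{1}{2c}, $$

where the probability is conditioned on at least t of \(\mathsf{A}\)’s inputs having the same hash value, and the last step uses \(1-e^{-x}\ge x/2\) for \(0\le x\le 1\). This implies \(\mathbf {Succ}(\mathsf {CR}_t^\mathsf{B}) \ge \frac{1}{2c}\cdot \mathbf {Succ}(\mathsf {mCR}_t^\mathsf{A})\). When \(\mathsf{B}\) finds a collision, it rewinds \(\mathsf{A}\) one last time to obtain the preimages of the t colliding values.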

So overall, \(\mathsf{B}\) runs \(\mathsf{A}\) at most \(2p+1\) times and the hash algorithm \(\mathsf{H}\) at most \(p(m+mq)+q\) times. It needs to store \(2m+3\) counters of size \(\log q\le \lambda \) (i.e. \( 2m+3 \) memory units), m values from \(\mathsf{H}\)’s range \(\{0,1\}^\lambda \) (i.e. m memory units) and the t elements from \(\{0,1\}^\lambda \) that collide under \(\mathsf{H}\) (i.e. t memory units), and to provide memory for \(\mathsf{A}\) and \(\mathsf{H}\).    \(\square \)
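The strategy of \(\mathsf{B}\) described in the proof can be sketched in Python as follows. This is only an illustration under stated assumptions, not a transcription of Fig. 10: the adversary is modeled as a function `run_adversary` that deterministically replays its q queries each time it is called, its queries are assumed to be distinct, and the names `find_t_collision` and `toy_adversary` are hypothetical. The sketch uses fewer hash evaluations per pass than the bound in the theorem, which only serves as an upper limit.

```python
import random

def find_t_collision(run_adversary, hash_fn, q, m, p, t, rng=random):
    """Sketch of B: run_adversary() replays A's q ProcInput queries in order."""
    for _ in range(p):
        # Pass 1: store the hash values at m randomly chosen query positions.
        positions = set(rng.sample(range(q), m))
        stored = {hash_fn(x) for j, x in enumerate(run_adversary())
                  if j in positions}
        # Pass 2 (rewind): count how many queries hit each stored hash value.
        counts = dict.fromkeys(stored, 0)
        for x in run_adversary():
            h = hash_fn(x)
            if h in counts:
                counts[h] += 1
        hit = next((h for h, c in counts.items() if c >= t), None)
        if hit is not None:
            # Final rewind: collect t preimages of the colliding hash value.
            preimages = []
            for x in run_adversary():
                if hash_fn(x) == hit:
                    preimages.append(x)
                    if len(preimages) == t:
                        return preimages
    return None

def toy_adversary():
    # Deterministic toy A: 8 distinct messages; the toy hash keeps the first byte.
    yield from [b"a0", b"b1", b"a2", b"c3", b"a4", b"d5", b"e6", b"f7"]

collision = find_t_collision(toy_adversary, lambda x: x[:1], q=8, m=8, p=1, t=3)
```

Apart from the adversary's own state, the sketch keeps only the m sampled positions, the m stored hash values with their counters, and the t preimages, matching the \(3m+t+O(1)\) accounting above. It runs the adversary at most \(2p+1\) times.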

Limitations, extensions, and open problems. Our notion of black-box reductions assumes that the reduction will only run the adversary \(\mathsf{A}\) from beginning to end, each time with the same random tape. It would be interesting to generalize the reduction to allow for partial rewinding of \(\mathsf{A}\), and also for saving “snapshots” of the state of \(\mathsf{A}\) that allow for rewinding.

Our restrictions on black-box reductions confine them to essentially work like combinatorial streaming algorithms. It seems likely that these restrictions can be greatly relaxed by using a different notion of black-box reduction and using pathological (unbounded) signature schemes and hash functions to enforce the combinatorial behavior of the reduction with high probability. We pursued our version of the results for simplicity.

5 Memory-Tight Reduction for RSA Full Domain Hash Signatures

This section gives an example of a memory-tight reduction obtained via the techniques of Sect. 3. We first recall the syntax of signature schemes and the RSA assumption. Then we show how the RSA Full Domain Hash signature scheme can be shown secure in the random oracle model using coin replacement, random oracle replacement, single rewinding, and the random oracle index guessing technique. For subtle reasons we implement all techniques using a single PRF to obtain a memory-tight reduction.

Signature schemes. A signature scheme consists of algorithms \(\mathsf {Gen}\),\(\mathsf {Sign}\),\(\mathsf {Ver}\) such that: algorithm \(\mathsf {Gen}\) generates a verification key \( pk \) and a signing key \( sk \); on input of a signing key \( sk \) and a message m algorithm \(\mathsf {Sign}\) generates a signature \(\sigma \) or the failure indicator \(\bot \); on input of a verification key \( pk \), a message m, and a candidate signature \(\sigma \), deterministic algorithm \(\mathsf {Ver}\) outputs 0 or 1 to indicate rejection and acceptance, respectively. A signature scheme is correct if for all \( sk , pk ,m\), if \(\mathsf {Sign}( sk ,m)\) outputs a signature then \(\mathsf {Ver}\) accepts it. Recall that the standard security notion of existential unforgeability against chosen message attacks is defined in Sect. 2.3 via the game of Fig. 2.

RSA assumption. Let \(\mathsf {GenRSA}_\lambda \) be an algorithm that returns \((N=pq, e, d)\), where p and q are distinct primes of bit size \( \lambda /2 \) and e, d are such that \(e=d^{-1} \bmod \varPhi (N)\).

Definition 4

(RSA Assumption). Game \(\mathsf {RSA}_\lambda \) defining the hardness of RSA relative to \(\mathsf {GenRSA}_\lambda \) is depicted in Fig. 11.

Fig. 11. The \(\mathsf {RSA}_\lambda \) game relative to algorithm \(\mathsf {GenRSA}_\lambda \).

RSA-FDH. The RSA Full Domain Hash (RSA-FDH) signature scheme [7] is defined in Fig. 12. Its security can be reduced to the RSA assumption in the random oracle model (see [8, 14]). In the usual proof, the reduction interacts with an adversary against RSA-FDH’s existential unforgeability that makes up to \( q_H \) hash queries and up to \( q_s \) signing queries. The reduction simulates the random oracle using lazy sampling and therefore has to store up to \( (q_H+q_s) \) messages, making it highly non-memory-tight. However, the proof can be made memory-efficient by using the coin replacement technique of Sect. 3.2, the random oracle technique of Sect. 3.3, the random oracle index guessing technique of Sect. 3.4, and the single rewinding technique of Sect. 3.5.

Fig. 12. The RSA-FDH signature scheme for parameter \(\lambda \).
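For readers who want a concrete picture of the scheme, here is a minimal Python sketch of RSA-FDH in its standard textbook form: a signature is \(\mathsf{H}(m)^d \bmod N\) and verification checks \(\sigma ^e = \mathsf{H}(m) \bmod N\). The tiny primes are insecure and chosen only for illustration, and `full_domain_hash` (SHA-256 reduced modulo N) is an assumed stand-in for the random oracle, not the hash function specified in the paper.

```python
import hashlib

# Toy parameters (assumption: tiny textbook primes, insecure, illustration only).
P, Q = 61, 53
N = P * Q                            # 3233
E = 17
D = pow(E, -1, (P - 1) * (Q - 1))    # modular inverse of e (Python 3.8+)

def full_domain_hash(message: bytes) -> int:
    """Stand-in for a random oracle onto Z_N: SHA-256 reduced mod N (assumption)."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N

def sign(d: int, message: bytes) -> int:
    """Signature is the e-th root of the hash value, computed with d."""
    return pow(full_domain_hash(message), d, N)

def verify(e: int, message: bytes, signature: int) -> bool:
    """Accept iff signature^e equals the hash of the message modulo N."""
    return pow(signature, e, N) == full_domain_hash(message)

sigma = sign(D, b"hello")
```

Since exponentiation with e is a permutation of \(\mathbb {Z}_N\), each message has exactly one valid signature, which is what allows the reduction to recompute signatures during rewinding.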

Theorem 5

Let \(\mathsf{F}:\{0,1\}^{\lambda }\times \{0,1\}^\lambda \rightarrow \{0,1\}^{2 \lambda +1}\) be a PRF. Then for every adversary \(\mathsf{A}\) in the \(\mathsf {UFCMA}\) game for RSA-FDH with parameter \(\lambda \) that poses \(q_H\) queries to the \(\mathsf {Hash}\) oracle, \(q_s\) queries to the \(\mathsf {ProcSign}\) oracle, and samples at most \( L \le 2^{\lambda }\) memory units of randomness, in the random oracle model there exist an adversary \(\mathsf{B}_1\) against the \(\mathsf {PRF}\) game and an adversary \(\mathsf{B}_2\) against the \(\mathsf {RSA}_\lambda \) game such that

$$ \mathbf {Succ}(\mathsf {UFCMA}^\mathsf{A})\le e\,q_s\,\mathbf {Succ}(\mathsf {RSA}_\lambda ^{\mathsf{B}_2}) + e\,q_s\,\mathbf {Adv}(\mathsf {PRF}^{\mathsf{B}_1}). $$

Further it holds that

$$\begin{aligned} \mathbf {LocalMem}(\mathsf{B}_1)&= \mathbf {LocalMem}(\mathsf{A}) + \mathbf {Mem}(\mathsf {GenRSA}_\lambda ) + 6,\\ \mathbf {LocalMem}(\mathsf{B}_2)&= \mathbf {LocalMem}(\mathsf{A}) + \mathbf {Mem}(\mathsf{F}) + 6,\\ \mathbf {LocalTime}(\mathsf{B}_1)&\approx 2\mathbf {LocalTime}(\mathsf{A}) + \mathbf {Time}(\mathsf {RSA}_\lambda ), \\ \mathbf {LocalTime}(\mathsf{B}_2)&\approx \mathbf {LocalTime}(\mathsf{A})+(q_H+q_s+L)\cdot \mathbf {Time}(\mathsf{F}). \end{aligned}$$

Note that in the proof of Theorem 5 it is necessary to apply the random coins technique and the random oracle technique in the same step. Otherwise one obtains an intermediate reduction that is not memory-tight: the reduction either has to simulate the random oracle by lazy sampling (in case the random coins technique is applied first) or, since rewinding is impossible, it has to store the messages asked to the signing oracle (if the random oracle technique is applied first).
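The difference between lazy sampling and PRF-based replacement can be illustrated with a short Python sketch. It is only a sketch under assumptions: HMAC-SHA256 stands in for the PRF \(\mathsf{F}\), and the one-byte prefixes `b"H"` and `b"C"` separate hash answers from coins, whereas the proof below splits the output range of a single PRF instead. The class names `LazyOracle` and `PRFOracle` are hypothetical.

```python
import hashlib
import hmac
import os

class LazyOracle:
    """Lazy sampling: the table grows with the number of distinct queries."""
    def __init__(self):
        self.table = {}

    def hash(self, m: bytes) -> bytes:
        if m not in self.table:
            self.table[m] = os.urandom(32)
        return self.table[m]

class PRFOracle:
    """PRF replacement: only the key is stored; answers are recomputed on demand."""
    def __init__(self, key: bytes):
        self.key = key

    def hash(self, m: bytes) -> bytes:
        # Random oracle answer for message m, derived from the key.
        return hmac.new(self.key, b"H" + m, hashlib.sha256).digest()

    def coins(self, j: int) -> bytes:
        # j-th block of the adversary's random coins, derived from the same key.
        return hmac.new(self.key, b"C" + j.to_bytes(8, "big"), hashlib.sha256).digest()

key = os.urandom(32)
first_run, second_run = PRFOracle(key), PRFOracle(key)
lazy = LazyOracle()
for msg in (b"m1", b"m2", b"m3"):
    lazy.hash(msg)
```

Because both the coins and the oracle answers are functions of one key, a rewound adversary sees exactly the same view, and the reduction never needs a table of past queries.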

Fig. 13. Games \(\mathsf {{G}^{{}}_{{0}}}\) to \(\mathsf {{G}^{{}}_{{3}}}\) for the proof of Theorem 5.

Proof

Consider the sequence of games of Fig. 13. For computations in \(\mathbb {Z}_N\) we omit writing \(\bmod \, N\) if it is clear from the context. We assume without loss of generality that any message on which \(\mathsf {ProcSign}\) or \( \mathsf {Fin}\) is queried has previously been queried to \(\mathsf {Hash}\).

Game \(\mathsf {{G}^{{}}_{{0}}}\) is the standard \(\mathsf {UFCMA}\) game as in Fig. 2 instantiated with the RSA-FDH algorithms and with the randomness for adversary \(\mathsf{A}\) provided via procedure \( \mathsf {Coins}\), so

$$\begin{aligned} \mathbf {Succ}(\mathsf {UFCMA^A}) = \mathbf {Succ}(\mathsf {{G}^{\mathsf{A}}_{{0}}}). \end{aligned}$$
(6)

In \(\mathsf {{G}^{{}}_{{1}}}\), instead of returning \(\mathsf{H}(m)\), the \(\mathsf {Hash}\) procedure returns \(\mathsf{H}(m)^e\) and the \(\mathsf {ProcSign}\) procedure computes signatures as \((\mathsf{H}(m)^e)^d=\mathsf{H}(m)\) accordingly. This doesn’t change the distribution of the hash values and the signatures, so

$$\begin{aligned} \mathbf {Succ}(\mathsf {{G}^{\mathsf{A}}_{{0}}}) = \mathbf {Succ}(\mathsf {{G}^{\mathsf{A}}_{{1}}}). \end{aligned}$$
(7)

Game \( \mathsf {{G}^{{}}_{{2}}} \) introduces a couple of aborting conditions. With probability \( 1/q_s \) abort condition \( B[m^*]=0 \) of line 17 does not occur. Furthermore, for each message \( m_i \) the probability that abort condition \( B[m_i]=1 \) of line 12 does not occur is given by \( 1-1/q_s \). Adversary \(\mathsf{A}\) makes at most \( q_s \) queries to \( \mathsf {ProcSign}\). Hence,

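$$\begin{aligned} \mathbf {Succ}(\mathsf {{G}^{\mathsf{A}}_{{2}}}) \ge \frac{1}{q_s}\left( 1-\frac{1}{q_s}\right) ^{q_s} \mathbf {Succ}(\mathsf {{G}^{\mathsf{A}}_{{1}}}) \approx \frac{1}{e\,q_s}\, \mathbf {Succ}(\mathsf {{G}^{\mathsf{A}}_{{1}}}). \end{aligned}$$
(8)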
Fig. 14. Adversary \(\mathsf{B}_1\) against the \(\mathsf {PRF}\) game for the proof of Theorem 5 in Sect. 5. \(\mathsf{B}_1\) rewinds \(\mathsf{A}\) once on the same inputs. Lines marked with (i) are only executed during the i-th invocation.

In Game \( \mathsf {{G}^{{}}_{{3}}} \) randomness is replaced by PRF \( \mathsf{F}\), whose range we split into \(\mathsf{F} = \mathsf{F}_0 \vert \vert \mathsf{F}_1 \vert \vert \mathsf{F}_2 \in \{0,1\}^\lambda \times \{0,1\}^\lambda \times \{0,1\}\). Sampling of random coins is replaced in Game \( \mathsf {{G}^{{}}_{{3}}}\) by evaluating \(\mathsf{F}_0\) on counter j, and sampling the values \( H[m_i] \) and \( B[m_i] \) is replaced by evaluating \( \mathsf{F}_1 \) and \(\mathsf{F}_2\) on \( m_i \), respectively. For simplicity we assume that \(\mathsf{F}_1\) is a pseudorandom function that outputs elements in \(\mathbb {Z}_N \approx \{0,1\}^\lambda \) and that \(\mathsf{F}_2\) is an \(\alpha \)-biased pseudorandom function with \(\alpha :=1/q_s\). (This is formally not correct but we do not want to distract from the main point of our proof, which is memory-tightness.) We proceed by constructing an adversary \(\mathsf{B}_1 \) against the \( \mathsf {PRF}\) game such that

$$\begin{aligned} \mathbf {Adv}(\mathsf {PRF}^{\mathsf{B}_1})&\ge |\mathbf {Succ}(\mathsf {{G}^{\mathsf{A}}_{{2}}})-\mathbf {Succ}(\mathsf {{G}^{\mathsf{A}}_{{3}}})|, \end{aligned}$$
(9)
$$\begin{aligned} \mathbf {LocalTime}(\mathsf{B}_1)&\approx 2\mathbf {LocalTime}(\mathsf{A}) + \mathbf {Time}(\mathsf {RSA}_\lambda ),\end{aligned}$$
(10)
$$\begin{aligned} \mathbf {LocalMem}(\mathsf{B}_1)&= \mathbf {LocalMem}(\mathsf{A}) + \mathbf {Mem}(\mathsf {GenRSA}_\lambda ) + 6 . \end{aligned}$$
(11)

The definition of \(\mathsf{B}_1\) is in Fig. 14. Adversary \(\mathsf{B}_1\) sets up the values \((N,e,d)\) using \( \mathsf {GenRSA}_\lambda \), samples a random \(x \in \mathbb {Z}_N\), sets \( y\leftarrow x^e \) and runs \(\mathsf{A}\) on input \((N,e)\). It simulates the procedures \( \mathsf {Hash}\), \( \mathsf {ProcSign}\) and \( \mathsf {Coins}\) by invoking its PRF oracle \( \mathsf {O_F}\). When \(\mathsf{A}\) calls \( \mathsf {Fin}\) on message-signature pair \( (m^*,\sigma ^*) \), adversary \(\mathsf{B}_1\) rewinds \(\mathsf{A}\) to line 03, answering all of its queries in the same way. Note that this is possible, since all replies to queries on \( \mathsf {Hash}\), \( \mathsf {ProcSign}\) and \( \mathsf {Coins}\) are derived using \( \mathsf {O_F}\). During the rewinding \(\mathsf{B}_1\) raises a flag \( \mathsf {coll} \) if \(\mathsf{A}\) queries procedure \( \mathsf {ProcSign}\) on \( m^* \). Hence the event \( \{\mathsf {coll}=1\} \) is equivalent to condition \( m^*\in M \) of line 19 of games \( \mathsf {{G}^{{}}_{{2}}} \) and \( \mathsf {{G}^{{}}_{{3}}} \). When \(\mathsf{A}\) calls \( \mathsf {Fin}\) a second time on \( (m^*,\sigma ^*) \), adversary \(\mathsf{B}_1\) stops with 0 or 1 according to the message-signature pair. If \(\mathsf{B}_1\) interacts with PRF-game \( \mathsf {Random}\), it provides \(\mathsf{A}\) with a perfect simulation of game \( \mathsf {{G}^{{}}_{{2}}} \); if it interacts with \( \mathsf {Real}\), it provides a perfect simulation of game \( \mathsf {{G}^{{}}_{{3}}} \). Hence Eq. (9) follows. We now analyze \(\mathsf{B}_1\)’s running time and memory consumption. \(\mathsf{B}_1\) runs \( \mathsf {GenRSA}_\lambda \) once and \(\mathsf{A}\) twice and performs some minor bookkeeping. It furthermore has to store the code of \(\mathsf{A}\) and \( \mathsf {GenRSA}_\lambda \) as well as, at any point in time, \(6 \lambda \) bits, which equals 6 memory units (i.e., the three integers \((N,e,y)\) of total size \(3 \lambda \), up to two messages of length \( \lambda \) each, and a counter of size \( \log _2(L) \le \lambda \)).

Fig. 15. Adversary \(\mathsf{B}_2 \) against the \(\mathsf {RSA}_\lambda \) game for the proof of Theorem 5 in Sect. 5.

We conclude the proof by giving an adversary \(\mathsf{B}_2\) against the \( \mathsf {RSA}_\lambda \) game such that

$$\begin{aligned} \mathbf {Succ}(\mathsf {RSA}_\lambda ^{\mathsf{B}_2})&\ge \mathbf {Succ}(\mathsf {{G}^{\mathsf{A}}_{{3}}}) , \end{aligned}$$
(12)
$$\begin{aligned} \mathbf {LocalTime}(\mathsf{B}_2)&\approx \mathbf {LocalTime}(\mathsf{A})+(q_H+q_s+L)\cdot \mathbf {Time}(\mathsf{F}),\end{aligned}$$
(13)
$$\begin{aligned} \mathbf {LocalMem}(\mathsf{B}_2)&= \mathbf {LocalMem}(\mathsf{A}) + \mathbf {Mem}(\mathsf{F}) + 6 . \end{aligned}$$
(14)

Then the claim of the theorem follows from Eqs. (6) to (9) and (12). The definition of \(\mathsf{B}_2\) is in Fig. 15. It queries \(\mathsf {Init}_\mathsf {RSA}\) to receive an RSA challenge \((N,e,y)\) and samples a PRF key k. Then it invokes \(\mathsf{A}\) on input \((N,e)\), providing it with a perfect simulation of the procedures \(\mathsf {Hash}\), \(\mathsf {ProcSign}\) and \(\mathsf {Coins}\). When \(\mathsf{A}\) invokes procedure \( \mathsf {Fin}\) on message-signature pair \((m^*,\sigma ^*)\), adversary \(\mathsf{B}_2\) checks whether \( \mathsf{F}_2(k,m^*)=0 \) and, if so, aborts. Note that by definition of procedure \( \mathsf {Hash}\), adversary \(\mathsf{B}_2\) not aborting implies that \( \mathsf {Hash}(m^*)= (\mathsf{F}_1(k, m^*))^ey \). Hence if \(\mathsf{B}_2\) does not abort and if the signature is valid, i.e. \( (\sigma ^*)^e=\mathsf {Hash}(m^*) \) holds, then \(\mathsf{B}_2\)’s answer \(x^*=\sigma ^*/\mathsf{F}_1(k,m^*)\) to the RSA challenge is valid. Since \( \mathsf{A} \) succeeding in game \(\mathsf {{G}^{{}}_{{3}}}\) implies both aforementioned conditions, Eq. (12) follows. We conclude the proof by analyzing \(\mathsf{B}_2\)’s running time and memory consumption. \(\mathsf{B}_2\) runs \(\mathsf{A}\) once and \(\mathsf{F}\) up to \( (q_H+q_s+L) \) times and performs some minor bookkeeping. Furthermore it has to store the code of \(\mathsf{A}\) and \(\mathsf{F}\) as well as, at any point in time, \(6 \lambda \) bits, which equals 6 additional memory units (i.e., a counter of bit-size \( \log _2(L) \le \lambda \), a PRF key of bit-size \( \kappa \le \lambda \), a message of bit-size \( \lambda \) and three integers of size \( \lambda \)).    \(\square \)
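The extraction step of \(\mathsf{B}_2\) rests on a one-line algebraic identity, which the following Python sketch checks on toy numbers. The RSA key is the small textbook example \(N=3233\), \(e=17\), \(d=2753\); the values `x_secret` and `f1_star` are arbitrary stand-ins (the latter for \(\mathsf{F}_1(k,m^*)\), assumed invertible modulo N), and the forgery is computed with d purely to test the identity, which a real adversary would of course not be able to do this way.

```python
N, E, D = 3233, 17, 2753      # toy RSA key (assumption: textbook values)
x_secret = 1234               # e-th root of the challenge, unknown to B_2
y = pow(x_secret, E, N)       # RSA challenge y = x^e mod N

def simulated_hash(f1: int, b: int) -> int:
    """Hash(m) = F1(k,m)^e * y^b mod N, where b = F2(k,m) is the biased bit."""
    return pow(f1, E, N) * pow(y, b, N) % N

def simulated_signature(f1: int) -> int:
    """For b = 0 the signature is simply F1(k,m), since F1^e is the hash."""
    return f1 % N

f1_star = 77                               # stands in for F1(k, m*)
h_star = simulated_hash(f1_star, 1)        # forgery target has b = 1
sigma_star = pow(h_star, D, N)             # valid forgery (computed with d to test)
x_star = sigma_star * pow(f1_star, -1, N) % N   # B_2's answer sigma*/F1(k, m*)
```

The check works because \((\sigma ^*)^e = \mathsf{F}_1(k,m^*)^e \cdot y\) implies \((\sigma ^*/\mathsf{F}_1(k,m^*))^e = y\).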

6 Memory-Sensitive Problems

In this section we discuss the memory sensitivity of two cryptographic problems, multi-collision-resistance and learning parities with noise. In the full version of this paper [3], we will also analyze the memory sensitivity of the discrete logarithm problem in prime fields and of the factoring problem.

To quantify the memory sensitivity of a problem \(\mathsf{P}\) we plot time/memory trade-offs as in Fig. 1. The horizontal axis is memory consumption and the vertical axis is running time, both on a log scale. A point \((x,y)\) is labeled either “solvable” or “unsolvable”, where solvable means that there exists an algorithm with memory consumption at most \(2^x\) and running time at most \(2^y\) that solves the problem. We refer to the boundary between the solvable and unsolvable regions as the transition line.

A time/memory trade-off plot of a non-memory-sensitive problem typically has an (approximately) horizontal transition line, and as discussed in Sect. 1, a non-memory-tight reduction has less impact for such a problem. The steeper the slope of the transition line, the more memory-sensitive the problem is. We refer to the introduction for an example with concrete numbers.

Fig. 16. Time/memory graphs of \(\mathsf {CR}_k\) for \(k=2\) (left) and \(k=3\) (middle) and of \(\mathrm {LPN}_{\lambda ,\tau }\) for \( \lambda =1024 \) and \( \tau =1/4 \) (right). Both \( \mathbf {Time}\) and \( \mathbf {Mem}\) are in \( \log \) scale.

k-Way Collision Resistance. The k-way collision problem \(\mathsf {CR}_k\) is to find a k-collision in a hash function with \(\lambda \) output bits. The following table provides an overview of known algorithms to solve \(\mathsf {CR}_k\) with constant success probability for \(k \in \{2,3\}\).

| Algorithm \(\mathsf{A}\) | \(\mathbf {Mem}(\mathsf {CR}_k^{\mathsf{A}})\) | \(\mathbf {Time}(\mathsf {CR}_k^{\mathsf{A}})\) |
| --- | --- | --- |
| Birthday (\(k=2\)) | \(O(1)\) | \(2^{\lambda /2}\) |
| Joux-Lucks [20] (\(k=3\)) | \(2^{\alpha \lambda }\) | \(2^{\lambda (1-\alpha )}\) (\(\alpha \le 1/3\)) |

From the table we derive the time/memory graph of \(\mathsf {CR}_k\) in Fig. 16. \(\mathsf {CR}_3\) is memory sensitive, whereas \(\mathsf {CR}_2\) is not (as it has a horizontal transition line).
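The transition lines implied by the table can be written down as two small functions. This Python sketch assumes the Joux-Lucks trade-off in the form memory \(2^{\alpha \lambda }\) and time \(2^{\lambda (1-\alpha )}\) for \(\alpha \le 1/3\), and assumes that memory beyond \(2^{\lambda /3}\) gives no further speed-up for these algorithms. The function names are illustrative.

```python
def log2_time_cr2(log2_mem: float, lam: int) -> float:
    """Birthday bound: time 2^(lam/2) regardless of memory (horizontal line)."""
    return lam / 2

def log2_time_cr3(log2_mem: float, lam: int) -> float:
    """Joux-Lucks: memory 2^(alpha*lam) gives time 2^(lam*(1-alpha)), alpha <= 1/3."""
    alpha = min(max(log2_mem, 0.0) / lam, 1 / 3)
    return lam * (1 - alpha)

lam = 120
cr3_curve = [(mem, log2_time_cr3(mem, lam)) for mem in (0, 10, 20, 30, 40, 60)]
cr2_curve = [(mem, log2_time_cr2(mem, lam)) for mem in (0, 10, 20, 30, 40, 60)]
```

In log scale the \(\mathsf {CR}_3\) line falls with slope \(-1\) until the memory reaches \(2^{\lambda /3}\), while the \(\mathsf {CR}_2\) line stays flat, which is the sense in which only \(\mathsf {CR}_3\) is memory sensitive.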

Learning Parity with Noise. Another example of a memory sensitive problem is the well-known Learning Parity with Noise (LPN) problem. Let \(\lambda \in {\mathbb {N}}\) be the dimension and \(\tau \in (0,1/2)\) be a constant that defines the error probability. The problem \(\mathsf {LPN}_{\lambda ,\tau }\) is to compute a random secret \(s \in \{0,1\}^\lambda \), given “noisy” random inner products with s, i.e. samples \((a_i,\nu _i)\) where \(a_i\) is uniformly random in \(\{0,1\}^\lambda \), and \(\nu _i = \langle {a_i,s}\rangle +e_i \bmod 2\) for an error bit \(e_i\) that equals 1 with probability \(\tau \).
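As an illustration of the sample distribution, the following Python sketch generates LPN samples in the standard formulation: the secret and the vectors \(a_i\) are bit vectors of length \(\lambda \), and each label is the inner product modulo 2 plus an error bit that is 1 with probability \(\tau \). The function name `lpn_samples` and the toy dimension 16 are assumptions made for the example.

```python
import random

def lpn_samples(secret, tau, count, rng):
    """Return `count` samples (a, <a,s> + e mod 2) with Pr[e = 1] = tau."""
    dim = len(secret)
    samples = []
    for _ in range(count):
        a = [rng.randrange(2) for _ in range(dim)]        # uniform bit vector
        e = 1 if rng.random() < tau else 0                # Bernoulli(tau) noise
        nu = (sum(ai * si for ai, si in zip(a, secret)) + e) % 2
        samples.append((a, nu))
    return samples

def inner_product(a, s):
    """Inner product of two bit vectors modulo 2."""
    return sum(ai * si for ai, si in zip(a, s)) % 2

rng = random.Random(0)
secret = [rng.randrange(2) for _ in range(16)]
noiseless = lpn_samples(secret, 0.0, 5, rng)
always_flipped = lpn_samples(secret, 1.0, 5, rng)
```

With \(\tau = 0\) the secret can be recovered by Gaussian elimination from \(\lambda \) linearly independent samples; the noise is what makes the problem hard and what the algorithms in the table below have to work around.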

Memory usage and running time of the best known algorithms for \(\mathrm {LPN}_{\lambda ,\tau }\) with constant success probability are given in the following table.

| Algorithm \(\mathsf{A}\) | \(\mathbf {LocalMem}(\mathsf {LPN}_{\lambda ,\tau }^{\mathsf{A}})\) | \(\mathbf {LocalTime}(\mathsf {LPN}_{\lambda ,\tau }^{\mathsf{A}})\) |
| --- | --- | --- |
| \(\mathsf {BKW}\) [10] | | |
| \(\mathsf {Gauss}\) [15] | \(O(1)\) | |

Figure 16 provides the corresponding time/memory graph. Note that the recent work [15] also considers a hybrid algorithm between \(\mathsf {Well}\)-\(\mathsf {Pooled}\) \(\mathsf {Gauss}\) and \(\mathsf {BKW}\), but the interval where the hybrid algorithm actually has better performance is so small that we decided to ignore it.

We note that the situation for the Learning with Errors (LWE), Shortest Integer Solution (SIS), and approximate SVP problems is similar to that of the LPN problem [2, 13, 19].