1 Introduction

A pseudorandom function (PRF) is a function \(F : \mathcal {K} \times \mathcal {D} \rightarrow \mathcal {G} \) with the following security property. For random \(k {\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}}\mathcal {K} \), the function \(F(k, \cdot )\) is computationally indistinguishable from a random function \(R(\cdot )\), given oracle access to either \(F(k, \cdot )\) or \(R(\cdot )\). PRFs are a foundational cryptographic primitive with countless applications; see, e.g., [Gol01, Bel06, BG90, GGM84, Kra10]. While PRFs can be constructed generically from one-way functions (via pseudorandom generators) [GGM86], this generic construction is rather inefficient. Therefore we seek to construct efficient PRFs from as-weak-as-possible assumptions and with a tight security proof.

Tight security. In a cryptographic security proof, we often consider an adversary \(\mathcal {A}\) against a primitive like a PRF, and describe a reduction \(\mathcal {B}\) that runs \(\mathcal {A}\) as a subroutine to break some computational problem which is assumed to be hard. Let \((t_\mathcal {A} ,\epsilon _\mathcal {A} )\) and \((t_\mathcal {B} ,\epsilon _\mathcal {B} )\) denote the running time and success probability of \(\mathcal {A}\) and \(\mathcal {B}\), respectively. Then we say that the reduction \(\mathcal {B}\) loses a factor \(\ell \), if

$$ \frac{t_\mathcal {B} }{\epsilon _\mathcal {B} } \ge \ell \cdot \frac{t_\mathcal {A} }{\epsilon _\mathcal {A} } $$

A reduction is usually considered “efficient”, if \(\ell \) is bounded by a polynomial in the security parameter. We say that a reduction is “tight”, if \(\ell \) is small. Our goal is to construct reductions \(\mathcal {B}\) such that \(\ell \) is as small as possible. Ideally, we would like \(\ell = O(1)\) to be constant, but there are many examples of cryptographic constructions and primitives where this is impossible to achieve [Cor02, KK12, HJK12, LW14, BJLS16].

State of the art. Many constructions of efficient number-theoretic PRFs, including the very general Matrix-DDH-based construction of [EHK+17] (with the well-known algebraic constructions of Naor-Reingold [NR97] and Lewko-Waters [LW09] as special cases), as well as the LWE-based PRF of Banerjee, Peikert, and Rosen [BPR12], can in retrospect be seen as concrete instantiations of the augmented cascade framework of Boneh et al. [BMR10]. For these constructions, the size of the secret key and the loss in the security proof grow linearly with the length n of the function input. Thus, efficiency and security both depend on the size of the input space. In order to extend the input space to \(\{0,1\}^*\), one can generically apply a collision-resistant hash function \(H : \{0,1\}^* \rightarrow \{0,1\}^n\), where \(n = 2\lambda \) and \(\lambda \) denotes the security parameter, to the input before processing it in the PRF. This yields secret keys consisting of \(n = O(\lambda )\) elements (where the concrete type of elements depends on the particular instantiation of the augmented cascade) and a security loss of \(\ell = n = O(\lambda )\).

Contributions. We introduce all-prefix universal hash functions (APUHFs) as a special type of hash functions that are universal, even if the output of the hash function is truncated. We also describe a very simple and efficient construction, which is based on the hash function of Dietzfelbinger et al. [DHKP97], as well as a generic construction from pairwise independent hash functions with range \(\{0,1\}^n\) for some \(n \in \mathbb {N} \).

Then we show that by combining the augmented cascade with an APUHF, we are able to significantly improve both the asymptotic size of secret keys and the security loss of these constructions. Specifically, we achieve keys consisting of only a slightly super-logarithmic number of elements \(m = \omega (\log \lambda )\) and an only logarithmic security loss \(O(\log \lambda )\). Both the number of elements in the secret key and tightness are independent of the input size n, except for the key of the APUHF, which consists of n bits when instantiated with the APUHF of Dietzfelbinger et al. [DHKP97]. Based on this generic result, we then obtain simple variants of algebraic PRFs based on a large class of Matrix-DDH assumptions [EHK+17], which include the PRFs of Naor and Reingold [NR97] and its generalization by Lewko and Waters [LW09] as special cases.

Furthermore, we obtain a simple variant of the PRF of Banerjee, Peikert and Rosen [BPR12] (BPR). This PRF is based on the learning-with-errors (LWE) assumption [Reg05], and has the property that the required size of the LWE modulus depends on the length of the PRF input. More precisely, the lower bound on the LWE modulus \(p\) is exponential in the input length \(n = \varTheta (\lambda )\). The same dependence appears in almost all well-known LWE-based PRFs, such as [BLMR13, BP14]. In order to improve efficiency and to base security on a weaker LWE assumption, it is thus desirable to make \(p\) as small as possible. We show that simply encoding the PRF input with an APUHF before processing it in the original BPR construction makes it possible to reduce the lower bound on the LWE modulus \(p\) from exponential to only slightly super-polynomial in the security parameter, which yields a weaker assumption and a significant efficiency improvement (see Sect. 5.2 for details). Furthermore, even for an arbitrary polynomially-bounded input size n, our construction requires storing only \(m = \omega (\log \lambda )\) matrices, independent of the size n of the input space \(\{0,1\}^n\), plus a single bitstring of length n when instantiated with the APUHF of Dietzfelbinger et al. [DHKP97]. In contrast, the original construction from [BPR12] requires \(\varTheta (n)\) matrices.

A similar improvement of the LWE modulus \(p\) was achieved by a different BPR variant due to Döttling and Schröder in [DS15], via a technique called on-the-fly adaptation. However, their construction requires running \(\lambda \cdot \omega (\log \lambda )\) copies of the BPR PRF in parallel, while ours requires only a single copy plus an APUHF. Thus, our approach is significantly more efficient, and also more direct, as it essentially corresponds to the original BPR function, except that an APUHF is applied to the input. This simplicity not only gives a useful conceptual perspective on the construction of tightly secure PRFs, but also makes schemes easier to implement securely.

Another advantage of our approach is that the resulting PRF construction is extremely simple. It is essentially identical to the augmented cascade from [BMR10], except that an APUHF h is applied to the input before it is processed by the PRF. More precisely, let \(\hat{F}^m\) be a PRF that is constructed from an m-fold application of an underlying function F via the augmented cascade construction from [BMR10]. Then our construction \(\hat{F}(K,x)\) has the form

$$ \hat{F}(K,x) := \hat{F}^m(s,h(x)) $$

where the key of our new function is a tuple \(K = (s,h)\) consisting of a random key s for the augmented cascade construction and a random function \(h {\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}}\mathcal {H} \) from a family \(\mathcal {H} = \{h : \{0,1\}^n \rightarrow \{0,1\}^m\}\) of APUHFs.

We remark that we require an additional property called perfect one-time security (“1-uniformity”) of the underlying function F of the augmented cascade, and thus technically our variant of [BMR10] is slightly less general. However, this is a minor restriction, as we show that this property is satisfied by all known instantiations of the augmented cascade. Furthermore, our security proof assumes that the reduction “knows” sufficiently close approximations of the number of queries Q and the advantage \(\epsilon _\mathcal {A} \) of the adversary. Thus, the proof shows how such non-black-box knowledge can be used to achieve more efficient PRFs with short keys and very tight security from weaker assumptions.

Technical approach. Technically, our argument is inspired by the construction of adaptively-secure PRFs from non-adaptively secure ones by Berman and Haitner [BH12]. Essentially, an augmented cascade PRF with m-bit input is a function \(\hat{F}^m : S^m \times K \times \{0,1\}^m \rightarrow K\) with key space \(S^m \times K\). In the sequel, let \((s_1, \ldots , s_m, k) \in S^m \times K\) be a key for \(\hat{F}^m\) and \(h : \{0,1\}^n \rightarrow \{0,1\}^m\). For a string \(a \in \{0,1\}^m\) we write \(a_{v:w}\) to denote the substring \((a_v, \ldots , a_w) \in \{0,1\}^{w-v+1}\) of a. Let j be an integer with \(j \le m\) (we will explain later how to choose j in a suitable way).

We start from the observation that, for each \(j \in \{1, \ldots , m\}\), we can implement an augmented cascade PRF \(\hat{F}^m\) equivalently as a two-step algorithm, which proceeds as follows.

  1.

    In the first step, the function \(\hat{F}^m\) processes only the first j bits \(h(x)_{1:j}\in \{0,1\}^j\) of h(x), to compute an intermediate value \(k_x\) that depends only on the first j bits of h(x):

    $$ k_x = \hat{F}^{j}((s_1,...,s_j),k,h(x)_{1:j}) $$
  2.

    Then the remaining \(m-j\) bits are processed, starting from \(k_x\), by computing

    $$ y = \hat{F}^{m-j}((s_{j+1},...,s_m),k_x,h(x)_{j+1:m}) $$

The resulting function is identical to the function \(\hat{F}^{m}\), so this is merely a specific way to implement \(\hat{F}^{m}\), which will be particularly useful to describe our approach.
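To make this two-step view concrete, the following Python sketch (not part of the original construction; the round function `toy_F` and all parameters are hypothetical) implements the cascade as a simple fold and checks that splitting the evaluation after the first j bits yields the same result for every j.

```python
# Illustrative sketch: the augmented cascade as a fold over the input bits,
# and its equivalent two-step evaluation (split after the first j bits).

def cascade(F, s, k, bits):
    """F-hat^m((s_1,...,s_m), k, x): thread the key k through F, bit by bit."""
    for s_i, x_i in zip(s, bits):
        k = F(s_i, k, x_i)
    return k

def cascade_split(F, s, k, bits, j):
    """Same function, evaluated as in the two steps described above."""
    k_x = cascade(F, s[:j], k, bits[:j])       # step 1: intermediate value k_x
    return cascade(F, s[j:], k_x, bits[j:])    # step 2: continue from k_x

# A toy round function over K = Z_q (hypothetical; q prime, s_i != 0).
q = 101
def toy_F(s_i, k, x_i):
    return (s_i * k) % q if x_i else k

if __name__ == "__main__":
    import random
    m = 8
    s = [random.randrange(1, q) for _ in range(m)]
    k = random.randrange(q)
    bits = [random.randrange(2) for _ in range(m)]
    for j in range(m + 1):
        assert cascade(toy_F, s, k, bits) == cascade_split(toy_F, s, k, bits, j)
    print("split evaluation agrees with the direct evaluation for every j")
```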

To explain how we prove security, let \(x^{(1)}, \ldots , x^{(Q)}\) denote the sequence of pairwise distinct oracle queries issued by the adversary in the PRF security experiment, and suppose for now that it holds \(h(x^{(u)})_{1:j} \ne h(x^{(v)})_{1:j}\) for \(u \ne v\). Our goal is to show that then the security of \(\hat{F}^{m}\) is implied by the security of \(\hat{F}^{j}\), which is a PRF with shorter input. Intuitively, this holds due to the following two-step argument.

  1.

    We replace \(\hat{F}^{j}\) with a random function R, which is computationally indistinguishable thanks to the security of \(\hat{F}^{j}\). Note that now the intermediate value \(k_x = R(h(x)_{1:j})\) is an independent random value for each oracle query made by the adversary, because we assume \(h(x^{(u)})_{1:j} \ne h(x^{(v)})_{1:j}\) for \(u \ne v\).

  2.

    Next we argue that \(\hat{F}^{m}\) is now also distributed exactly like a random function. We achieve this by identifying an additional property, required of \(\hat{F}^{m-j}\), that we call perfect one-time security. This property guarantees that

    $$ \Pr _{k_x{\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}}K}\left[ \hat{F}^{m-j}((s_{j+1},...,s_m),k_x,h(x)_{j+1:m}) = y \right] = \frac{1}{|K|} $$

    for all \(((s_{j+1},...,s_m), h(x)_{j+1:m}, y) \in S^{m-j} \times \{0,1\}^{m-j} \times K\). This is sufficient to show that indeed now the function

    $$ \hat{F}^{m-j}((s_{j+1},...,s_m),R(h(x)_{1:j}),h(x)_{j+1:m}) $$

    is a random function, because we have \(h(x^{(u)})_{1:j} \ne h(x^{(v)})_{1:j}\) for \(u \ne v\).

It remains to ensure that \(h(x^{(u)})_{1:j} \ne h(x^{(v)})_{1:j}\) holds for all \(u \ne v\) with “sufficiently large” probability and for some “sufficiently small” value of j. Here we use the all-prefix universal hash function, in combination with an argument which on a high level follows similar proofs from [BH12] and [DS15]. The main difference is that we use the all-prefix universality to argue that setting \(j := \left\lceil \log (2Q^2/\epsilon _\mathcal {A} ) \right\rceil = O(\log \lambda )\), where Q is the number of oracle queries made by the adversary in the PRF security experiment and \(\epsilon _\mathcal {A} \) is its advantage, is sufficient to guarantee that \( h(x^{(u)})_{1:j} \ne h(x^{(v)})_{1:j} \) holds with sufficiently large probability for all \(u \ne v\).

Note that we have \(j = O(\log \lambda )\), so that we only have to require security of a “short-input” augmented cascade \(\hat{F}^{j}\) with \(j = O(\log \lambda )\). For our algebraic instantiations based on Matrix-DDH problems, this yields tightness with a security loss of only \(O(\log \lambda )\). For our application to the LWE-based PRF of Banerjee, Peikert and Rosen [BPR12], this yields that we have to require only a weaker LWE assumption. Furthermore, since we need only that \(m \ge j\) holds for all possible values of j, and we have \(j = \left\lceil \log (2Q^2/\epsilon _\mathcal {A} ) \right\rceil = O(\log \lambda )\), it is sufficient to set \(m = \omega (\log \lambda )\) slightly super-logarithmic, which yields short secret keys and efficient evaluation for all instantiations.
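For illustration, a tiny computation of j and of the resulting collision bound, with hypothetical example values for Q and \(\epsilon _\mathcal {A} \) that are not taken from the paper:

```python
# Illustrative computation of j = ceil(log2(2*Q^2/eps)) for hypothetical values.
from math import ceil, log2

Q, eps = 2**30, 2**-20                         # example adversary: 2^30 queries, advantage 2^-20
j = ceil(log2(2 * Q**2 / eps))                 # = 81 here
union_bound = Q * (Q - 1) // 2 * 2**-(j - 1)   # sum_{i=2}^{Q} (i-1) / 2^(j-1)
print(j, union_bound, union_bound <= eps / 2)  # 81, ~4.77e-07, True
```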

Our proof technique, in particular the perfect one-time security property, can also be seen as an alternative and more direct way of proving the augmented cascade construction secure, while Boneh et al. used the somewhat more complex q-parallel security of the underlying PRF.

Why all-prefix universal hash functions? We stress that we need an all-prefix universal hash function, which works for any possible prefix length j. This is necessary to make the construction and the security proof independent of particular values Q and \(\epsilon _\mathcal {A} \) of a particular adversary, because j depends on these values via the definition \(j = \left\lceil \log (2Q^2/\epsilon _\mathcal {A} ) \right\rceil \). All-prefix universality guarantees basically that a suitable value of j exists for any efficient adversary. This is also required to achieve tightness. See Sect. 4.7 for further discussion.

More related work. There are several other works on domain extension of PRFs. The first one is due to Levin [Lev87]. It shows that larger inputs can be hashed with a universal hash function if the underlying PRF has a sufficiently large domain; otherwise the approach is vulnerable to the so-called “birthday attack”. The framework of Jain, Pietrzak, and Tentes [JPT12] works for small domains, but has a rather lossy security proof and is not very efficient, as it needs \(\mathcal {O}(\log q)\) invocations of the underlying pseudo-random generator (PRG), where q is an upper bound on the number of queries to the PRF. Additionally, as the authors already mention, it seems not to work for number-theoretic PRFs like the Naor-Reingold PRF. It was revisited by Chandran and Garg [CG14]. Berman et al. show how to circumvent the “birthday attack” using Cuckoo Hashing [BHKN13] via two invocations of the original PRF.

2 Preliminaries

Let \(\lambda \in \mathbb {N} \) denote a security parameter. All our results are in the asymptotic setting, that is, we view all expressions involving \(\lambda \) as functions in \(\lambda \). This includes the running time \(t_\mathcal {A} = t_\mathcal {A} (\lambda )\) and success probability \(\epsilon _\mathcal {A} = \epsilon _\mathcal {A} (\lambda )\) of adversaries, even though we occasionally omit \(\lambda \) in this case to simplify our notation. Similarly, all algorithms implicitly receive the security parameter \(1^\lambda \) as their first input. We say that an algorithm is efficient, if it runs in (probabilistic) polynomial time in \(\lambda \).

Notation. If A is a finite set, then we write \(a {\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}}A\) to denote the action of sampling a uniformly random element a from A. If A is a probabilistic algorithm, then \(a {\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}}A(x)\) denotes the action of running A on input x with uniform random coins and assigning the output to a. For \(v,w \in \mathbb {N} \) and \(v<w\), we write \(\llbracket v, w \rrbracket := \{v, \ldots , w\} \subset \mathbb {N} \) to denote the interval of positive integers from v to w, and set \(\llbracket w \rrbracket := \{1, \ldots , w\} \subset \mathbb {N} \). For a bit string \(a = (a_1, \ldots , a_n) \in \{0,1\}^n\) and \(v,w \in \llbracket n \rrbracket \) with \(v \le w\), we write \(a_{v:w}\) to denote the substring \((a_v, \ldots , a_w)\) of a, and \(a_i\) to denote the i-th bit of a.

2.1 Pseudorandom Functions

Let \(\mathcal {K}, \mathcal {D} \) be sets such that there is an efficient algorithm that samples uniformly random elements \(k {\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}}\mathcal {K} \). Let \(F : \mathcal {K} \times \mathcal {D} \rightarrow \mathcal {G} \) be an efficiently computable function. For an adversary \(\mathcal {A}\) define the following security experiment \(\mathsf {Exp}_{\mathcal {A},F}^{\mathsf {prf}} (\lambda )\).

  1.

    The experiment generates a random key \(k {\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}}\mathcal {K} \) and tosses a coin \(b {\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}}\{0,1\}\).

  2.

    The experiment provides adversary \(\mathcal {A} ^\mathcal {O} (1^\lambda )\) with an oracle \(\mathcal {O}\) which takes as input \(x \in \mathcal {D} \) and responds as follows.

    $$ \mathcal {O} (x) = {\left\{ \begin{array}{ll} F(k,x) &{}\text {if } b=1\\ R(x) &{}\text {if } b=0 \end{array}\right. } $$

    where \(R : \mathcal {D} \rightarrow \mathcal {G} \) is a random function. When the adversary terminates and outputs a bit \(b'\), then the experiment outputs 1 if \(b=b'\), and 0 otherwise.

Let \(x_1, \ldots , x_Q \in \mathcal {D} \) be the sequence of queries issued by \(\mathcal {A}\) throughout the security experiment. We assume that we always have \(Q\ge 1\), as otherwise the output of \(\mathcal {A}\) is independent of b. Furthermore, we assume that \(\mathcal {A}\) never issues the same query twice. More precisely, we assume \(x_u \ne x_v\) for \(u \ne v\). This is without loss of generality, since both \(F(k,\cdot )\) and \(R(\cdot )\) are deterministic functions.
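The experiment can be sketched in a few lines of Python (illustrative only; `F`, `keygen`, `adversary`, and `range_size` are hypothetical placeholders, and the dictionary lazily samples the random function R):

```python
# Illustrative sketch of the PRF experiment Exp^prf with a lazily sampled R.
import random

def prf_experiment(F, keygen, adversary, range_size):
    k = keygen()                      # k <-$- K
    b = random.randrange(2)           # secret coin b
    table = {}                        # lazy random function R : D -> G

    def oracle(x):
        if b == 1:
            return F(k, x)
        return table.setdefault(x, random.randrange(range_size))  # R(x)

    b_guess = adversary(oracle)       # A^O(1^lambda) outputs a bit b'
    return int(b_guess == b)          # experiment outputs 1 iff b' = b
```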

Definition 1

We say that adversary \(\mathcal {A}\) \((t_\mathcal {A} ,\epsilon _\mathcal {A} , Q)\)-breaks the pseudorandomness of F, if \(\mathcal {A}\) runs in time \(t_\mathcal {A} \), issues \(Q\) queries in the PRF security experiment, and

$$ \Pr \left[ \mathsf {Exp}_{\mathcal {A},F}^{\mathsf {prf}} (\lambda )=1 \right] \ge 1/2 + \epsilon _\mathcal {A} $$

2.2 (Almost-)Universal Hash Functions

Let us first recall the standard definition of universal hash functions.

Definition 2

([CW79]). A family \(\mathcal {H}\) of hash functions mapping the finite set \(\{0,1\}^n\) to the finite set \(\{0,1\}^m\) is universal, if for all \(x,x' \in \{0,1\}^n\) with \(x \ne x'\) it holds that

$$ \Pr _{h {\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}}\mathcal {H}}[h(x) = h(x')] \le 2^{-m}. $$

We will also consider almost-universal hash functions, as defined below.

Definition 3

A family \(\mathcal {H}\) of hash functions mapping the finite set \(\{0,1\}^n\) to the finite set \(\{0,1\}^m\) is almost-universal, if for all \(x,x' \in \{0,1\}^n\) with \(x \ne x'\) it holds that

$$ \Pr _{h {\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}}\mathcal {H}}[h(x) = h(x')] \le 2^{-m+1}. $$

Universal and almost-universal hash functions can be constructed efficiently and without additional complexity assumptions, see e.g. [CW79, DHKP97, IKOS08].

3 All-Prefix Universal Hash Functions

In this section, we define all-prefix almost universal hash functions and describe two constructions. The first one is based on the almost-universal hash function of Dietzfelbinger et al. [DHKP97], and yields an all-prefix almost-universal hash function. The second one is based on pairwise independent hash functions with suitable range, and yields an all-prefix universal hash function.

3.1 Definitions

Recall that for a bit string \(a = (a_1, \ldots , a_n) \in \{0,1\}^n\) and \(v,w \in \llbracket n \rrbracket \) with \(v\le w\), we write \(a_{v:w} := (a_v, \ldots , a_w)\).

Definition 4

Let \(\mathcal {H}\) be a family of hash functions mapping \(\{0,1\}^n\) to \(\{0,1\}^m\). We say that \(\mathcal {H}\) is a family of all-prefix universal hash functions, if for all \(x,x' \in \{0,1\}^n\) with \(x \ne x'\) and all \(w \in \llbracket m \rrbracket \) it holds that

$$ \Pr _{h {\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}}\mathcal {H}}[h(x)_{1:w} = h(x')_{1:w}] \le 2^{-w}. $$

Note that all-prefix universality essentially means that for all prefixes of length w the truncation of h to its first w bits \(h(x)_{1:w}\) is a universal hash function. We also define the slightly weaker notion of all-prefix almost-universality.

Definition 5

Let \(\mathcal {H}\) be a family of hash functions mapping \(\{0,1\}^n\) to \(\{0,1\}^m\). We say that \(\mathcal {H}\) is a family of all-prefix almost-universal hash functions (APUHFs), if for all \(x,x' \in \{0,1\}^n\) with \(x \ne x'\) and all \(w \in \llbracket m \rrbracket \) it holds that

$$ \Pr _{h {\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}}\mathcal {H}}[h(x)_{1:w} = h(x')_{1:w}] \le 2^{-w+1}. $$

3.2 First Construction (Almost-Universal)

We construct a simple and efficient APUHF family based on the almost-universal hash function of Dietzfelbinger et al. [DHKP97], which is defined as follows. Let \(m,n \in \mathbb {N} \) with \(m \le n\). Let

$$\begin{aligned} \mathcal {H}_{n,m}:=\{h_a : a \in \llbracket 2^n-1 \rrbracket \text { and } a \text { is odd} \} \end{aligned}$$
(1)

be the family of hash functions, where \(h_a\) for \(x \in \mathbb {Z} _{2^n}\) is defined as

$$\begin{aligned} h_a(x) := \left( a \cdot x \bmod 2^n \right) \text { div } 2^{n-m} \end{aligned}$$
(2)
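A minimal Python sketch of this family, together with a small empirical check of the prefix-collision probability (the parameters, inputs, and the check are illustrative choices, not part of the construction):

```python
# Illustrative sketch of the multiply-shift family H_{n,m} of [DHKP97]:
# h_a(x) = (a*x mod 2^n) div 2^(n-m); the first w output bits are exactly
# the corresponding member of H_{n,w}.
import random

def sample_key(n):
    return random.randrange(1, 2**n, 2)       # random odd a in [[2^n - 1]]

def h(a, x, n, m):
    return ((a * x) % 2**n) >> (n - m)        # top m bits of a*x mod 2^n

def h_prefix(a, x, n, m, w):
    return h(a, x, n, m) >> (m - w)           # h_a(x)_{1:w}, i.e. h_a in H_{n,w}

if __name__ == "__main__":
    n, m, w = 32, 16, 8
    x, y = 0x1234ABCD, 0x1234ABCE             # two distinct inputs
    trials, coll = 20000, 0
    for _ in range(trials):
        a = sample_key(n)
        coll += h_prefix(a, x, n, m, w) == h_prefix(a, y, n, m, w)
    print(f"empirical prefix-collision rate: {coll/trials:.5f} "
          f"(bound 2^(-w+1) = {2**(-w+1):.5f})")
```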

Before we prove that this function is all-prefix almost-universal, we first state the following lemma of Dietzfelbinger et al. [DHKP97].

Lemma 1

([DHKP97]). Let n and m be positive integers with \(m \in \llbracket n \rrbracket \). If \(x,y \in \mathbb {Z} _{2^n}\) are distinct and \(h_a \in \mathcal {H}_{n,m}\) is chosen at random, then

$$\begin{aligned} \Pr [ h_a(x) = h_a(y)] \le 2^{-m+1} \end{aligned}$$

Thus, \(\mathcal {H}_{n,m}\) is a family of almost-universal hash functions in the sense of Definition 3.

All-prefix almost-universality of \(\mathcal {H}_{n,m}\) . Now we prove that the hash function family \(\mathcal {H}_{n,m}\) of Dietzfelbinger et al. [DHKP97] is not only almost-universal, but also satisfies the stronger property of all-prefix almost-universality.

Theorem 1

\(\mathcal {H}_{n,m}\) is a family of all-prefix almost-universal hash functions in the sense of Definition 5.

Proof

Let \(w, m, n\) be any positive integers with \(w \le m \le n\). Note that if \(h_a(\cdot )\) is a function in \(\mathcal {H}_{n,m}\), then its truncation \(h_a(\cdot )_{1:w}\) to the first w output bits is exactly the function \(h_a\) in \(\mathcal {H}_{n,w}\). Further note that Lemma 1 holds for all \(w \in \llbracket n \rrbracket \), which proves the claim.   \(\square \)

In the sequel, we will sometimes write h instead of \(h_a\), when it is clear from the context that h is chosen uniformly at random from \(\mathcal {H}_{n,m}\).

3.3 Second Construction (Universal)

While the almost-universal construction from Sect. 3.2 is already sufficient for all our applications, it is natural to ask whether all-prefix universal (rather than only almost-universal) hash functions can be constructed as well. We will show that every pairwise-independent family of hash functions with range \(\{0,1\}^n\) is also a family of all-prefix universal hash functions. To this end, let us first recall the notion of pairwise independent hash functions.

Definition 6

Let \(\mathcal {H}\) be a family of hash functions with domain \(\{0,1\}^n\) and range \(\{0,1\}^m\). We say that \(\mathcal {H}\) is pairwise independent, if for all \(x,x' \in \{0,1\}^n\) with \(x \ne x'\) and all \(y,z \in \{0,1\}^m\) it holds that

$$ \Pr _{h {\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}}\mathcal {H}}[h(x) =y \wedge h(x')=z] = 2^{-2m}. $$

We first show that pairwise independence implies all-prefix pairwise independence, which is defined below. Then we show that this implies all-prefix universality.

Let us write \(x_i\) to denote the i-th bit of the bit string x.

Definition 7

Let \(\mathcal {H}\) be a family of hash functions mapping \(\{0,1\}^n\) to \(\{0,1\}^m\). We say that \(\mathcal {H}\) is all-prefix pairwise independent, if for all \(x,x' \in \{0,1\}^n\) with \(x \ne x'\) and all \(y,z \in \{0,1\}^m\) it holds that

$$ \Pr _{h {\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}}\mathcal {H}}[h(x)_{1:w} =y_{1:w} \wedge h(x')_{1:w}=z_{1:w}] = 2^{-2w} $$

for all \(w \in \llbracket m \rrbracket \).

Lemma 2

If \(\mathcal {H}\) is pairwise independent, then it is also all-prefix pairwise independent.

Proof

We have

$$\begin{aligned}&\Pr _{h {\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}}\mathcal {H}}[h(x)_{1:j} =y_{1:j} \wedge h(x')_{1:j}=z_{1:j}] \\ =&\Pr _{h {\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}}\mathcal {H}} \left[ \left( \bigcup _{y' \in \{0,1\}^{m-j}} h(x) =(y_{1:j} \parallel y' ) \right) \wedge \left( \bigcup _{z' \in \{0,1\}^{m-j}} h(x')=(z_{1:j} \parallel z') \right) \right] \\ =&\sum _{y' \in \{0,1\}^{m-j}} \sum _{z' \in \{0,1\}^{m-j}} \Pr _{h {\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}}\mathcal {H}} \left[ h(x) =(y_{1:j} \parallel y') \wedge h(x')=(z_{1:j} \parallel z') \right] \\ =&\sum _{y' \in \{0,1\}^{m-j}} \sum _{z' \in \{0,1\}^{m-j}}\ \frac{1}{2^{2m}} = \frac{2^{m-j} \cdot 2^{m-j} }{2^{2m}} = \frac{1}{2^{2j}}. \end{aligned}$$

   \(\square \)

Now it remains to show that all-prefix pairwise independence implies all-prefix universality.

Lemma 3

If \(\mathcal {H}\) is all-prefix pairwise independent, then it is also all-prefix universal.

Proof

It holds that

$$\begin{aligned} \Pr _{h {\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}}\mathcal {H}}[h(x)_{1:j} =h(x')_{1:j}] =&\sum _{y_{1:j} \in \{0,1\}^{j}} \Pr _{h {\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}}\mathcal {H}}[h(x)_{1:j} =y_{1:j} \wedge h(x')_{1:j}=y_{1:j}] \\ =&\sum _{y_{1:j} \in \{0,1\}^{j}} \frac{1}{2^{2j}} = \frac{1}{2^{j}}, \nonumber \end{aligned}$$
(3)

where (3) holds because of Lemma 2.   \(\square \)

Example instantiation. Let \(n \in \mathbb {N}\) and let

$$\begin{aligned} \mathcal {H}_{n}:= \{h_{a,b} : a,b \in \{0,1\}^n\} \end{aligned}$$

be the family of hash functions

$$\begin{aligned} h_{a,b}: GF(2^n) \rightarrow GF(2^n); x \mapsto ax+b, \end{aligned}$$

where the arithmetic operations are in \(GF(2^n)\). Since it is well-known that \(\mathcal {H}_{n}\) is pairwise independent we leave the following theorem without proof.

Theorem 2

\(\mathcal {H}_{n}\) is a family of all-prefix universal hash functions.

Note that in the explicit construction of \(GF(2^n)\) the choice of the irreducible polynomial has a significant impact on the efficiency of the arithmetic operations.
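For concreteness, a small Python sketch of this family for \(n = 8\); the irreducible polynomial \(x^8 + x^4 + x^3 + x + 1\), the inputs, and the empirical prefix check are illustrative choices only.

```python
# Illustrative sketch of h_{a,b}(x) = a*x + b over GF(2^8), with the
# irreducible polynomial x^8 + x^4 + x^3 + x + 1 (an arbitrary valid choice).
import random

N = 8
IRRED = 0x11B                               # x^8 + x^4 + x^3 + x + 1

def gf_mul(a, b):
    """Carry-less multiplication of a and b, reduced modulo IRRED."""
    res = 0
    for i in range(N):
        if (b >> i) & 1:
            res ^= a << i
    for i in range(2 * N - 2, N - 1, -1):   # reduce all terms of degree >= N
        if (res >> i) & 1:
            res ^= IRRED << (i - N)
    return res

def h(a, b, x):
    return gf_mul(a, x) ^ b                 # a*x + b in GF(2^N)

if __name__ == "__main__":
    x, y, w = 0x53, 0xCA, 3                 # two distinct inputs, prefix length w
    trials, coll = 50000, 0
    for _ in range(trials):
        a, b = random.randrange(2**N), random.randrange(2**N)
        coll += (h(a, b, x) >> (N - w)) == (h(a, b, y) >> (N - w))
    print(f"prefix-{w} collision rate: {coll/trials:.4f} (bound 2^-w = {2**-w:.4f})")
```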

4 Augmented Cascade PRFs with Tighter Security

In this section, we show that APUHFs enable the instantiation of augmented cascade PRFs [BMR10] with shorter keys of slightly super-logarithmic size \(\omega (\log \lambda )\). The security proof loses only a factor \(O(\log \lambda )\), independent of the input size of the PRF, assuming that (reasonably close) bounds on the number of queries \(Q\) and the success probability \(1/2+\epsilon _\mathcal {A} \) of the PRF adversary \(\mathcal {A}\) are known a priori. In contrast, the loss of the previous security proof of [BMR10] is linear in the input size of the PRF (which is usually linear in \(\lambda \)), but does not assume any a priori knowledge about \(\mathcal {A}\).

Fig. 1. Definition of function \(\hat{F}^m\) of Boneh et al. [BMR10].

4.1 Augmented Cascade PRFs

Boneh et al. [BMR10] showed how to construct a PRF

$$ \hat{F}^{m}: (S^m \times K) \times X^m \rightarrow K $$

with key space \((S^m \times K)\) and input space \(X^m\) from an augmented cascade of functions

$$\begin{aligned} F: (S \times K ) \times X \rightarrow K \end{aligned}$$

The augmented cascade construction is described in Fig. 1. Boneh et al. [BMR10] prove that \(\hat{F}^m\) is a secure PRF, if F is parallel secure in the following sense.

Definition 8

([BMR10]). For a function \(F: (S \times K ) \times X \rightarrow K\) define \(F^{(Q)}\) as the function

$$\begin{aligned} F^{(Q)}:(S \times K^Q) \times (X \times \llbracket Q \rrbracket ) \rightarrow K \qquad \left( (s,k_1,...,k_Q) ,(x,i) \right) \mapsto F\left( (s,k_i),x \right) . \end{aligned}$$

We say that \(\mathcal {A}\) \((t_\mathcal {A} ,\epsilon _\mathcal {A} ,Q)\)-breaks the \(Q\) -parallel security of \(F: (S \times K ) \times X \rightarrow K\), if it \((t_\mathcal {A} ,\epsilon _\mathcal {A} ,Q)\)-breaks the pseudorandomness of \(F^{(Q)}\) in the sense of Definition 1.
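In code, \(F^{(Q)}\) is nothing more than an index-selection wrapper around a single call to F (an illustrative sketch):

```python
# Illustrative sketch of F^(Q): one secret s, Q independent keys k_1,...,k_Q;
# the input (x, i) selects which key the single F-call uses.
def parallel_F(F, s, keys, x, i):
    """F^(Q)((s, k_1, ..., k_Q), (x, i)) = F((s, k_i), x), for 1 <= i <= Q."""
    return F(s, keys[i - 1], x)
```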

Theorem 3

([BMR10]). From each adversary \(\mathcal {A}\) that \((t_\mathcal {A} ,\epsilon _\mathcal {A} ,Q)\)-breaks the pseudorandomness of \(\hat{F}^{m}\), one can construct an adversary \(\mathcal {B}\) that \((t_\mathcal {B} ,\epsilon _\mathcal {B} ,Q)\)-breaks the \(Q\)-parallel security of F with

$$\begin{aligned} t_\mathcal {B} = \varTheta (t_\mathcal {A} ) \qquad \text {and}\qquad \epsilon _\mathcal {B} \ge \frac{\epsilon _\mathcal {A} }{m} \end{aligned}$$

Note that the security loss of this construction is linear in the length m of the input of function \(\hat{F}^{m}\).

4.2 The Augmented Cascade with Encoded Input

We consider augmented cascade PRFs which are almost identical to the construction of Boneh et al. [BMR10], except that we apply an all-prefix almost-universal hash function to the input before processing it in the augmented cascade, and show that this enables a tighter security proof. We consider the special case with input space \(X = \{0,1\}\), which encompasses the MDDH-based construction of Escala et al. [EHK+17] and thus includes in particular both the instantiations of Naor-Reingold [NR97] and Lewko-Waters [LW09].

Let \(\mathcal {H}_{n,m}\) be a family of all-prefix almost-universal hash functions according to Definition 5, and let \(F: (S \times K ) \times \{0,1\} \rightarrow K\) be a function. We define the corresponding augmented cascade PRF with \(\mathcal {H}_{n,m}\)-encoded input as the function

$$\begin{aligned} \hat{F}^{\mathcal {H}_{n,m}}: S^m \times K \times \mathcal {H}_{n,m} \times \{0,1\}^n \rightarrow K \nonumber \\ ((s_1,...,s_m),k,h,x) \mapsto \hat{F}^{m}((s_1,...,s_m),k,h(x)) \end{aligned}$$
(4)

where \(\hat{F}^{m}\) is the augmented cascade construction of Boneh et al. [BMR10], applied to F as described in Fig. 1.

Remark 1

Note that evaluating the PRF requires only m recursions in the augmented cascade, and that, accordingly, the secret key consists of only m elements and the description of h, while the input can be any bit string of polynomial length n, with possibly \(n \gg m\). We will later show that it suffices to set \(m = \omega (\log \lambda )\) slightly super-logarithmic, thanks to the input encoding with an all-prefix almost-universal hash function. Moreover, the security loss of this construction is only \(O(\log \lambda )\), independent of the input length n.
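A compact sketch of the construction (4), combining a generic round function F with the multiply-shift APUHF of Sect. 3.2 (the round function and all parameters are hypothetical placeholders):

```python
# Illustrative sketch of F-hat^{H_{n,m}}: hash the n-bit input down to m bits
# with an APUHF, then run the m-step augmented cascade on the hashed bits.

def apuhf(a, x, n, m):
    return ((a * x) % 2**n) >> (n - m)         # multiply-shift hash of [DHKP97]

def encoded_cascade(F, key, x, n, m):
    s, k, a = key                              # s = (s_1,...,s_m), APUHF key a
    hx = apuhf(a, x, n, m)
    for i in range(m):
        bit = (hx >> (m - 1 - i)) & 1          # i-th bit of h(x), MSB first
        k = F(s[i], k, bit)
    return k
```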

4.3 Preparation for the Security Proof

In this section we describe a few technical observations which will simplify the security proof. Furthermore, we define perfect one-time security as an additional property of a function F(s,k,x), which will also be required for the proof. We will argue later that the Matrix-DDH-based instantiations of the augmented cascade of [EHK+17], including the functions of Naor-Reingold [NR97] and Lewko-Waters [LW09], all satisfy this additional notion. Moreover, we will show that the LWE-based PRF of [BPR12] can be viewed as an augmented cascade and that it is perfectly one-time secure.

An observation about the augmented cascade. The following observation will be useful to follow the security proof more easily. Suppose we want to compute

$$ z = \hat{F}^{m}((s_1,...,s_m),k,h(x)) $$

then, due to the recursive definition of \(\hat{F}^{m}\), we can equivalently proceed in the following two steps.

  1.

    Let \(i\in \llbracket m \rrbracket \). We first process the first i bits \(h(x)_{1:i}\) of h(x) with \((s_1, \ldots , s_i,k)\), and compute an “intermediate key” \(k_x\) as

    $$ k_x := \hat{F}^{i}((s_1, \ldots , s_i),k,h(x)_{1:i}) $$
  2.

    Then we process the remaining \(m-i\) bits \(h(x)_{i+1:m}\) of h(x) with the remaining key elements \((s_{i+1}, \ldots , s_m,k_x)\) by computing

    $$ z = \hat{F}^{m-i}((s_{i+1},...,s_m),k_x,h(x)_{i+1:m}) $$

We formulate this observation as a lemma.

Lemma 4

For all \(i \in \llbracket m \rrbracket \), we have

$$\begin{aligned} \hat{F}^{m}((s_1,...,s_m),k,h(x))&= \hat{F}^{m-i}((s_{i+1},...,s_m),k_x,h(x)_{i+1:m}) \end{aligned}$$

where \(k_x := \hat{F}^{i}((s_1, \ldots , s_i),k,h(x)_{1:i})\).

Perfect One-Time Security. We will furthermore require an additional security property of F, which we call perfect one-time security, and show that this property is satisfied by all instantiations of function F considered in this section. We demand that F(s,k,x) is identically distributed to a random function R(x), if it is only evaluated once. This must hold over the uniformly random choice \(k {\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}}K\), and for any \(s \in S\) and \(x \in \{0,1\}\).

Definition 9

We say that a function \(F: S \times K \times \{0,1\}^m \rightarrow K\) is perfectly one-time secure, if

$$ \Pr _{k {\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}}K}\left[ F(s,k,x) = k' \right] = \frac{1}{|K|} $$

for all \((s,x,k') \in S \times \{0,1\}^m \times K\).

Perfect one-time security basically guarantees uniformity of the function's output if it is evaluated only once (“1-uniformity”).

The following lemma follows directly from Definition 9. It will be useful to prove security of our variant of the augmented cascade.

Lemma 5

Let \(m \in \mathbb {N}\) and \(F: S \times K \times \{0,1\}\rightarrow K\) be perfectly one-time secure. Then the augmented cascade \(\hat{F}^m\) constructed from F is also perfectly one-time secure. That is

$$ \Pr _{k {\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}}K}\left[ \hat{F}^m((s_1,...,s_m),k,x) = k' \right] = \frac{1}{|K|} $$

for all \(((s_1,...,s_m),k',x) \in S^m \times K \times \{0,1\}^m\).

Proof

For a uniformly random k it holds that \(\Pr \left[ F(s_1,k,x_1) = k_1 \right] = \frac{1}{|K|}\) for all \((s_1,k_1,x_1) \in S \times K \times \{0,1\}\), because of the perfect one-time security of F. Thus the key input to the second iteration is again uniformly distributed. Due to the recursive construction, executing all the following iterations preserves this distribution, which gives us the perfect one-time security of \(\hat{F}^m\).   \(\square \)
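The following toy check (with a hypothetical round function over \(K = \mathbb {Z} _7\), not one of the paper's instantiations) illustrates Lemma 5 by exhaustively verifying that the cascade hits every key value \(k'\) equally often when k is uniform.

```python
# Illustrative exhaustive check: for every fixed (s_1,...,s_m), x and k',
# Pr_k[ F-hat^m((s_1,...,s_m), k, x) = k' ] = 1/|K| for the toy F below.
from itertools import product

q = 7                                      # toy key space K = Z_7
def toy_F(s_i, k, x_i):
    return (s_i * k) % q if x_i else k     # perfectly one-time secure for s_i != 0

def cascade(s, k, bits):
    for s_i, x_i in zip(s, bits):
        k = toy_F(s_i, k, x_i)
    return k

m = 3
for s in product(range(1, q), repeat=m):   # all s with s_i != 0
    for bits in product(range(2), repeat=m):
        counts = [0] * q
        for k in range(q):                 # k ranges uniformly over K
            counts[cascade(s, k, bits)] += 1
        assert all(c == 1 for c in counts) # every k' is hit exactly once
print("toy cascade is perfectly one-time secure")
```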

4.4 Security Proof

Now we are ready to prove the following theorem.

Theorem 4

Let \(m = \omega (\log \lambda )\) be (slightly) super-logarithmic, \(\mathcal {H}_{n,m}\) be a family of all-prefix almost universal hash functions and F be perfectly one-time secure.

From each adversary \(\mathcal {A}\) that \((t_\mathcal {A} ,\epsilon _\mathcal {A} ,Q)\)-breaks the pseudorandomness of \(\hat{F}^{\mathcal {H}_{n,m}}\) with \(Q/\epsilon _\mathcal {A} = \mathsf {poly}(\lambda )\) for some polynomial \(\mathsf {poly}\), we can construct an adversary \(\mathcal {B}\) that \((t_\mathcal {B} ,\epsilon _\mathcal {B} ,Q)\)-breaks the pseudorandomness of \(\hat{F}^{j}\), where

$$ j = O(\log \lambda ) \qquad \text {and}\qquad t_\mathcal {B} = \varTheta (t_\mathcal {A} ) \qquad \text {and}\qquad \epsilon _{\mathcal {B}} \ge \epsilon _\mathcal {A} /2 $$

Proof

In the sequel let \(j = j(\lambda )\) be defined such that

$$\begin{aligned} j := \left\lceil \log ( 2Q^2 / \epsilon _\mathcal {A} ) \right\rceil \end{aligned}$$
(5)

Observe that we have \(j(\lambda ) \le m(\lambda )\) for sufficiently large \(\lambda \), because the fact that we have \(Q/\epsilon _\mathcal {A} = \mathsf {poly}(\lambda )\) for some polynomial \(\mathsf {poly}\) and \(j < \log ( 2Q^2 / \epsilon _\mathcal {A} )+1\) together yield that \(j = O(\log \lambda )\), while we have \(m=\omega (\log \lambda )\).

Remark 2

Note that although we have \(j = O(\log ( 2Q^2 / \epsilon _\mathcal {A} ) ) = O(\log \lambda )\), the constant hidden in the big-O notation depends on the adversary.

We describe a sequence of games, where Game 0 is the original PRF security experiment, and in the last game the probability that the experiment outputs 1 is 1/2, such that no adversary can have any advantage. Let \(X_i\) denote the event that the experiment outputs 1 in Game i, and let \(\mathcal {O} _i\) denote the oracle provided by the experiment in Game i.

Game 0. This is the original security experiment. In particular, we have

$$\begin{aligned} \mathcal {O} _{0}(x) = {\left\{ \begin{array}{ll} \hat{F}^{\mathcal {H}_{n,m}}((s_1,...,s_{m}),k,h,x) &{}\text {if } b=1\\ R(x) &{}\text {if } b=0 \end{array}\right. } \end{aligned}$$

where R is a random function. Therefore, by definition, it holds that

$$ \Pr \left[ X_{0} \right] = 1/2 + \epsilon _\mathcal {A} $$

Game 1. We change the way the oracle implements the function \(\hat{F}^{\mathcal {H}_{n,m}}\). That is, we modify the behaviour of \(\mathcal {O} _{1}\) in case \(b=1\), while in case \(b=0\) oracle \(\mathcal {O} _{1}\) proceeds identically to \(\mathcal {O} _{0}\). Recall that

$$ \hat{F}^{\mathcal {H}_{n,m}}((s_1,...,s_{m}),k,h,x) = \hat{F}^{m} \left( (s_1,...,s_{m}),k,h(x)\right) $$

\(\mathcal {O} _{1}\) implements this function in a specific way. Using the observation from Lemma 4, it computes \(\hat{F}^{m} \left( (s_1,...,s_{m}),k,h(x)\right) \) in two steps:

  1.

    \(k_x := \hat{F}^{j}((s_1, \ldots , s_j),k,h(x)_{1:j})\),

  2.

    \(z := \hat{F}^{m-j}((s_{j+1},...,s_m),k_x,h(x)_{j+1:m})\),

where j is as defined above, and we use that \(j \le m\). By Lemma 4, this is just a specific way to implement function \(\hat{F}^{m}\), so the change is purely conceptual and we have

$$ \Pr \left[ X_{1} \right] = \Pr \left[ X_{0} \right] $$

Game 2. This game is identical to Game 1, except that we replace the function \(\hat{F}^{m}\) implemented by oracle \(\mathcal {O} _{1}\) partially with a random function. More precisely, oracle \(\mathcal {O} _{2}\) chooses a second random function \(R_{j} : \{0,1\}^{j} \rightarrow K\). If \(b=1\), then it computes \(z = \mathcal {O} _{2}(x)\) as

  1.

    \(k_x := R_{j}(h(x)_{1:j})\)

  2.

    \(z := \hat{F}^{m-j}((s_{j+1},...,s_m),k_x,h(x)_{j+1:m})\)

If \(b=0\), then it proceeds exactly like \(\mathcal {O} _{1}\). The proof of the following lemma is postponed to Sect. 4.5.

Lemma 6

From each \(\mathcal {A}\) that runs in time \(t_\mathcal {A} \) and issues \(Q\) oracle queries one can construct an adversary \(\mathcal {B}\) that \((t_\mathcal {B} ,\epsilon _\mathcal {B} ,Q)\)-breaks the pseudorandomness of \(\hat{F}^{j}\) where

$$\begin{aligned} t_\mathcal {B} = \varTheta (t_\mathcal {A} ) \quad \text {and} \quad \epsilon _{\mathcal {B}} =\left| \Pr \left[ X_1 \right] - \Pr \left[ X_{2} \right] \right| \end{aligned}$$
(6)

Game 3. This game is identical to Game 2, but \(\mathcal {O} _{3}\) performs an additional check. Whenever \(\mathcal {A}\) makes an oracle query x, \(\mathcal {O} _{3}\) checks whether there has been a previous oracle query \(x'\) such that

$$ h(x)_{1:j} = h(x')_{1:j} $$

If this holds, then \(\mathcal {O} _{3}\) raises event \(\mathsf {coll}\), and the experiment outputs a random bit and terminates. Note that the check is always performed, for both values \(b \in \{0,1\}\). Since both games are identical until \(\mathsf {coll}\), we have

$$ \left| \Pr \left[ X_{2} \right] - \Pr \left[ X_{3} \right] \right| \le \Pr \left[ \mathsf {coll} \right] $$

Again, the proof of the following lemma is postponed to Sect. 4.6.

Lemma 7

If F is perfectly one-time secure, then \( \Pr \left[ \mathsf {coll} \right] \le \epsilon _\mathcal {A} /2\) and \( \Pr \left[ X_{3} \mid \overline{\mathsf {coll}} \right] = 1/2 \).

We finish the proof of Theorem 4 before we prove Lemmas 6 and 7. We have

$$\begin{aligned} \Pr \left[ X_{3} \right]&= \Pr \left[ X_{3} \mid \mathsf {coll} \right] \cdot \Pr \left[ \mathsf {coll} \right] + \Pr \left[ X_{3} \mid \overline{\mathsf {coll}} \right] \cdot \left( 1- \Pr \left[ \mathsf {coll} \right] \right) \end{aligned}$$
(7)

Recall that \(X_{3}\) denotes the event that the experiment outputs 1, which happens if and only if \(\mathcal {A}\) outputs \(b'\) with \(b = b'\). By construction of the experiment, we abort and output a random bit in Game 3, if \(\mathsf {coll}\) occurs. In combination with Lemma 7 we thus get

$$ \Pr \left[ X_{3} \mid \mathsf {coll} \right] = \Pr \left[ X_{3} \mid \overline{\mathsf {coll}} \right] = 1/2 $$

Plugging this into (7) yields

$$\begin{aligned} \Pr \left[ X_{3} \right]&= 1/2 \cdot \Pr \left[ \mathsf {coll} \right] + 1/2 \cdot \left( 1- \Pr \left[ \mathsf {coll} \right] \right) = 1/2 \end{aligned}$$
(8)

Lower bound on \(\epsilon _{\mathcal {B}}\) . Finally, using (8), the bounds from Lemmas 6 and 7, and the fact that \(\Pr \left[ X_0 \right] = \Pr \left[ X_1 \right] \), we obtain a lower bound on \(\epsilon _{\mathcal {B}}\):

$$\begin{aligned} 1/2 + \epsilon _\mathcal {A}&= \Pr \left[ X_0 \right] = \Pr \left[ X_1 \right] \le \Pr \left[ X_2 \right] + \epsilon _{\mathcal {B}} \le 1/2 + \epsilon _\mathcal {A} /2 + \epsilon _{\mathcal {B}} \\&\iff \epsilon _{\mathcal {B}} \ge \epsilon _\mathcal {A} /2 \end{aligned}$$

Furthermore, by Lemma 6, algorithm \(\mathcal {B} \) runs in time \(t_\mathcal {B} = \varTheta (t_\mathcal {A} )\) and issues \(Q\) oracle queries.   \(\square \)

4.5 Proof of Lemma 6

Adversary \(\mathcal {B}\) plays the pseudorandomness security experiment with function \(\hat{F}^{j}\). Let \(\mathcal {O} \) denote the PRF oracle provided to \(\mathcal {B}\) in this game. \(\mathcal {B}\) runs \(\mathcal {A}\) as a subroutine by simulating the security experiment as follows.

Initialization. \(\mathcal {B} \) samples a bit \(b {\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}}\{0,1\}\), a hash function \(h \leftarrow \mathcal {H}_{n,m}\), and picks \((s_{j+1},...,s_m)\), where \(s_i \leftarrow S\) for all \(i \in \llbracket j+1, m \rrbracket \).

Handling of oracle queries. Whenever \(\mathcal {A}\) queries \(x \in \{0,1\}^n\), \(\mathcal {B} \) proceeds as follows.

  • If \(b=0\), then \(\mathcal {B} \) proceeds exactly like the original experiment. That is, it responds with R(x), where \(R : \{0,1\}^n \rightarrow K\) is a random function.

  • If \(b=1\), then \(\mathcal {B} \) computes h(x) and queries \(\mathcal {O} \) to obtain \(k_x := \mathcal {O} (h(x)_{1:j})\). Then it computes

    $$ z := \hat{F}^{m-j}((s_{j+1},...,s_m),k_x,h(x)_{j+1:m}) $$

    and returns z to \(\mathcal {A}\).

Finalization. Finally, when \(\mathcal {A}\) terminates, then \(\mathcal {B}\) outputs whatever \(\mathcal {A}\) outputs, and terminates.

Analysis of \(\mathcal {B} \). Note that the running time of \(\mathcal {B} \) is essentially identical to the running time of \(\mathcal {A}\), plus a minor number of additional operations; thus we have \( t_\mathcal {B} = \varTheta (t_\mathcal {A} )\). If \(\mathcal {O} \) implements \(\hat{F}^{j}((s_1,...,s_{j}),k,\cdot )\), then by Lemma 4 it holds that \(z = \hat{F}^{m}((s_1,...,s_{m}),k, h(x))\). Thus, the view of \(\mathcal {A}\) is identical to Game 1. If \(\mathcal {O} \) implements a random function, then its view is identical to Game 2. This yields the claim.
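For completeness, a schematic Python version of this reduction (purely illustrative; `O` denotes \(\mathcal {B}\)'s own oracle for \(\hat{F}^{j}\), the sampling helpers and `key_range` are hypothetical stand-ins, and bit strings are represented as integers with the most significant bit first):

```python
# Illustrative sketch of reduction B: simulate A's experiment, answering
# b = 1 queries by calling B's own oracle O on the j-bit prefix of h(x) and
# finishing the cascade locally with s_{j+1},...,s_m.
import random

def reduction_B(adversary, O, F, sample_s, sample_h, m, j, key_range):
    b = random.randrange(2)                    # simulated coin b
    h = sample_h()                             # h <-$- H_{n,m}
    s_tail = [sample_s() for _ in range(m - j)]
    R = {}                                     # lazy random function for b = 0

    def oracle(x):
        if b == 0:
            return R.setdefault(x, random.randrange(key_range))
        hx = h(x)                              # h(x) in {0,1}^m, as an integer
        k_x = O(hx >> (m - j))                 # k_x := O(h(x)_{1:j})
        for i in range(m - j):                 # process h(x)_{j+1:m} from k_x
            bit = (hx >> (m - j - 1 - i)) & 1
            k_x = F(s_tail[i], k_x, bit)
        return k_x

    return adversary(oracle)                   # B outputs whatever A outputs
```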

4.6 Proof of Lemma 7

In order to show that \(\Pr \left[ \mathsf {coll} \right] \le \epsilon _\mathcal {A} /2\), we prove that all queries of \(\mathcal {A}\) are independent of h, regardless of whether \(b=0\) or \(b=1\), until \(\mathsf {coll}\) occurs. This allows us to derive an upper bound on \(\Pr \left[ \mathsf {coll} \right] \). Consider the sequence of queries \(x_1, \ldots , x_Q\) made by \(\mathcal {A}\). Recall that we assume \(x_u \ne x_v\) for \(u \ne v\) without loss of generality.

The case \(b=0\) . In this case, \(\mathcal {O} _{3}(x_i)\) is a random function \(R(x_i)\), and therefore all information observed by \(\mathcal {A}\) is independent of h, until \(\mathsf {coll}\) occurs. Thus, the view of \(\mathcal {A}\) is equivalent to a world in which the experiment does not choose h at the beginning, but only after \(\mathcal {A}\) has made all queries, and only then computes \(h(x_i)_{1:j}\) for all \(i \in \llbracket Q \rrbracket \) and outputs a random bit if a collision occurred. By the almost-universality, we thus obtain that

$$\begin{aligned} \Pr \left[ \mathsf {coll} \mid b=0 \right] \le \sum _{i=2}^Q\frac{i-1}{2^{j -1}} \le \frac{Q^2}{2^{j} } \le \frac{Q^2\epsilon _\mathcal {A} }{2Q^2} = \frac{\epsilon _\mathcal {A} }{2}. \end{aligned}$$

Note that we use here that \(j \ge \log (2Q^2/\epsilon _\mathcal {A} )\), which holds due to the definition of j in (5).

The case \(b=1\) . We may assume without loss of generality that \(Q> 0\), as otherwise \(\mathcal {A}\) receives no information about b and thus we would have \(\epsilon _\mathcal {A} =0\). Consider the first query \(\mathcal {O} _{3}(x_1)\) of \(\mathcal {A}\). The oracle proceeds as follows. At first it computes \(k_{x_1} := R_{j}(h(x_1)_{1:j})\). Since \(R_{j}\) is a random function, this value is independent of h. In the next step it computes \(z_1 := \hat{F}^{m-j}((s_{j+1},...,s_m),k_{x_1} ,h(x_1)_{j+1:m})\), which is still uniformly random. To see this, note that the perfect one-time security of F guarantees perfect one-time security of \(\hat{F}^{m-j}\) as shown in Lemma 5. Thus \(\mathcal {A}\) gains no information about h at this point and the next query cannot be adaptive with regard to h.

Now if \(\mathcal {A}\) queries \(\mathcal {O} _{3}(x_2)\), then the experiment will evaluate the random functions \(R_{j}\) on a different position than in the first query, unless

$$\begin{aligned} h(x_1)_{1:j} = h(x_2)_{1:j} \end{aligned}$$
(9)

Due to the fact that the response to \(x_1\) was independent of h and the almost-universality of h, (9) happens with probability at most \(1/2^{j-1}\). Therefore, again by the perfect one-time security of F, \(\mathcal {A}\) receives another uniformly random value \(z_2\), which is independent of h, except with probability at most \(1/2^{j-1}\). Continuing this argument inductively over all \(Q\) queries of \(\mathcal {A}\), we see that on its i-th query \(\mathcal {A}\) will receive a random response which is independent of h, except with probability \((i-1)/2^{j-1}\), provided that all previous responses were independent of h. A union bound now yields

$$\begin{aligned} \Pr \left[ \mathsf {coll} \mid b=1 \right] \le \sum _{i=2}^Q\frac{i-1}{2^{j -1}} \le \frac{Q^2}{2^{j } } \le \frac{Q^2\epsilon _\mathcal {A} }{2Q^2} = \frac{\epsilon _\mathcal {A} }{2}. \end{aligned}$$

It remains to show that \(\Pr \left[ X_3 \mid \overline{\mathsf {coll}} \right] = 1/2\). Let us consider the case \(b=1\). If \(\overline{\mathsf {coll}}\) occurs, then there are no collisions, so the oracle always calls the random function \(R_{j}\) on distinct inputs, each time receiving an independent, uniformly random value. Applying the perfect one-time security of \(\hat{F}^{m-j}\) again, the response of the oracle to each query is therefore uniformly distributed and independent of all other queries. Thus, provided that no collision occurs, the view in case \(b=1\) is perfectly indistinguishable from the case \(b=0\), which yields the claim.

4.7 On the Necessity of the “all-prefix” Property

One may ask at this point whether the “all-prefix” property is really necessary, or whether it is possible to use a standard universal hash function with fixed output space \(\{0,1\}^j\) instead.

Let us explain why the “all-prefix” property is not only sufficient, but also necessary. Recall that j depends on the particular values of Q and \(\epsilon _\mathcal {A} \) of a particular given adversary, via the definition \(j = \left\lceil \log (2Q^2/\epsilon _\mathcal {A} ) \right\rceil \) in (5). One may wonder why we set j so precisely, depending on the given adversary, rather than simply choosing j sufficiently large such that it would work for any efficient adversary.

The reason for this precise choice is that we have to find the right balance between two properties that we need in order to obtain tight security:

  1.

    On the one hand, we need j to be sufficiently large, such that the probability of a collision of (the j-bit prefix of) the universal hash function is sufficiently unlikely.

  2.

    On the other hand, we have to keep j short enough, in order to get a tight reduction.

This is why we make the value j dependent on the given adversary, specifically on the particular values of Q and \(\epsilon _\mathcal {A} \).

We stress that we do this only in the security proof, but not in the PRF construction itself. That is, we do not simply fix j to a value large enough that the collision probability is sufficiently small for every adversary, because then for certain adversaries j would be “too large”, and the reduction would not be tight. Similarly, if we used a standard universal hash function with output length j, then this would also fix j to some specific value in the construction of the PRF, and thus would again make the PRF construction tightly secure only for certain adversaries that match this particular choice of j, but not necessarily for all efficient adversaries.

For example, using a standard UHF with output length \(m = \omega (\log \lambda )\) is sufficient to bound the collision probability, but this yields only super-logarithmic tightness, and thus would be worse than the construction of Döttling and Schröder [DS15], whereas with an APUHF we achieve logarithmic tightness.

Hence, the important new feature that all-prefix universality provides is that it guarantees that a suitable choice of j exists for any efficient adversary. This makes the construction independent of a particular class of adversaries that match a certain fixed value of j, while at the same time it ensures that the security proof depends tightly on the particularly given adversary. Hence, using an APUHF instead of a standard universal hash function is not just sufficient, but also necessary in order to capture all efficient adversaries and to keep the security proof tight.

We note that Döttling and Schröder [DS15] also use multiple instances of the underlying pseudorandom function, with increasing security, in order to achieve tightness. Essentially, we replace these multiple instances with a single instance, in combination with an all-prefix universal hash function. From an abstract high-level perspective, in our approach each prefix implicitly corresponds to one PRF instance of [DS15]. This makes our construction significantly more efficient.

5 Applications

5.1 Efficient and Tightly-Secure PRF from Matrix Diffie-Hellman Assumptions

We recall the definition of the matrix Diffie-Hellman (\(\text {MDDH}\)) assumption and the pseudorandom function (PRF) from [EHK+17]. We consider a variant where an all-prefix almost-universal hash function is applied to the input before it is processed by the PRF. We note that the \(\text {MDDH}\) assumption generalizes the Decisional Diffie-Hellman (\(\text {DDH}\)) and Decisional d-Linear (\(d\)-\(\text {LIN}\)) assumptions, and, moreover, it gives us a framework to analyze the algebraic structure behind Diffie-Hellman-based cryptographic primitives. Thus, our results can be carried over to the Naor-Reingold PRF (based on the \(\text {DDH}\) assumption) [NR97] and the Lewko-Waters PRF (based on the \(d\)-\(\text {LIN}\) assumption) [LW09].

Notations and the \(\text {MDDH}\) Assumption. Let \(\mathcal {G} = (\mathbb {G}, q, P)\) be a description of an additive group \(\mathbb {G} \) with random generator \(P\) and prime order q. Following the “implicit notation” of [EHK+13], we write \(\left[ a\right] \) shorthand for \(a P\). More generally, for a matrix \(\mathbf {A} = (a_{ij}) \in \mathbb {Z}_q^{n\times m}\), we define \([\mathbf {A} ]\) as the implicit representation of \(\mathbf {A} \) in \(\mathbb {G} \):

$$ [\mathbf {A} ] := \begin{pmatrix} a_{11} P & \cdots & a_{1m} P \\ \vdots & & \vdots \\ a_{n1} P & \cdots & a_{nm} P \end{pmatrix} \in \mathbb {G} ^{n \times m}. $$

Let us first recall the definition of the matrix Diffie-Hellman (\(\text {MDDH}\)) problem [EHK+13, EHK+17].

Definition 10

(Matrix distribution). Let \(\ell ,d\in \mathbb {N} \) and \(\ell > d\). We call \(\mathcal {D}_{\ell ,d}\) a matrix distribution if it outputs matrices in \(\mathbb {Z}_q^{\ell \times d}\) of full rank d in polynomial time, namely, it is efficiently samplable. We define \(\mathcal {D}_{d} := \mathcal {D}_{d+1, d}\).

Without loss of generality, we assume that the first d rows of \(\mathbf {A} {\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}}\mathcal {D}_{\ell ,d}\) form a full-rank (and hence invertible) matrix, which we denote by \(\mathbf {\overline{A}} \), and we denote the remaining \(\ell -d\) rows by \(\mathbf {\underline{A}} \).

Definition 11

(Transformation matrix). Let \(\mathcal {D}_{\ell , d}\) be a matrix distribution and \(\mathbf {A} \) be a matrix from it. The transformation matrix of \(\mathbf {A} \) is defined as \(\mathbf {T}:= \mathbf {\underline{A}} \cdot \mathbf {\overline{A}} ^{-1} \in \mathbb {Z} _q^{(\ell -d) \times d}\).

The \(\mathcal {D}_{\ell , d}\)-\(\text {MDDH}\) problem is to distinguish the two distributions \(([\mathbf {A} ], [\mathbf {A} \mathbf {w} ])\) and \(([\mathbf {A} ],[\mathbf {u} ])\) where \(\mathbf {A} {\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}}\mathcal {D}_{\ell , d}\), \(\mathbf {w} {\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}}\mathbb {Z}_q^d\) and \(\mathbf {u} {\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}}\mathbb {Z}_q^{\ell }\).

Definition 12

(\(\mathcal {D}_{\ell ,d}\)-Matrix Diffie-Hellman assumption, \(\mathcal {D}_{\ell ,d}\)-\(\mathbf{MDDH}\)). Let \(\mathcal {D}_{\ell ,d}\) be a matrix distribution. We say that adversary \(\mathcal {A}\) \((t_\mathcal {A} ,\epsilon _\mathcal {A} )\)-breaks the \(\mathcal {D}_{\ell ,d}\)-Matrix Diffie-Hellman (\(\mathcal {D}_{\ell ,d}\)-\(\text {MDDH}\)) assumption in group \(\mathbb {G} \), if \(\mathcal {A}\) runs in time \(t_\mathcal {A} \) and

$$| \Pr [\mathcal {A} (\mathcal {G},[\mathbf {A} ], [\mathbf {A} \mathbf {w} ])=1]-\Pr [\mathcal {A} (\mathcal {G},[\mathbf {A} ], [\mathbf {u} ]) =1] | \ge \epsilon _\mathcal {A} ,$$

where the probability is taken over \(\mathbf {A} {\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}}\mathcal {D}_{\ell ,d}, \mathbf {w} {\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}}\mathbb {Z}_q^d, \mathbf {u} {\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}}\mathbb {Z}_q^{\ell }\).

Examples of \(\mathcal {D}_{\ell , d}\)-\(\text {MDDH}\). [EHK+13, EHK+17] define distributions \(\mathcal {L}_d\), \(\mathcal {C}_d\), \(\mathcal {SC}_d\), \(\mathcal {IL}_d\), and \(\mathcal {U}_d\), which correspond to the d-Linear, d-Cascade, d-Symmetric-Cascade, d-Incremental-Linear, and d-Uniform assumptions, respectively. All these assumptions are proven secure in the generic group model [EHK+13, EHK+17] and form a hierarchy of increasingly weaker assumptions.

A simple example is the \(\mathcal {L}_1\)-\(\text {MDDH}\) assumption for \(d=1\), which is the \(\text {DDH}\) assumption: Choose \(a, w,z {\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}}\mathbb {Z} _q\), and the \(\text {DDH}\) assumption states that the following two distributions are computationally indistinguishable:

$$\begin{aligned} ([1, a,w, aw]) \approx _c ([1, a,w, z]). \end{aligned}$$

This can be represented via the \(\mathcal {L}_1\)-\(\text {MDDH}\) assumption, which states that the following two distributions are computationally indistinguishable:

$$ (\left[ {\begin{matrix} a \\ 1 \end{matrix}} \right] , \left[ {\begin{matrix} a w\\ w \end{matrix}} \right] ) =: (\left[ \mathbf {A} \right] , \left[ \mathbf {A} \mathbf {w} \right] ) \approx _c (\left[ \mathbf {A} \right] , \left[ \mathbf {u} \right] ) := (\left[ {\begin{matrix} a \\ 1 \end{matrix}} \right] , \left[ {\begin{matrix} z\\ w \end{matrix}} \right] ). $$

For \(d=1\) the transformation matrix \(\mathbf {T} \) contains only one element, and for \(\mathcal {L}_1\)-\(\text {MDDH}\) the corresponding transformation matrix is \(\mathbf {T} = \mathbf {\underline{A}} \cdot \mathbf {\overline{A}} ^{-1} = 1/a\).

We give more examples of matrix distributions from [EHK+13, EHK+17] for \(d=2\) in Appendix A.

The PRF construction of [EHK+17] and its security. Let \(\mathcal {G} = (\mathbb {G}, q, P)\) be a description of an additive group \(\mathbb {G} \) with random generator \(P\) and prime order q. Let \(\mathcal {D}_{\ell ,d}\) be a matrix distribution; we assume that \((\ell -d)\) divides d and define \(t:=d/(\ell -d)\).

Following the approach of Sect. 5.3 of [EHK+17], we choose a random vector \(\mathbf {v} {\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}}\mathbb {Z} _q^{d}\) (which defines the key element \([\mathbf {v} ] \in \mathbb {G} ^{d}\)), and, for \(i=1,...,m\) and \(j=1,..., t\), we choose \(\mathbf {A} _{i,j} {\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}}\mathcal {D}_{\ell ,d}\) and compute transformation matrices \(\hat{\mathbf {T}}_{i,j} := \mathbf {\underline{A}} _{i,j} \mathbf {\overline{A}} _{i,j}^{-1} \in \mathbb {Z}_q^{(\ell -d) \times d}\) and define the aggregated transformation matrices

$$\begin{aligned} \mathbf {T} _i := \begin{pmatrix} \hat{\mathbf {T}}_{i,1} \\ \vdots \\ \hat{\mathbf {T}}_{i,t} \end{pmatrix} \in \mathbb {Z}_q^{d \times d} , \end{aligned}$$

and \(\mathbf {S}:=(\mathbf {T} _1,\ldots ,\mathbf {T} _m)\). Here, for every \(i \in \{1,\ldots ,m\}\), we require that \(\mathbf {T} _i\) has full rank. This requirement is satisfied with overwhelming probability by all the matrix distributions described in [EHK+17], which implies that the distribution of our \(\mathbf {T} _i\)’s is statistically close to that in [EHK+17], up to a negligibly small statistical distance of \(1/(q-1)\). Thus, their security results can be applied here.
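To make the key-generation step concrete, the following Python sketch samples each \(\mathbf {A} _{i,j}\) from the uniform distribution \(\mathcal {U}_{\ell ,d}\) (chosen here purely for illustration; any of the distributions above can be used) and assembles one aggregated matrix \(\mathbf {T} _i\). The modulus and all helper names are illustrative and not taken from the paper.

```python
import random

q = 2**61 - 1  # illustrative prime modulus

def mat_mul(A, B, q):
    """Matrix product mod q."""
    return [[sum(a * b for a, b in zip(row, col)) % q for col in zip(*B)] for row in A]

def mat_inv(M, q):
    """Inverse of an invertible square matrix mod prime q (Gauss-Jordan elimination)."""
    n = len(M)
    A = [list(row) + [int(i == j) for j in range(n)] for i, row in enumerate(M)]
    for c in range(n):
        piv = next(r for r in range(c, n) if A[r][c] % q != 0)  # assumes invertibility
        A[c], A[piv] = A[piv], A[c]
        inv = pow(A[c][c], -1, q)
        A[c] = [x * inv % q for x in A[c]]
        for r in range(n):
            if r != c and A[r][c] % q != 0:
                f = A[r][c]
                A[r] = [(x - f * y) % q for x, y in zip(A[r], A[c])]
    return [row[n:] for row in A]

def sample_A(ell, d, q):
    """A <- U_{ell,d}: uniformly random (ell x d) matrix over Z_q."""
    return [[random.randrange(q) for _ in range(d)] for _ in range(ell)]

def transformation(A, d, q):
    """hat(T) = underline(A) * overline(A)^{-1}, with overline(A) the top d x d block."""
    return mat_mul(A[d:], mat_inv(A[:d], q), q)

def aggregated_T(ell, d, q):
    """T_i: stack t = d/(ell-d) transformation matrices into a (d x d) matrix."""
    t = d // (ell - d)
    T = []
    for _ in range(t):
        T += transformation(sample_A(ell, d, q), d, q)
    return T

T_1 = aggregated_T(ell=3, d=2, q=q)  # here t = 2, so T_1 is a 2 x 2 matrix
```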

Now let \(S := \mathbb {Z} _q^{d \times d}\), let \(K\) be the set of \(d\)-dimensional vectors of group elements, and let \(X := \{0,1\}\). The basis of the PRF construction from [EHK+17] is the function \(F_{\text {MDDH}}: S \times K \times X \rightarrow K\) defined as

(10)

By applying the augmented cascade of Boneh et al. [BMR10] (Fig. 1) to \(F_{\text {MDDH}}\), Escala et al. [EHK+17] obtain their PRF \(F_{\text {MDDH}}^m\) with key space \(S^m \times K\) and domain \( \{0,1\}^m\):

(11)

where \(\mathbf {S}:= (\mathbf {T} _1, ..., \mathbf {T} _m)\). The following theorem was proven in [EHK+13, EHK+17].

Theorem 5

([EHK+17, Theorem 12]). From each adversary \(\mathcal {A}\) that \((t_\mathcal {A} ,\epsilon _\mathcal {A} , Q)\)-breaks the security of \(F_{\text {MDDH}}^m\) with input space \(\{0,1\}^m\) we can construct an adversary \(\mathcal {B}\) that \((t_\mathcal {B} ,\epsilon _\mathcal {B} )\)-breaks the \(\mathcal {D}_{\ell , d}\)-\(\text {MDDH}\) assumption in \(\mathcal {G}\) with

$$ t_\mathcal {B} = \varTheta (t_\mathcal {A} ) \qquad \text {and}\qquad \epsilon _\mathcal {B} \ge \frac{\epsilon _\mathcal {A} }{dm} $$

Note that d is a constant, so that the security loss is linear in the input length m.

Our construction. By additionally encoding the input with an APUHF as described in (4), we finally obtain the function \(F_{\text {MDDH}}^{\mathcal {H}_{n,m}}: S^m \times K \times \mathcal {H}_{n,m} \times \{0,1\}^n \rightarrow K\) as

(12)

In order to apply Theorem 4 to show that this particular instance of the augmented cascade with encoded input is a secure PRF with key space \(S^m \times K \times \mathcal {H}_{n,m}\) and domain \(\{0,1\}^n\), we merely have to prove that function \(F_{\text {MDDH}}\) is perfectly one-time secure.
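For intuition, the following Python sketch shows what the APUHF encoding step can look like, assuming the multiply-shift family based on [DHKP97]; the concrete family and the parameters \(n, m\) are fixed in (4), so all constants and names below are purely illustrative.

```python
import random

# Illustrative parameters only; the paper fixes n and m (with m = omega(log lambda)).
n, m = 256, 40

def sample_h(n):
    """Hash key of the multiply-shift family of [DHKP97]: a random odd a < 2^n."""
    return random.randrange(1, 1 << n, 2)

def h(a, x, n=n, m=m):
    """h_a(x) = floor((a*x mod 2^n) / 2^(n-m)), i.e., the top m bits, as a bit string."""
    y = ((a * x) % (1 << n)) >> (n - m)
    return format(y, f"0{m}b")

def h_prefix(a, x, i, n=n, m=m):
    """All-prefix use: the leading i <= m output bits of h_a(x)."""
    return h(a, x, n, m)[:i]

# Encode an n-bit input before it enters the augmented cascade.
a = sample_h(n)
x = random.getrandbits(n)
print(h(a, x), h_prefix(a, x, 8))
```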

Lemma 8

Function \(F_{\text {MDDH}}\) from (10) is perfectly one-time secure.

Proof

We have to show that \(F_{\text {MDDH}} (\mathbf {T}, k, x)\) is uniformly distributed over \(K\) for a uniformly random key \(k {\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}}K \), for all \(\mathbf {T} \in S\) and all \(x \in \{0,1\}\).

If \(x=0\) then the function returns the key vector unchanged, which is a random vector in \(K\) by definition. If \(x=1\) then it returns the key vector multiplied by \(\mathbf {T} \), which is again a random vector, due to the fact that \(\mathbf {T} \) is a full-rank matrix.   \(\square \)

By combining Theorem 4 with Theorem 5 we now obtain the following result, which shows that setting \(m = \omega (\log \lambda )\) is sufficient to achieve tight security.

Theorem 6

Let \(m = \omega (\log \lambda )\) be (slightly) super-logarithmic and \(\mathcal {H}_{n,m}\) be a family of all-prefix almost universal hash functions. From each adversary \(\mathcal {A}\) that \((t_\mathcal {A} ,\epsilon _\mathcal {A} , Q)\)-breaks the security of \(F_{\text {MDDH}}^{\mathcal {H}_{n,m}}\) with \(Q/\epsilon _\mathcal {A} = \mathsf {poly}(\lambda )\) for some polynomial \(\mathsf {poly}\) we can construct an adversary \(\mathcal {B} '\) that \((t_\mathcal {B} ',\epsilon _\mathcal {B} ')\)-breaks the \(\mathcal {D}_{\ell , d}\)-\(\text {MDDH}\) assumption in \(\mathcal {G}\) with

$$ t_\mathcal {B} ' = \varTheta (t_\mathcal {A} ) \qquad \text {and}\qquad \epsilon _\mathcal {B} ' \ge \frac{\epsilon _\mathcal {A} }{2dj} $$

where \(j = O(\log \lambda )\).

Proof

Theorem 4 shows that from each adversary \(\mathcal {A}\) that \((t_\mathcal {A} ,\epsilon _\mathcal {A} ,Q)\)-breaks the pseudorandomness of \(F_{\text {MDDH}}^{\mathcal {H}_{n,m}}\) with \(Q/\epsilon _\mathcal {A} = \mathsf {poly}(\lambda )\) for some polynomial \(\mathsf {poly}\), we can construct an adversary \(\mathcal {B}\) that \((t_\mathcal {B} ,\epsilon _\mathcal {B} ,Q)\)-breaks the pseudorandomness of the function \(F_{\text {MDDH}}^{j}\) with input space \(\{0,1\}^j\), where

$$ j = O(\log \lambda ) \qquad \text {and}\qquad t_\mathcal {B} = \varTheta (t_\mathcal {A} ) \qquad \text {and}\qquad \epsilon _{\mathcal {B}} \ge \epsilon _\mathcal {A} /2 $$

Theorem 5 in turn shows that from each adversary \(\mathcal {B}\) that \((t_\mathcal {B} ,\epsilon _\mathcal {B} , Q)\)-breaks the security of \(F_{\text {MDDH}}^j\) we can construct an adversary \(\mathcal {B} '\) that \((t_\mathcal {B} ',\epsilon _\mathcal {B} ')\)-breaks the \(\mathcal {D}_{\ell , d}\)-\(\text {MDDH}\) assumption in \(\mathcal {G}\) with

$$ t_\mathcal {B} ' = \varTheta (t_\mathcal {B} ) \qquad \text {and}\qquad \epsilon _\mathcal {B} ' \ge \frac{\epsilon _\mathcal {B} }{dj} \ge \frac{\epsilon _\mathcal {A} }{2dj} $$

which yields the claim.   \(\square \)

Comparison to the DDH-based PRF of [NR97]. One particularly interesting instantiation of \(F_{\text {MDDH}}^{m}\) is based on the \(\mathcal {L}_1\)-\(\text {MDDH}\) (that is, \(\text {DDH}\)) assumption; it improves upon the famous Naor-Reingold construction from [NR97], which relies on the same assumption. In \(F_{\text {MDDH}}^m\), we sample \(\mathbf {A} _i\) from \(\mathcal {D}_{\ell , d}\) and then compute the aggregated transformation matrices \(\mathbf {T} _i\). For the \(\mathcal {L}_1\) distribution, we can equivalently pick random elements \(T_i\) from \(\mathbb {Z} _q\).

Let \(\mathcal {G}\) be a group of prime order q, \(S := \mathbb {Z} _q\), \(K := \mathcal {G}\), \(X := \{0,1\}^n\) and \(m = \omega (\log \lambda )\) as above. Then we choose \(T_1,\ldots ,T_m, a {\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}}\mathbb {Z} _q\) and obtain a PRF with domain \(\{0,1\}^n\) as

$$\begin{aligned} F_{\text {DDH}}^{\mathcal {H}_{n,m}}(\mathbf {S}, [a ], h,x) = \left[ \left( \prod _{i : h(x)_i = 1}^m T_i\right) \cdot a \right] . \end{aligned}$$

Note that the resulting PRF is identical to the original Naor-Reingold function [NR97], except that an APUHF h is applied to the input x before it is processed in the Naor-Reingold construction. For the original construction from [NR97], both the size of the secret key and the tightness loss of the security proof (based on the \(\text {DDH}\) assumption in \(\mathcal {G}\)) are linear in the bit-length of the function input. We show that merely by encoding the input with an APUHF one can obtain shorter secret keys of size \(m = \omega (\log \lambda )\) and security loss \(O(\log \lambda )\) (based on the same assumption as [NR97]), even for input size \(n \gg m\).
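The following toy Python sketch evaluates \(F_{\text {DDH}}^{\mathcal {H}_{n,m}}\) in a small multiplicative group of prime order; the tiny parameters and the inline multiply-shift hash are illustrative stand-ins only (they provide no security) and are not taken from the paper.

```python
import random

# Toy prime-order group: the subgroup of squares in Z_P^* with P = 2q + 1.
P, q, g = 2039, 1019, 4          # illustrative only
n, m = 64, 16                    # input length n >> cascade length m

def keygen():
    T = [random.randrange(1, q) for _ in range(m)]   # T_1, ..., T_m in Z_q
    a = random.randrange(1, q)                       # exponent of the key element [a]
    h_key = random.randrange(1, 1 << n, 2)           # odd key of a multiply-shift APUHF
    return T, a, h_key

def apuhf_bits(h_key, x):
    """h: {0,1}^n -> {0,1}^m, returned as a list of m bits (most significant first)."""
    y = ((h_key * x) % (1 << n)) >> (n - m)
    return [(y >> (m - 1 - i)) & 1 for i in range(m)]

def prf(T, a, h_key, x):
    """F_DDH^H(S, [a], h, x) = [ (prod_{i : h(x)_i = 1} T_i) * a ]."""
    e = a
    for bit, T_i in zip(apuhf_bits(h_key, x), T):
        if bit:
            e = (e * T_i) % q
    return pow(g, e, P)          # the bracket notation [e] realized here as g^e

T, a, h_key = keygen()
print(prf(T, a, h_key, random.getrandbits(n)))
```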

Comparison to the Matrix-DDH PRF of [DS15]. Döttling and Schröder [DS15] also described a variant of the Matrix-DDH-based PRF of [EHK+13]. Their PRF is the function

(13)

where \(\mathbf {S} \), the key vector, and m are as in our construction, and \(x \in \mathbb {Z} _q\). Thus, in comparison, our construction from (12) uses the same value of m, but is somewhat simpler than (13) and also slightly more efficient to evaluate. In particular, the computation of the terms of the form \((x^{2^j} \cdot \mathbf {I})\) is replaced with a single evaluation of the APUHF h. Another difference is that the domain of their function is restricted to \(x \in \mathbb {Z} _q\), while in our case \(x \in \{0,1\}^n\) can be any bit string of polynomially-bounded length \(n = n(\lambda )\).

5.2 More Efficient LWE-Based PRFs

We recall the learning with errors (\(\text {LWE}\)) assumption. Then we apply our results to the LWE-based PRF from Banerjee, Peikert and Rosen [BPR12].

Definition 13

(Learning With Errors assumption, LWE). Let \(p\) be a modulus, \(N\) be a positive integer, and \(\chi _\alpha :=D_{\mathbb {Z} _{p},\alpha }\) be a Gaussian distribution with noise parameter \(\alpha \). Let \(\mathbf {s} {\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}}\mathbb {Z} _{p}^{N}\) be a random vector. We say that adversary \(\mathcal {A}\) \((t_\mathcal {A} ,\epsilon _\mathcal {A} )\)-breaks the \(\text {LWE}_{p, N,\alpha }\) assumption if it runs in time \(t_\mathcal {A} \) and

where , , \(e {\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}}\chi _\alpha \) and \(u{\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}}\mathbb {Z} _{p}\).

Let \(\left\lfloor \cdot \right\rceil \) be the rounding function, which rounds a real number to the nearest integer. Let \(p\ge q\). For an element \(h \in \mathbb {Z} _{p}\), we define the rounding function \(\left\lfloor \cdot \right\rceil _q: \mathbb {Z} _p \rightarrow \mathbb {Z} _q\) as \(\left\lfloor h \right\rceil _q := \lfloor (q/p) h \rceil \), and for a vector in \(\mathbb {Z} _{p}^{N}\), the rounding function is applied component-wise.
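In code, this component-wise rounding can be written as follows (a minimal sketch with toy moduli, not parameters from the paper):

```python
def round_q(v, p, q):
    """Component-wise rounding from Z_p^N to Z_q^N: round((q/p) * x) mod q for each entry."""
    return [((x * q + p // 2) // p) % q for x in v]

# Toy example with p = 2**16 and q = 2**8.
print(round_q([0, 1, 40000, 65535], 2**16, 2**8))   # -> [0, 0, 156, 0]
```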

The PRF construction of [BPR12] and its security. Let \(S:= \chi _\alpha ^{N\times N}\), \( K:= \mathbb {Z} _{p}^{N} \), and \(X:=\{0,1\}\). We assume that \(\mathbf {S} \in S\) has full rank. The basis of the PRF of [BPR12] is the function \(F_{\text {LWE}}: S \times K \times X \rightarrow K\),

(14)

We apply a slight variant of the augmented cascade transformation from Fig. 1 to obtain the PRF of [BPR12] with key space \({(\chi _\alpha ^{(N\times N)})^m\times \mathbb {Z} _{p}^N}\) and domain \(\{0,1\}^m\):

(15)

where \(\mathbf {S}:= (\mathbf {S} _1, ..., \mathbf {S} _m)\) and the second key component is a random vector in \(\mathbb {Z} _{p}^N\). In contrast to Fig. 1, we additionally apply the rounding function \(\left\lfloor \cdot \right\rceil _q\) to the output of the cascade.
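A minimal Python sketch of this rounded cascade is given below; the parameters are toy values, the Gaussian distribution \(\chi _\alpha \) is replaced by small uniform entries purely for illustration, and the matrices are applied in left-to-right cascade order (later \(\mathbf {S} _i\) applied last).

```python
import random

N, m = 4, 8                      # toy dimensions, no security
p, q = 2**20, 2**8               # toy moduli with p >> q

def round_vec(v, p, q):
    """Component-wise rounding from Z_p^N to Z_q^N."""
    return [((x * q + p // 2) // p) % q for x in v]

def keygen():
    # Stand-in for S_i <- chi_alpha^{N x N}: small "noise-like" entries (illustrative only).
    S = [[[random.randint(-3, 3) % p for _ in range(N)] for _ in range(N)] for _ in range(m)]
    a = [random.randrange(p) for _ in range(N)]          # key vector in Z_p^N
    return S, a

def mat_vec(M, v, p):
    return [sum(M[r][c] * v[c] for c in range(N)) % p for r in range(N)]

def f_lwe_m(S, a, x_bits):
    """Rounded cascade: apply S_i (for x_i = 1) to the key vector, then round to Z_q^N."""
    v = a
    for S_i, bit in zip(S, x_bits):
        if bit:
            v = mat_vec(S_i, v, p)
    return round_vec(v, p, q)

S, a = keygen()
x = [random.randint(0, 1) for _ in range(m)]
print(f_lwe_m(S, a, x))
```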

Theorem 7

([BPR12, Theorem 5.2]). Let \(\chi _\alpha = D_{\mathbb {Z},\alpha }\) be a Gaussian distribution with parameter \(\alpha >0\), let \(m\) be a positive integer that denotes the length of message inputs. Define \(B:= m(C \alpha \sqrt{N})^{m}\) for a suitable universal constant C. Let \(p, q\) be two moduli such that \(p> q\cdot B \cdot N^{\omega (1)}\).

From each adversary \(\mathcal {A}\) that \((t_\mathcal {A} ,\epsilon _\mathcal {A} , Q)\)-breaks the security of \(F_{\text {LWE}}^{m}\) with input space \(\{0,1\}^m\) (for an arbitrary positive integer m) we can construct an adversary \(\mathcal {B}\) that \((t_\mathcal {B} ,\epsilon _\mathcal {B} )\)-breaks the \(\text {LWE}_{p, N,\alpha }\) assumption with

$$ t_\mathcal {B} = \varTheta (t_\mathcal {A} ) \qquad \text {and}\qquad \epsilon _\mathcal {B} \ge \frac{\epsilon _\mathcal {A} }{ m\cdot N} $$

Note that B is an important parameter, since it determines the size of the LWE modulus \(p\) and contains the expensive term \(N^{m/2}\), which is exponential in \(m\). Thus, a smaller \(m\) can give us a smaller \(p\), which in turn yields a weaker LWE assumption and a much more efficient PRF. In the following, we apply our results to \(F_{\text {LWE}}^m\) to reduce \(m\) from polynomial to (slightly super-)logarithmic in the security parameter \(\lambda \).
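As a rough, purely illustrative estimate (assuming \(N = \mathsf {poly}(\lambda )\) and treating \(C\) and \(\alpha \) as constants), the dominant factor \(N^{m/2}\) of B behaves as follows:

$$\begin{aligned} m = \Theta (\lambda ) \;&\Rightarrow \; N^{m/2} = 2^{\Theta (\lambda \log \lambda )} \quad \text {(exponential in } \lambda \text {)},\\ m = \omega (\log \lambda ) \;&\Rightarrow \; N^{m/2} = 2^{\omega (\log \lambda ) \cdot O(\log \lambda )} = \lambda ^{O(m)} \quad \text {(super-polynomial, but far from exponential)}, \end{aligned}$$

so the required modulus \(p > q\cdot B \cdot N^{\omega (1)}\) shrinks accordingly.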

Our construction. By additionally encoding the input with an APUHF as described in (4), we finally obtain \( F_{\text {LWE}}^{\mathcal {H}_{n,m}} : (\chi _\alpha ^{(N\times N)})^m\times \mathbb {Z} _{p}^N\times \mathcal {H}_{n,m}\times \{0,1\}^n \rightarrow \mathbb {Z} _{q}^N\) as

(16)

In order to apply Theorem 4 to show that this particular instance of the augmented cascade with encoded input is a secure PRF with key space \(S^m \times K \times \mathcal {H}_{n,m}\) and domain \(\{0,1\}^n\), we have to prove that function \(F_{\text {LWE}} \) is perfectly one-time secure.

Lemma 9

Function \(F_{\text {LWE}} \) from (14) is perfectly one-time secure.

Proof

We have to show that \(F_{\text {LWE}} (\mathbf {S}, k, x)\) is uniformly distributed over \(\mathbb {Z} _{p}^{N}\) for a uniformly random key \(k {\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}}\mathbb {Z} _{p}^{N}\), for all \(\mathbf {S} \in S\) and all \(x \in \{0,1\}\).

If \(x=0\) then the function returns the key vector unchanged, which is a random vector in \(\mathbb {Z} _{p}^{N}\) by definition. If \(x=1\) then it returns the key vector multiplied by \(\mathbf {S} \), which is again a random vector, due to the fact that \(\mathbf {S} \) is a full-rank matrix.   \(\square \)

We recall the following useful notation and a corollary for the proof of Theorem 8 given below. We define an error sampling function \(E: \{0,1\}^j \rightarrow \mathbb {Z} ^{N}\), and for \(x \in \{0,1\}^j\) and \(j \in \llbracket m \rrbracket \) we define a randomized version \(\tilde{F}_{\text {LWE}}^{j}\) of \(F_{\text {LWE}}^{j}\) using \(E\). The proofs of Theorem 5.2 and Lemma 5.5 in [BPR12] show that \(\tilde{F}_{\text {LWE}}^{j}\) is pseudorandom based on the decisional \(\text {LWE}\) assumption, and that \(F_{\text {LWE}}^m(x) = \left\lfloor \left( \prod _{i>j \wedge x_i=1 }^{m} \mathbf {S} _i \right) \cdot \tilde{F}_{\text {LWE}}^{j}(x) \right\rceil _q\) holds except with negligible probability. We summarize this in the following corollary.

Corollary 1

Let all the parameters be defined as in Theorem 7. There exists an efficient randomized error sampling function \(E: \{0,1\}^j \rightarrow \mathbb {Z} ^{N}\) such that from each adversary \(\mathcal {A}\) that \((t_\mathcal {A} ,\epsilon _\mathcal {A} , Q)\)-breaks the security of \(\tilde{F}_{\text {LWE}}^{j}\) with input \(x\in \{0,1\}^j\) (for \(j \in \llbracket m \rrbracket \)) we can construct an adversary \(\mathcal {B}\) that \((t_\mathcal {B} ,\epsilon _\mathcal {B} )\)-breaks the \(\text {LWE}_{p, N,\alpha }\) assumption with

$$ t_\mathcal {B} = \varTheta (t_\mathcal {A} ) \qquad \text {and}\qquad \epsilon _\mathcal {B} \ge \frac{\epsilon _\mathcal {A} }{ m\cdot N}. $$

Moreover, except with probability \(2^{-\Omega (N)}\), we have

$$\begin{aligned} F_{\text {LWE}}^m(x) = \left\lfloor \left( \prod _{i>j \wedge x_i=1 }^{m} \mathbf {S} _i \right) \cdot \tilde{F}_{\text {LWE}}^{j}(x) \right\rceil _q. \end{aligned}$$

Theorem 8

Let \(m = \omega (\log \lambda )\) be (slightly) super-logarithmic and \(\mathcal {H}_{n,m}\) be a family of all-prefix almost universal hash functions. Let \(\chi _\alpha = D_{\mathbb {Z},\alpha }\) be a Gaussian distribution with parameter \(\alpha >0\). Define \(B:= m(C \alpha \sqrt{N})^{m}\) for a suitable universal constant C, and let \(p, q\) be two moduli such that \(p> q\cdot B \cdot N^{\omega (1)}\).

From each adversary \(\mathcal {A}\) that \((t_\mathcal {A} ,\epsilon _\mathcal {A} , Q)\)-breaks the security of \(F_{\text {LWE}}^{\mathcal {H}_{n,m}}\) with \(Q/\epsilon _\mathcal {A} = \mathsf {poly}(\lambda )\) for some polynomial \(\mathsf {poly}\) we can construct an adversary \(\mathcal {B}\) ’ that \((t_\mathcal {B} ',\epsilon _\mathcal {B} ')\)-breaks the \(\text {LWE}_{p, N,\alpha }\) assumption with

$$ t_\mathcal {B} ' = \varTheta (t_\mathcal {A} ) \qquad \text {and}\qquad \epsilon _\mathcal {B} ' \ge \frac{\epsilon _\mathcal {A} }{2 j \cdot N} - 2^{-\Omega (N)} $$

where \(j = O(\log \lambda )\).

Proof

The proof is the same as the one for Theorem 4. The only difference is between Games 1 and 2. Here we introduce an intermediate game, Game 1’: we simulate \(\mathcal {O}_1(x)\) by returning \(F_{\text {LWE}}^m(x) = \left\lfloor \left( \prod _{i>j \wedge x_i=1 }^{m} \mathbf {S} _i \right) \cdot \tilde{F}_{\text {LWE}}^{j}(x) \right\rceil _q\) and \(\mathcal {O}_0\) by returning a random vector in \(\mathbb {Z} _{q}^N\).

By the second statement of Corollary 1, the difference between Games 1 and 1’ is bounded by the statistical difference \(2^{-\Omega (N)}\). Moreover, the difference between Games 1’ and 2 is bounded by the security of \(\tilde{F}_{\text {LWE}}^{j}\). By the first statement of Corollary 1 we can conclude the proof.   \(\square \)

Comparison to the LWE PRF of [DS15]. Döttling and Schröder [DS15] describe a different variant of the BPR PRF. Their approach is to instantiate their Construction 1 with the BPR PRF and then obtain the following function

where \(L = \omega (\log \lambda )\), for each \(i \in \llbracket L \rrbracket \) and \(j \in \llbracket \lambda \rrbracket \) the function \(H_{2^i,j} : \{0,1\}^n \rightarrow \{0,1\}^{i+1} \) is chosen from a suitable universal hash function family with range \(\{0,1\}^{i+1}\), and \(\mathbf {S} \) is chosen as in our construction.

Compared with \(F_{\text {LWE}}^{\mathsf {DS15}}\), our variant has shorter secret keys: instead of \( L \cdot \lambda \) hash functions, we only need a single one. In terms of computational efficiency, instead of evaluating \(H_i\) and \(F_{\text {LWE}}^i\) a total of \( L \cdot \lambda \) times, we evaluate the hash function and \(F_{\text {LWE}}^m\) only once.

6 Conclusion

We have introduced all-prefix (almost-)universal hash functions (APUHFs) as a tool to generically improve the augmented cascade construction of pseudorandom functions by Boneh, Montgomery, and Raghunathan [BMR10]. By generically applying an APUHF to the function input before processing it in the augmented cascade, we are able to reduce both the key size and the security loss of the proof by one order of magnitude. We gave simple and very efficient constructions of such function families, based on the almost-universal hash function family of Dietzfelbinger et al. [DHKP97], which can be evaluated with essentially a single modular multiplication, and generically based on pairwise-independent hash functions.

For the instantiation based on the Matrix-DDH assumptions of [EHK+13], which covers the classical constructions of Naor-Reingold [NR97] and Lewko-Waters [LW09] as special cases, this yields asymptotically short keys consisting of only \(\omega (\log \lambda )\) elements, and tight security with a loss of only \(O(\log \lambda )\). These parameters are similar to the respective constructions of Döttling and Schröder [DS15], but our instantiation is conceptually much simpler and slightly more efficient.

For the LWE-based instantiation of Banerjee, Peikert and Rosen [BPR12] (BPR), we are able to reduce the required size of the LWE modulus \(p\) from exponential to super-polynomial in the security parameter, which significantly improves efficiency and allows us to prove security under a weaker LWE assumption. Again, the latter is similar to a result from [DS15], but we replace their relatively expensive generic construction, which requires running \(\lambda \cdot \omega (\log \lambda )\) instances of the BPR function in parallel, with a single instance plus an all-prefix almost-universal hash function.

We believe that APUHFs may have many further applications in cryptography beyond pseudorandom functions. This may include, for example, constructions of more efficient cryptosystems with tight provable security, such as digital signatures or public-key encryption schemes. In particular, constructions using arguments similar to those for pseudorandom functions based on the augmented cascade, such as [CW13, GHKW16], seem to be promising targets.