1 Introduction

Wegman-Carter MACs. A Message Authentication Code (MAC) is a fundamental symmetric-key primitive that allows a sender to authenticate messages by computing tags that can be verified by the receiver (the sender and the receiver sharing a common secret key). Many MACs are based on some underlying cryptographic primitive such as a block cipher (e.g., CBC-MAC [BKR00]) or a hash function (e.g., HMAC [BCK96]). A different approach, pioneered by Wegman and Carter [WC81] (building on earlier work by Gilbert et al. [GMS74]), first treats the message M with an almost xor-universal (AXU) hash functionFootnote 1 H (i.e., a fast, combinatorial primitive rather than a slow, cryptographic one) and masks the result with a one-time pad, resulting in information-theoretically secure authentication. Since sharing a one-time pad for each message to authenticate is not very practical, one can instead use a pseudorandom function F, as first proposed by Brassard [Bra82], allowing the sender and the receiver to share a short secret K rather than a long list of one-time pads. The mask for each new message is then generated pseudorandomly by applying \(F_K\) to a nonce N, a value used at most once. This reintroduces a cryptographic primitive (and hence a computational assumption), but only for treating a small nonce rather than a potentially long message. The resulting nonce-based MAC, that we simply call the Wegman-Carter (WC) construction, is

$$ \mathsf {WC}[F,H]_{K,K_h}(N,M)=F_K(N)\oplus H_{K_h}(M), $$

where K is the key for the pseudorandom function F, \(K_h\) is the key for the AXU hash function H, N is the nonce, and M is the message.Footnote 2

The WC construction enjoys a very strong provable security bound when nonces are never reused. Assuming that F is perfect (i.e., \(F_K\) is a uniformly random function), any adversary seeing at most \(q_m\) honestly generated tags and making at most \(q_v\) verification queries (i.e., forgery attempts) succeeds with probability at most \(\varepsilon q_v\), where \(\varepsilon \) is the maximal differential probability of H, namely

$$ \varepsilon = \max _{X\ne X',Y} \Pr \left[ H_{K_h}(X)\oplus H_{K_h}(X')=Y \right] , $$

the probabilities being taken over the random draw of the hashing key \(K_h\). When F is not perfect, there is an additional term accounting for its insecurity as a PRF (more precisely, this corresponds to the best advantage an adversary can achieve in distinguishing \(F_K\) from a uniformly random function within \(q_m+q_v\) queries).

Many AXU hash functions have been proposed for instantiating this construction, most of them based on polynomial hashing [Kra94, Rog95, Sho96, HK97, BHK+99, Ber00, KR00, KVW04, MV04, Ber05c]. See [Ber07] for more references and a comprehensive survey of polynomial hashing. Universal hash functions can also be constructed from a block cipher (e.g. by using the CBC mode with prefix-free encoding [BR05, BPR05]), but in that case the provable maximal differential probability depends on the PRP-security of the block cipher (hence, this yields “computational” rather than “statistical” universal hash functions).

Nonce-Misuse Resistance. Despite the advantages just mentioned (efficiency and excellent security bound), the WC construction has one major shortcoming: it is very vulnerable to nonce-misuse. If a nonce is repeated even a single time, consequences can be catastrophic [Jou06, HP08]. For example, in the case of polynomial universal hashing, this can lead to a complete recovery of the hashing key, which allows universal forgeries. To remedy this nonce-misuse problem, the simplest option, which has been known for long, is to apply the PRF to the output of the hash function. For instance, if the PRF takes 2n-bit inputs, one can define the tag as \(F_K(N\Vert H_{K_h}(M))\); this construction was analyzed by Black et al. [BHK+99, BC09]. If F takes only n-bit inputs, one can instead apply the PRF with an independent key to the output of the WC construction, thereby defining the tag as

$$\begin{aligned} F_{K'}\big (F_K(N)\oplus H_{K_h}(M)\big ). \end{aligned}$$
(1)

If one gets rid of the nonce, simply defining the tag as \(F_K(H_{K_h}(M))\), one obtains a stateless MAC but the security bound includes an extra “birthday-type” term \(\varepsilon q_m^2\).

Beyond-Birthday-Bound Security. There is another obstacle which can prevent concrete implementations from enjoying the strong security bound promised by the WC construction: pseudorandom functions are not always readily available, and it is common to use a pseudorandom permutation instead, or in other words to replace F with a block cipher E. However, as first pointed out by Shoup [Sho96], this causes the proven security bound to drop to the so-called birthday bound. Indeed, a random permutation can be distinguished from a random function within q queries with advantage roughly \(q^2/2^n\). For resource-constrained environments, where lightweight cryptographic primitives based on block ciphers with 64-bit blocks are likely to be implemented, this means that security insurance is lost after \(2^{32}\) queries, which is often unacceptable, especially when refreshing keys regularly is excluded.

A first solution to overcome the birthday bound while using only a block cipher is to use a randomized construction. However, existing schemes either require very strong properties from the block cipher such as the ideal cipher model [JJV02] or resistance to related-key attacks [JL04], or require a relatively large amount of randomness (at least 3n bits for the MACRX construction of [BGK99]). The beyond-birthday-bound secure construction named MAC-R2 of Minematsu [Min10] uses a random n-bit IV per message and bears resemblance to the construction proposed in this paper, but it requires four calls to the underlying block cipher. (Jumping ahead, our new construction requires only two calls.) Moreover, reliable randomness might not always be available in some environments, and it might sometimes be easier to maintain a state.

Another option is to implement \(F_K\) in construction (1) from a block cipher E using a so-called PRP-to-PRF conversion method [BKR98, HWKS98] with beyond-birthday-bound security. (On the other hand, it is easy to see that the outer PRF \(F_{K'}\) can be directly implemented by a block cipher without security loss.) Perhaps the simplest such method is the “xor” construction \(E_{K_1}(N)\oplus E_{K_2}(N)\), or its close single-key variant \(E_K(N\Vert 0)\oplus E_K(N\Vert 1)\), which have been analyzed in a number of papers [BI99, Luc00, Pat08a, Pat13, CLP14]. However, all known methods require at least two block cipher calls; taking into account the outer encryption layer, this amounts to three block cipher calls for the whole construction. Is it possible to do better?

Our Contribution. We propose a new nonce-based MAC based on a AXU hash function and a block cipher with the following properties:

  • (i) it is simple and efficient, requiring only two calls to the underlying block cipher, one of which can be carried out in parallel to the hash function computation;

  • (ii) it provably provides security beyond the birthday bound when nonces are never reused;

  • (iii) it provably retains security up to the birthday bound in case of nonce misuse.

Property (ii) ensures that the scheme is highly secure in the nominal use case where nonces are never repeated, while property (iii) acts as a “safety net” if anything goes wrong with nonces.

Our starting point is what we call the Encrypted Wegman-Carter construction, which is simply construction (1) where the outer PRF layer is replaced by a block cipher, viz.

$$\begin{aligned} E_{K'}\big (F_K(N)\oplus H_{K_h}(M)\big ). \end{aligned}$$
(2)

As already briefly explained, this construction enjoys the same security bound as the (unencrypted) WC construction when nonces are never repeated, and is moreover nonce-misuse resistant up to the birthday bound. Replacing \(F_K\) by a simple block cipher call causes the security bound to drop to the birthday bound even when nonces are not repeated, while using a PRP-to-PRF conversion method with security beyond the birthday bound results in at least three block cipher calls in total for the resulting construction.

Our main observation is that one can overcome the birthday bound in the nonce-respecting scenario by instantiating \(F_K\) using “only” the Davies-Meyer (DM) construction. The DM construction is the easiest way to turn a block cipher into a keyed function.Footnote 3 Given a block cipher \(E:\mathcal {K}\times \{0,1\}^n\rightarrow \{0,1\}^n\), the DM construction based on E is simply

$$ \mathsf {DM}[E]_K(N)=E_K(N)\oplus N. $$

Note that this PRF construction is not secure beyond the birthday bound: given black-box access to a function \(f:\{0,1\}^n\rightarrow \{0,1\}^n\), a distinguisher can simply query \(f(N_i)\) for roughly \(2^{n/2}\) distinct values \(N_i\) and look for collisions in values \(f(N_i)\oplus ~N_i\). When f is a uniformly random function this will happen with good probability, whereas when \(f=\mathsf {DM}[E]_K\) this cannot happen. However, this attack is not possible anymore if one encrypts the output of the DM construction.

Using the DM construction to instantiate \(F_K\) in construction (2) results in a MAC construction based only on E and H, which we call Encrypted Wegman-Carter with Davies-Meyer (EWCDM) construction, depicted on Fig. 1 and defined as

$$\begin{aligned} E_{K'}\big ( E_K(N)\oplus N \oplus H_{K_h}(M)\big ). \end{aligned}$$
(3)

Our main result is that the EWCDM construction is secure up to roughly \(2^{2n/3}\) MAC queries and \(2^n\) verification queries against nonce-respecting adversaries (while against nonce-misusing adversaries it still enjoys birthday-bound security) (Table 1). We stress that this does not hold for the (unencrypted) Wegman-Carter construction with Davies-Meyer: if tags are computed as

$$ T= E_K(N)\oplus N \oplus H_{K_h}(M), $$

then the resulting MAC scheme is only provably secure up to the birthday bound against nonce-respecting adversaries.Footnote 4 Hence, the outer encryption layer \(E_{K'}\) turns out to be twice useful: for providing nonce-misuse resistance on one hand, and for cheaply enhancing security against nonce-respecting adversaries beyond the birthday bound on the other hand.

We believe that our new construction would be an elementary and easy-to-implement way to enhance the security of widely deployed authentication or authenticated encryption schemes such as Poly1305-AES [Ber05c] or GCM [MV04] (in particular, note that this can be done in a black-box way on top of an existing implementation of those schemes). The main cost would be some additional latency due to the extra block cipher call, but depending on the context this might be tolerable.

Fig. 1.
figure 1

The “Encrypted Wegman-Carter with Davies-Meyer” construction.

Proof Technique. At the heart of construction (3) is a novel PRP-to-PRF conversion method: namely, if we make abstraction for a moment of the hash of the message M, and if we simply denote P and \(P'\) in place of \(E_K\) and \(E_{K'}\), we obtain a function of the nonce defined as

$$ F(N)=P'(P(N)\oplus N). $$

For obvious reasons, we call this the Encrypted Davies-Meyer (EDM) construction. The main part of the proof consists in proving that this is a secure PRF up to \(2^{2n/3}\) adversarial queries. (We prove this as a standalone result in the full version of the paper [CS16]; this constitutes a good warm-up for the reader before the more complicated security proof of the EWCDM construction in Sect. 4.) However, since the hash of the message is “intermingled” within the EDM construction, it does not seem possible to first prove that the outputs of the MAC oracle are indistinguishable from random, and then handle verification queries (as is usually done for proving the security of the standard Wegman-Carter construction; see Theorem 1 in Sect. 3.1). Note that one cannot hope either to prove security beyond the birthday bound by a sequence of games that would start by replacing the DM construction \(E_K(N)\oplus N\) by a uniformly random function.

Hence, it seems that any proof aiming at security beyond the birthday bound must handle MAC queries and verification queries both at the same time. For this, we employ the H-coefficients technique, which has been introduced by Patarin [Pat90, Pat91, Pat08b] and which recently regained attention since Chen and Steinberger used it to analyze the iterated Even-Mansour cipher [CS14]. This technique gives a kind of “systematic” way to upper bound the statistical distance between the answers of two interactive systems and is typically used to prove (information-theoretic) pseudorandomness of constructions such as Feistel networks. To the best of our knowledge, this is the first time the H-coefficients technique is used for proving the security of a MAC (i.e., unpredictability rather than pseudorandomness).

More Related Work. This paper focuses on nonce-based (hence stateful) MACs, but there is also an important line of work aiming at constructing stateless and deterministic MACs secure beyond the birthday bound. However, existing constructions [Yas10, Yas11, DS11, ZWSW12] are far more complex than the one presented in this paper. We mainly mentioned works related to provable security; there is also a large number of papers (motivated by the analysis of the widely deployed GCM mode [MV04]) investigating attacks against polynomial hash-based MACs [Fer05, HP08, Saa12, PC15, ABBT15].

Open Problems. We prove the security of the EWCDM construction in the nonce-respecting scenario up to \(2^{2n/3}\) MAC queries, but we conjecture that security actually holds up to close to \(2^n\) queries (a similar conjecture holds for the Encrypted Davies-Meyer construction).

The EWCDM construction uses two distinct keys for the two calls to the block cipher; a natural question is whether security beyond the birthday bound also holds when the same key is used. We believe this to be true, but likely cumbersome to prove. The corresponding question regarding the Encrypted Davies-Meyer construction is even more intriguing: How many queries are required to distinguish \(P(x\oplus P(x))\) from a random function? It might well be that this construction is secure up to close to \(2^n\) queries, which would yield the first optimally secure PRP-to-PRF conversion method which uses a single permutation (unlike \(P_1(x)\oplus P_2(x)\)) and does not shrink the domain (unlike \(P(x\Vert 0)\oplus P(x\Vert 1)\)).

Finally, it would be interesting to investigate how the security of EWCDM is affected by tag truncation. We believe that the only change to be made to the bound of Theorem 3 is to replace the term \(6q_v/2^n\) by a term \(O(q_v/2^\ell )\), where \(\ell \) is the length of the truncated tag, but this remains to be proven.

Organization. We first establish the notation and recall standard security definitions in Sect. 2. In Sect. 3, we recall the previous security results on the Wegman-Carter and the Encrypted Wegman-Carter constructions, and describe our new EWCDM construction. We then prove the security of EWCDM in the nonce-respecting scenario in Sect. 4 and in the nonce-misusing scenario in Sect. 5. We also analyze the Encrypted Davies-Meyer PRP-to-PRF conversion method in the full version of the paper [CS16].

Table 1. Proven security bounds (omitting constants and the term accounting for the PRP-security of the underlying block cipher) for the Wegman-Carter construction \(\mathsf {WC}[E,H]\), the Encrypted Wegman-Carter construction \(\mathsf {EWC}[E,H]\), and the new Encrypted Wegman-Carter with Davies-Meyer construction \(\mathsf {EWCDM}[E,H]\).

2 Preliminaries

Basic Notation. Given a non-empty set \(\mathcal {X}\), we denote \(X\leftarrow _{\$}\mathcal {X}\) the draw of an element X from \(\mathcal {X}\) uniformly at random. The set of all functions from \(\mathcal {X}\) to \(\mathcal {Y}\) is denoted \(\mathsf {Func}(\mathcal {X},\mathcal {Y})\), and the set of all permutations of \(\mathcal {X}\) is denoted \(\mathsf {Perm}(\mathcal {X})\). The set of binary strings of length n is denoted \(\{0,1\}^n\). The set of all functions from \(\{0,1\}^n\) to \(\{0,1\}^n\) is simply denoted \(\mathsf {Func}(n)\), and the set of all permutations of \(\{0,1\}^n\) is simply denoted \(\mathsf {Perm}(n)\). For integers \(1\le b\le a\), we will write \((a)_b=a(a-1)\cdots (a-b+1)\) and \((a)_0=1\) by convention. Note that the probability that a random permutation \(P\leftarrow _{\$}\mathsf {Perm}(n)\) satisfies q equations \(P(X_i)=Y_i\) for distinct \(X_i\)’s and distinct \(Y_i\)’s is exactly \(1/(2^n)_q\).

PRFs and Block Ciphers. A keyed function with key space \(\mathcal {K}\), domain \(\mathcal {X}\), and range \(\mathcal {Y}\) is a function \(F:\mathcal {K}\times \mathcal {X}\rightarrow \mathcal {Y}\). We denote \(F_K(X)\) for F(KX). A (qt)-adversary against F is an algorithm \(\mathsf {A}\) with oracle access to a function from \(\mathcal {X}\) to \(\mathcal {Y}\), making at most q oracle queries, running in time at most t, and outputting a single bit. The advantage of \(\mathsf {A}\) in breaking the PRF-security of F is defined as

$$ \mathbf{Adv }^{\mathrm {PRF}}_F(\mathsf {A})=\left| \Pr \left[ K\leftarrow _{\$}\mathcal {K}: \mathsf {A}^{F_K}=1 \right] -\Pr \left[ R\leftarrow _{\$}\mathsf {Func}(\mathcal {X},\mathcal {Y}):\mathsf {A}^{R}=1 \right] \right| . $$

A block cipher with key space \(\mathcal {K}\) and domain \(\mathcal {X}\) is a mapping \(E:\mathcal {K}\times \mathcal {X}\rightarrow \mathcal {X}\) such that for any key \(K\in \mathcal {K}\), \(X\mapsto E(K,X)\) is a permutation of \(\mathcal {X}\). We denote \(E_K(X)\) for E(KX). A (qt)-adversary against E is an algorithm \(\mathsf {A}\) with oracle access to a permutation of \(\mathcal {X}\), making at most q oracle queries, running in time at most t, and outputting a single bit. The advantage of \(\mathsf {A}\) in breaking the PRP-security of E is defined as

$$ \mathbf{Adv }^{\mathrm {PRP}}_E(\mathsf {A})=\left| \Pr \left[ K\leftarrow _{\$}\mathcal {K}: \mathsf {A}^{E_K}=1 \right] -\Pr \left[ P\leftarrow _{\$}\mathsf {Perm}(\mathcal {X}):\mathsf {A}^{P}=1 \right] \right| . $$

Note that we do not need the strongest “two-sided” version of PRP-security (where the adversary also has access to a decryption oracle) since all constructions considered in this paper only use the forward (encryption) direction of the underlying block cipher.

MACs. Given four non-empty sets \(\mathcal {K}\), \(\mathcal {N}\), \(\mathcal {M}\), and \(\mathcal {T}\), a nonce-based keyed function with key space \(\mathcal {K}\), nonce space \(\mathcal {N}\), message space \(\mathcal {M}\) and range \(\mathcal {T}\) is simply a function \(F:\mathcal {K}\times \mathcal {N}\times \mathcal {M}\rightarrow \mathcal {T}\). Stated otherwise, it is a keyed function whose domain is a cartesian product \(\mathcal {N}\times \mathcal {M}\). We denote \(F_K(N,M)\) for F(KNM).

Definition 1

(Nonce-Based MAC). Let \(\mathcal {K}\), \(\mathcal {N}\), \(\mathcal {M}\), and \(\mathcal {T}\) be non-empty sets. Let \(F:\mathcal {K}\times \mathcal {N}\times \mathcal {M}\rightarrow \mathcal {T}\) be a nonce-based keyed function. For \(K\in \mathcal {K}\), let \(\mathsf {Ver}_K\) be the verification oracle which takes as input a triple \((N,M,T)\in \mathcal {N}\times \mathcal {M}\times \mathcal {T}\) and returns 1 (“accept”) if \(F_K(N,M)=T\), and 0 (“reject”) otherwise. A \((q_m,q_v,t)\)-adversary against the MAC-security of F is an adversary \(\mathsf {A}\) with oracle access to the two oracles \(F_K\) and \(\mathsf {Ver}_K\) for \(K\in \mathcal {K}\), making at most \(q_m\) “MAC” queries to its first oracle and at most \(q_v\) “verification” queries to its second oracle, and running in time at most t. We say that \(\mathsf {A}\) forges if any of its queries to \(\mathsf {Ver}_K\) returns 1. The advantage of \(\mathsf {A}\) against the MAC-security of F is defined as

$$ \mathbf{Adv }^{\mathrm {MAC}}_F(\mathsf {A})= \Pr \left[ K\leftarrow _{\$}\mathcal {K}:\mathsf {A}^{F_K,\mathsf {Ver}_K} \text { forges} \right] , $$

where the probability is also taken over the random coins of \(\mathsf {A}\), if any. The adversary is not allowed to ask a verification query (NMT) if a previous query (NM) to \(F_K\) returned T. The adversary is said nonce-respecting if it never repeats a nonce \(N\in \mathcal {N}\) in its queries to the first oracle \(F_K\).

We say that an adversary is nonce-misusing if it does not abide to the rule of non-repeating nonces. The MAC-security of F in face of nonce-misusing adversaries is defined exactly as above, and can be rephrased as the standard (i.e., not nonce-based) MAC-security of a keyed function with domain \(\mathcal {N}\times \mathcal {M}\).

AXU Hash Functions. We will need the following definition of an almost xor-universal (AXU) hash function.

Definition 2

( \(\varepsilon \) -AXU Hash Function). Let \(\mathcal {K}_h\), \(\mathcal {X}\) and \(\mathcal {Y}\) be three non-empty sets and \(\varepsilon >0\). A keyed function \(H:\mathcal {K}_h\times \mathcal {X}\rightarrow \mathcal {Y}\) is said to be \(\varepsilon \)-AXU if for any distinct \(X,X'\in \mathcal {X}\) and any \(Y\in \mathcal {Y}\),

$$ \Pr \left[ K_h\leftarrow _{\$}\mathcal {K}_h:H_{K_h}(X)\oplus H_{K_h}(X')=Y \right] \le \varepsilon . $$

3 Wegman-Carter MAC Constructions

3.1 The Standard Wegman-Carter Construction

We recall the standard Wegman-Carter construction [WC81] of a nonce-based MAC from an \(\varepsilon \)-AXU hash function and a PRF. Let \(\mathcal {K}\), \(\mathcal {K}_h\), and \(\mathcal {M}\) be non-empty sets. Let \(F:\mathcal {K}\times \{0,1\}^n\rightarrow \{0,1\}^n\) be a keyed function and \(H:\mathcal {K}_h\times \mathcal {M}\rightarrow \{0,1\}^n\) be an \(\varepsilon \)-AXU hash function. The Wegman-Carter construction based on F and H is the nonce-based keyed function with key space \(\mathcal {K}\times \mathcal {K}_h\), nonce space \(\{0,1\}^n\), message space \(\mathcal {M}\), and range \(\{0,1\}^n\) defined by

$$ \mathsf {WC}[F,H]_{K,K_h}(N,M)=F_K(N)\oplus H_{K_h}(M). $$

We recall the classical security result for this construction [WC81] and sketch the proof for completeness. Here and in all the following, \(t_H\) is an upper bound on the time needed to compute \(H_{K_h}(M)\) for any key \(K_h\in \mathcal {K}_h\) and any message \(M\in \mathcal {M}\).

Theorem 1

Let F and H be as above. Then for any \((q_m,q_v,t)\)-nonce-respecting adversary \(\mathsf {A}\) against the MAC-security of \(\mathsf {WC}[F,H]\), there exists a \((q_m+q_v,t')\)-adversary \(\mathsf {A}'\) against the PRF-security of F, where \(t'=O(t+(q_m+q_v)t_H)\), such that

$$ \mathbf{Adv }^{\mathrm {MAC}}_{\mathsf {WC}[F,H]}(\mathsf {A}) \le \mathbf{Adv }^{\mathrm {PRF}}_F(\mathsf {A}')+\varepsilon q_v. $$

Proof

Fix a \((q_m,q_v,t)\)-nonce-respecting adversary \(\mathsf {A}\). Consider the WC construction where \(F_K\) is replaced by a uniformly random function R, and let \(\delta \) be the advantage of \(\mathsf {A}\) against this new construction. By a straightforward hybrid argument, there is an adversary \(\mathsf {A}'\), making at most \(q_m+q_v\) oracle queries, and running in time \(O(t+(q_m+q_v)t_H)\), such that

$$ \mathbf{Adv }^{\mathrm {MAC}}_{\mathsf {WC}[F,H]}(\mathsf {A}) \le \mathbf{Adv }^{\mathrm {PRF}}_F(\mathsf {A}')+\delta . $$

The answers \(R(N)\oplus H_{K_h}(M)\) of the MAC oracle are now uniformly random and independent from \(K_h\). Consider the i-th verification query \((N',M',T')\) of the adversary. If \(N'\) never appeared in the MAC queries of the adversary, then \(T'\) is valid with probability \(2^{-n}\). If \(N'=N\) for some previous MAC query (NM) that returned T, then the verification query is valid iff

$$ R(N')\oplus H_{K_h}(M')=T' \Leftrightarrow H_{K_h}(M)\oplus H_{K_h}(M')=T\oplus T', $$

which happens with probability at most \(\varepsilon \) by definition of an \(\varepsilon \)-AXU hash function. (If \(M=M'\), then one must have \(T\ne T'\) by definition of the security experiment, and the forgery cannot be valid.) Since for an \(\varepsilon \)-AXU hash function with range \(\{0,1\}^n\) one has \(\varepsilon \ge 2^{-n}\), in all cases the forgery is valid with probability at most \(\varepsilon \). By a union bound over the \(q_v\) verification queries, one has \(\delta \le \varepsilon q_v\), which concludes the proof.    \(\square \)

Assume now that F is a family of permutations of \(\{0,1\}^n\), or in other words, a block cipher, that we denote E. Then E can be distinguished from a random function with q queries and advantage roughly \(q^2/2^n\) by simply looking for collisions in its outputs. In other words, by the PRP-PRF switching Lemma [BR06], the best upper bound one can hope to prove for the PRF-advantage of adversary \(\mathsf {A}'\) appearing in Theorem 1, assuming that E is a secure PRP, is

$$ \mathbf{Adv }^{\mathrm {PRF}}_E(\mathsf {A}')\le \mathbf{Adv }^{\mathrm {PRP}}_E(\mathsf {A}')+\frac{(q_m+q_v)^2}{2^{n+1}}, $$

so that the security bound for the resulting construction \(\mathsf {WC}[E,H]\) now has a birthday-type term. Bernstein [Ber05a, Ber05b] proved a better (but still of birthday-type) bound: as long as \(q_m\le 2^{n/2}\), the adversary can forge with probability at most \(C\varepsilon q_v\), for some small constant C (in all practical cases, \(C\le 2\)). Note that the distinguishing attack against E does not seem to translate into a forgery attack against the MAC scheme, and it might be possible to improve the security bound under additional assumptions on H and E.

3.2 Nonce-Misuse Resistance and the Encrypted Wegman-Carter Construction

In general, the standard Wegman-Carter construction of the previous section does not offer any security against nonce-misusing adversaries. Consider for example the case where H is a polynomial-based hash function. Then any adversary who gets two tags T and \(T'\) for two different messages M and \(M'\) generated with the same nonce knows that \(H_{K_h}(M)\oplus H_{K_h}(M')\oplus T\oplus T'=0\). The left hand side is a polynomial in \(K_h\) whose coefficients depend on M, \(M'\), T and \(T'\), and \(K_h\) is a root of this polynomial. Even though its degree can be quite high, this is often enough to mount devastating attacks. This weakness was one of the main criticism against the GCM authenticated encryption mode [MV04], whose authentication relies on the standard Wegman-Carter construction [Jou06].

The classical way to remedy this situation and achieve nonce-misuse resistance for Wegman-Carter MACs is to apply an extra PRF layer to the output of the construction. When this additional layer is a block cipher, one obtains what we call the Encrypted Wegman-Carter (EWC) construction. Let \(F:\mathcal {K}\times \{0,1\}^n\rightarrow \{0,1\}^n\) be a keyed function, \(E:\mathcal {K}'\times \{0,1\}^n\rightarrow \{0,1\}^n\) be a block cipher, and \(H:\mathcal {K}_h\times \mathcal {M}\rightarrow \{0,1\}^n\) be an \(\varepsilon \)-AXU hash function. Then the EWC construction based on F, E, and H has key space \(\mathcal {K}\times \mathcal {K}'\times \mathcal {K}_h\), nonce space \(\{0,1\}^n\), message space \(\mathcal {M}\), and range \(\{0,1\}^n\), and is defined by

$$\begin{aligned} \mathsf {EWC}[F,E,H]_{K,K',K_h}(N,M)&=E_{K'}\big (\mathsf {WC}[F,H]_{K,K_h}(N,M)\big )\\&=E_{K'}\big (F_K(N)\oplus H_{K_h}(M)\big ). \end{aligned}$$

One can straightforwardly verify that the security of this construction against nonce-respecting adversaries does not depend on E and that the upper bound of Theorem 1 still holds. For nonce-misusing adversaries, one has the following (the proof is omitted since it is exactly the same, mutatis mutandis, as the proof of Theorem 4 of Sect. 5).

Theorem 2

Let F, E and H be as above. Then for any \((q_m,q_v,t)\)-nonce-misusing adversary \(\mathsf {A}\) against the MAC-security of \(\mathsf {EWC}[F,E,H]\), there exists a \((q_m+q_v,t')\)-adversary \(\mathsf {A}'\) against the PRF-security of F and a \((q_m+q_v,t'')\)-adversary \(\mathsf {A}''\) against the PRP-security of E, where \(t',t''=O(t+(q_m+q_v)t_H)\), such that

$$ \mathbf{Adv }^{\mathrm {MAC}}_{\mathsf {EWC}[F,E,H]}(\mathsf {A}) \le \mathbf{Adv }^{\mathrm {PRF}}_F(\mathsf {A}')+ \mathbf{Adv }^{\mathrm {PRP}}_E(\mathsf {A}'')+\frac{2(q_m+q_v)^2}{2^n}+\frac{(q_m+q_v)^2\varepsilon }{2}. $$

It is tempting to implement F from E. The simplest way to do so is simply to let \(F=E\), thereby obtaining the construction (overloading notation \(\mathsf {EWC}[\cdot ]\))

$$ \mathsf {EWC}[E,H]_{K,K',K_h}(N,M)=E_{K'}\big (E_K(N)\oplus H_{K_h}(M)\big ). $$

However, the resulting MAC suffers from the same birthday-bound type problem against nonce-respecting adversaries as the unencrypted Wegman-Carter MAC \(\mathsf {WC}[E,H]\) of Sect. 3.1. As already mentioned in introduction, it is possible to use a PRP-to-PRF conversion method to obtain security beyond the birthday bound, but using the best known constructions yields a MAC that makes at least three calls to the underlying block cipher. Our goal is to reduce the number of block cipher calls to two, which seems to be the minimum to achieve both security beyond the birthday bound and nonce-misuse resistance.

3.3 The New Construction EWCDM

The main contribution of this paper is to propose a much simpler solution that allows to get beyond the birthday bound, namely using the Davies-Meyer (DM) construction which turns a block cipher \(E:\mathcal {K}\times \{0,1\}^n\rightarrow \{0,1\}^n\) into a keyed function as

$$ \mathsf {DM}[E]_K(N)=E_K(N)\oplus N. $$

Using the DM construction based on E to instantiate F in \(\mathsf {EWC}[F,E,H]\) results in a MAC construction based only on E and H, which we call Encrypted Wegman-Carter with Davies-Meyer (EWCDM) construction and denote \(\mathsf {EWCDM}[E,H]\), illustrated on Fig. 1 and defined as follows:

$$\begin{aligned} \mathsf {EWCDM}[E,H]_{K,K',K_h}(N,M)&\mathrel {\mathop =^\mathrm{def}}\mathsf {EWC}[\mathsf {DM}[E],E,H]_{K,K',K_h}(N,M)\\&=E_{K'}\big ( E_K(N)\oplus N \oplus H_{K_h}(M)\big ). \end{aligned}$$

As already explained in introduction, the DM construction is not PRF-secure beyond the birthday bound. Still, our main result, that we state and prove in the next section, is that the EWCDM construction is secure up to roughly \(2^{2n/3}\) MAC queries and \(2^n\) verification queries against nonce-respecting adversaries (while against nonce-misusing adversaries it still enjoys birthday-bound security).

The security proof entails an analysis of what we call the Encrypted Davies-Meyer (EDM) PRP-to-PRF conversion method, which turns two independent permutations P and \(P'\) of \(\{0,1\}^n\) into a function of \(\{0,1\}^n\) to \(\{0,1\}^n\) defined as

$$ \mathsf {EDM}[P,P'](N)=P'(P(N)\oplus N). $$

By “stripping off” from the security proof of EWCDM all details related to the hash function and verification queries, one can extract a proof that the EDM construction is a secure PRF up to \(2^{2n/3}\) adversarial queries. We do so in the full version of the paper [CS16], and the reader might want to read this simpler proof before proceeding to Sect. 4. However, as already explained in introduction, it does not seem possible to prove the MAC-security of the EWCDM construction in a modular way from the PRF-security of the EDM construction.

Finally, note that adding the hash of the message to the output of the EDM construction (rather than “in the middle”) would result in a construction secure up to \(2^{2n/3}\) queries against nonce-respecting adversaries, but insecure against nonce-misusing ones since it is just an instantiation of the standard WC construction of Sect. 3.1 (with the EDM construction as PRF).

4 Nonce-Respecting Security of EWCDM

4.1 Statement of the Result and Overview of the Proof

In all the following, we simply denote \(\varPi [E,H]\) the EWCDM construction based on block cipher E and AXU hash function H. Our main security result is as follows.

Theorem 3

Let \(\mathcal {M}\), \(\mathcal {K}\) and \(\mathcal {K}_h\) be non-empty sets. Let \(E:\mathcal {K}\times \{0,1\}^n\rightarrow \{0,1\}^n\) be a block cipher and \(H:\mathcal {K}_h\times \mathcal {M}\rightarrow \{0,1\}^n\) be an \(\varepsilon \)-AXU hash function. Then for any \((q_m,q_v,t)\)-nonce-respecting adversary \(\mathsf {A}\) against the MAC-security of \(\varPi [E,H]\) with \(q_m^{3/2}\le 2^n/4\) and \(q_v\le 2^n/4\), there exists a \((q_m+q_v,t')\)-adversary \(\mathsf {A}'\) against the PRP-security of E, where \(t'=O(t+(q_m+q_v)t_H)\), such that

$$ \mathbf{Adv }^{\mathrm {MAC}}_{\varPi [E,H]}(\mathsf {A})\le 2 \mathbf{Adv }^{\mathrm {PRP}}_E(\mathsf {A}')+\frac{5q_m^{3/2}}{2^n}+\frac{\varepsilon q_m}{2}+\frac{6q_v}{2^n}+\varepsilon q_v. $$

Hence, assuming \(\varepsilon \simeq 2^{-n}\), the EWCDM construction is secure up to \(q_m\simeq 2^{2n/3}\) MAC queries and \(q_v\simeq 2^n\) verification queries.

In the remaining of the section, we prove Theorem 3. We fix a \((q_m,q_v,t)\)-nonce-respecting adversary \(\mathsf {A}\) against the MAC-security of \(\varPi [E,H]\) and we let

$$ \delta = \mathbf{Adv }^{\mathrm {MAC}}_{\varPi [E,H]}(\mathsf {A}). $$

As specified in Definition 1, adversary \(\mathsf {A}\) has access to a MAC oracle \(\varPi [E,H]_{K,K',K_h}\) and a verification oracle \(\mathsf {Ver}_{K,K',K_h}\) for a randomly drawn key tuple \((K,K',K_h)\).

The first step of the proof is standard and consists in replacing \(E_K\) and \(E_{K'}\) by two random and independent permutations P and \(P'\), both in the MAC and in the verification oracle (in other words, we replace the block cipher E by the perfect cipher \(E^*\) whose key space is the set of all permutations of \(\{0,1\}^n\)). Let \(\varPi [E^*,H]\) denote the resulting construction. It is easy to show that there exists an adversary against the PRP-security of E, making at most \(q_m+q_v\) oracle queries and runnig in time at most \(O(t+(q_m+q_v)t_H)\), such that

$$\begin{aligned} \delta \le 2 \mathbf{Adv }^{\mathrm {PRP}}_E(\mathsf {A}')+ \mathbf{Adv }^{\mathrm {MAC}}_{\varPi [E^*,H]}(\mathsf {A}). \end{aligned}$$
(4)

(We replace successively \(E_K\) and \(E_{K'}\) by a random permutation, each time constructing an hybrid PRP-adversary, and we consider the best of the two adversaries). Our goal is now to upper bound

$$\begin{aligned} \delta ^*&\mathrel {\mathop =^\mathrm{def}} \mathbf{Adv }^{\mathrm {MAC}}_{\varPi [E^*,H]}(\mathsf {A})\\&=\Pr \left[ (P,P')\leftarrow _{\$}\mathsf {Perm}(n)^2,K_h\leftarrow _{\$}\mathcal {K}_h:\mathsf {A}^{\varPi [P,P',H_{K_h}],\mathsf {Ver}[P,P',H_{K_h}]} \text { forges} \right] , \end{aligned}$$

where, overloading the notation, \(\varPi [P,P',H_{K_h}]\) denotes the construction \(\varPi [E^*,H]\) instantiated with permutations P, \(P'\), and hashing key \(K_h\) and \(\mathsf {Ver}[P,P',H_{K_h}]\) denotes the corresponding verification oracle.

It will be more convenient to express \(\delta ^*\) as a distinguishing advantage. Namely, let \(\mathsf {Rand}\) denote a perfectly random oracle with domain \(\{0,1\}^n\times \mathcal {M}\) and range \(\{0,1\}^n\), and \(\mathsf {Rej}\) be an oracle with inputs in \(\{0,1\}^n\times \mathcal {M}\times \{0,1\}^n\) which always returns 0 (“reject”). Since the adversary cannot forge (i.e., have the right oracle return 1) when interacting with \((\mathsf {Rand},\mathsf {Rej})\), we have

$$ \delta ^*=\Pr \left[ \mathsf {A}^{\varPi [P,P',H_{K_h}],\mathsf {Ver}[P,P',H_{K_h}]} \text { forges} \right] -\Pr \left[ \mathsf {A}^{\mathsf {Rand},\mathsf {Rej}} \text { forges} \right] . $$

Consider now an adversary \(\mathsf {D}\) which queries a pair of oracles \((\mathcal {O}_1,\mathcal {O}_2)\) and outputs a bit \(\beta \), which we denote \(\mathsf {D}^{\mathcal {O}_1,\mathcal {O}_2}=\beta \). (We will refer to such an adversary as a distinguisher.) Say that such an adversary is non-trivial if it never makes a query (NMT) to its right (verification) oracle if a previous query (NM) to its left (MAC) oracle returned T. Then

$$\begin{aligned} \delta ^* \le \max _{\mathsf {D}} \Pr \left[ \mathsf {D}^{\varPi [P,P',H_{K_h}],\mathsf {Ver}[P,P',H_{K_h}]}=1 \right] -\Pr \left[ \mathsf {D}^{\mathsf {Rand},\mathsf {Rej}}=1 \right] , \end{aligned}$$
(5)

where the maximum is taken over non-trivial adversaries. (This follows easily by considering the particular \(\mathsf {D}\) which runs \(\mathsf {A}\) and outputs 1 iff \(\mathsf {A}\) successfully forges.) Hence, we see that \(\delta ^*\) cannot be larger than the advantage of the best non-trivial distinguisher between the two pairs of oracles \((\varPi [P,P',H_{K_h}],\mathsf {Ver}[P,P',H_{K_h}])\) and \((\mathsf {Rand},\mathsf {Rej})\).Footnote 5 This formulation of the problem now allows us to use the H-coefficients technique [Pat08b, CS14], as we explain in more details below.

The H-Coefficients Technique. From now on, we fix a non-trivial distinguisher \(\mathsf {D}\) interacting either with the real world \((\varPi [P,P',H_{K_h}],\mathsf {Ver}[P,P',H_{K_h}])\) for uniformly random permutations \((P,P')\) and a random hashing key \(K_h\), or with the ideal world \((\mathsf {Rand},\mathsf {Rej})\), making at most \(q_m\) queries to its left (MAC) oracle and at most \(q_v\) queries to its right (verification) oracle, and outputting a single bit. We let

$$ \mathbf{Adv }(\mathsf {D})=\Pr \left[ \mathsf {D}^{\varPi [P,P',H_{K_h}],\mathsf {Ver}[P,P',H_{K_h}]}=1 \right] -\Pr \left[ \mathsf {D}^{\mathsf {Rand},\mathsf {Rej}}=1 \right] . $$

We assume that \(\mathsf {D}\) is computationally unbounded (and hence wlog deterministic) and that it never repeats a query. Let

$$ \tau _m=\big ((N_1,M_1,T_1),\ldots ,(N_{q_m},M_{q_m},T_{q_m})\big ) $$

be the list of MAC queries of \(\mathsf {D}\) and corresponding answers. Let also

$$ \tau _v=\big ((N'_1,M'_1,T'_1,b_1),\ldots , (N'_{q_v},M'_{q_v},T'_{q_v},b_{q_v})\big ) $$

be the list of verification queries of \(\mathsf {D}\) and corresponding answers (with \(b_i\in \{0,1\}\)). The pair \((\tau _m,\tau _v)\) constitutes the queries transcript of the attack. For convenience, we slightly modify the security experiment by revealing to the distinguisher (after it made all its queries but before it outputs its decision bit) the hashing key \(K_h\) if we are in the real world, or a uniformly random “dummy” key \(K_h\) if we are in the ideal world (this is obviously wlog since the distinguisher can ignore this additional piece of information). All in all, the transcript of the attack is the triplet \(\tau =(\tau _m,\tau _v,K_h)\). We will often simply name a tuple \((N,M,T)\in \tau _m\) a MAC query, and a tuple \((N',M',T',b) \in \tau _v\) a verification query.

A transcript \(\tau \) is said attainable (with respect to distinguisher \(\mathsf {D}\)) if the probability to obtain this transcript in the ideal world is non-zero. In particular, note that for an attainable transcript \(\tau =(\tau _m,\tau _v,K_h)\), any verification query \((N'_i,M'_i,T'_i,b_i)\in \tau _v\) is such that \(b_i=0\).Footnote 6 We denote \(\varTheta \) the set of attainable transcripts. We also denote \(X_\mathrm{re}\), resp. \(X_\mathrm{id}\), the probability distribution of the transcript \(\tau \) induced by the real world, resp. the ideal world. The main lemma of the H-coefficients technique is the following one (see e.g. [CS14] or [CLL+14] for the proof).

Lemma 1

Fix a distinguisher \(\mathsf {D}\). Let \(\varTheta =\varTheta _\mathrm{good}\sqcup \varTheta _\mathrm{bad}\) be a partition of the set of attainable transcripts. Assume that there exists \(\varepsilon _1\) such that for any \(\tau \in \varTheta _\mathrm{good}\), one hasFootnote 7

$$ \frac{\Pr [X_\mathrm{re}=\tau ]}{\Pr [X_\mathrm{id}=\tau ]}\ge 1-\varepsilon _1, $$

and that there exists \(\varepsilon _2\) such that \(\Pr [X_\mathrm{id}\in \varTheta _\mathrm{bad}]\le \varepsilon _2\). Then \( \mathbf{Adv }(\mathsf {D})\le \varepsilon _1+\varepsilon _2\).

The remaining of the proof of Theorem 3 is structured as follows: in Sect. 4.2, we define bad transcripts and upper bound their probability in the ideal world; in Sect. 4.3, we analyze good transcripts and prove that they are almost as likely in the real and the ideal world. Theorem 3 follows easily by combining Eqs. (4) and (5) above, Lemmas 12 and 3 proven below.

4.2 Definition and Probability of Bad Transcripts

We start by defining bad transcripts. We say that a MAC query \((N_i,M_i,T_i)\in \tau _m\) is collisioning if there exists another MAC query \((N_j,M_j,T_j)\in \tau _m\) with \(j\ne i\) such \(T_i=T_j\), otherwise we say it is non-collisioning.

Definition 3

We say that an attainable transcript \(\tau =(\tau _m,\tau _v,K_h)\) is bad if one of the following conditions is met:

  • (i) the number of collisioning MAC queries in \(\tau _m\) is more than \(\sqrt{q_m}\);

  • (ii) there exists two distinct MAC queries \((N_i,M_i,T_i)\) and \((N_j,M_j,T_j)\) in \(\tau _m\) such that

    $$ \left\{ \begin{array}{l} T_i=T_j\\ N_i\oplus H_{K_h}(M_i)=N_j\oplus H_{K_h}(M_j); \end{array} \right. $$
  • (iii) there exists a MAC query \((N_i,M_i,T_i)\in \tau _m\) and a verification query \((N'_j,M'_j,T'_j,b_j)\in \tau _v\) such that

    $$ \left\{ \begin{array}{l} N_i=N'_j\\ T_i=T'_j\\ H_{K_h}(M_i)=H_{K_h}(M'_j). \end{array} \right. $$

We denote \(\varTheta _\mathrm{bad}\), resp. \(\varTheta _\mathrm{good}\) the set of bad, respectively good transcripts.

We quickly comment on these three conditions. Condition (i) captures the case where there are too many tag collisions and will be needed when lower bounding the probability of getting a good transcript in the real world. Condition (ii) can only happen in the ideal world and hence allows to trivially distinguish; in the real world, if \(N_i\oplus H_{K_h}(M_i)=N_j\oplus H_{K_h}(M_j)\), then, since \(N_i\ne N_j\) because the adversary is assumed nonce-respecting, one necessarily has

$$ P(N_i)\oplus N_i\oplus H_{K_h}(M_i)\ne P(N_j)\oplus N_j\oplus H_{K_h}(M_j) $$

which implies \(T_i\ne T_j\) by applying \(P'\) to both sides of the inequality. Similarly, condition (iii) can only happen in the ideal world since in the real world, if \(N_i=N'_j\), \(T_i=T'_j\), and \(H_{K_h}(M_i)=H_{K_h}(M'_j)\), one should have \(b_j=1\) (while \(b_j=0\) in the ideal world).

We now upper bound the probability to get a bad transcript in the ideal world.

Lemma 2

For any integers \(q_m\) and \(q_v\), one has

$$ \Pr \left[ X_\mathrm{id}\in \varTheta _\mathrm{bad} \right] \le \frac{q_m^{3/2}}{2^n}+\frac{\varepsilon q_m}{2}+\varepsilon q_v. $$

Proof

We upper bound the probabilities of the three conditions in turn. We denote \(\varTheta _i\) the set of attainable transcript that satisfy the i-th condition. Recall that, in the ideal world, \(K_h\) is drawn independently from the queries transcript.

Conditions (iand (ii). We will deal with conditions (i) and (ii) together, using the fact that

Since the adversary does not make useless queries, its \(\mathrm {MAC}\) queries are distinct. In the ideal world, the values \(T_i\) for \(i\in \{1,\ldots ,q_m\}\) are then simply chosen uniformly and independently at random from \(\{0,1\}^n\). We introduce the random variable

$$ C=\left| \left\{ (i,j)\in \{1,\ldots ,q_m\}^2, i\ne j, T_i=T_j\right\} \right| . $$

The number of collisioning MAC queries is always lower than C. Note that

$$ \mathbb {E}[C]=\sum \limits _{1\le i \le q_m} \sum _{\begin{array}{c} 1\le j\le q_m \\ i\ne j \end{array}} \Pr \left[ T_i=T_j \right] \le \frac{q_m^2}{2^n}. $$

By Markov’s inequality,

$$ \Pr \left[ X_\mathrm{id}\in \varTheta _1 \right] \le \Pr \left[ C \ge \sqrt{q_m} \right] \le \frac{q_m^{3/2}}{2^n}. $$

Assume now that \(X_\mathrm{id}\notin \varTheta _1\), i.e., \(\tau _m\) is such that the number of collisioning MAC queries is lower than \(\sqrt{q_m}\). Recall that \(K_h\) is chosen independently from \(\tau _m\) in the ideal world. Fix any (ij) such that \(i\ne j\) and \(T_i=T_j\). Since the number of collisioning MAC queries is lower than \(\sqrt{q_m}\), there are at most \(q_m/2\) such pairs of queries. Then, since H is \(\varepsilon \)-AXU, one has

$$ \Pr \left[ K_h\leftarrow _{\$}\mathcal {K}_h : N_i\oplus H_{K_h}(M_i)=N_j\oplus H_{K_h}(M_j) \right] \le \varepsilon $$

and, by summing over the at most \(q_m/2\) such pairs of queries, one has

Hence,

$$ \Pr \left[ X_\mathrm{id}\in \varTheta _1\cup \varTheta _2 \right] \le \frac{q_m^{3/2}}{2^n}+\frac{\varepsilon q_m}{2}. $$

Condition ( iii ). We consider any verification query \((N'_j,M'_j,T'_j,b_j)\in \tau _v\) and upper bound the probability that condition (iii) is satisfied for this particular query. Since the adversary is nonce-respecting, there is at most one MAC query \((N_i,M_i,T_i)\) such that \(N_i=N'_j\). We distinguish two cases:

  • If the verification query comes after the MAC query, then since the distinguisher is non-trivial, either \(T_i\ne T'_j\), or \(M_i\ne M'_j\). In the former case, condition (iii) cannot be satisfied, while in the latter case, the probability over the random draw of \(K_h\) that \(H_{K_h}(M_i)\oplus H_{K_h}(M'_j)=0\) is at most \(\varepsilon \).

  • If the MAC query comes after the verification query, then \(T_i\) is random and independent from \(T'_j\) and the probability that \(T_i=T'_j\) is \(2^{-n}\).

Since for an \(\varepsilon \)-AXU hash function with range \(\{0,1\}^n\) one has \(\varepsilon \ge 2^{-n}\), we see that in all cases condition (iii) is met with probability at most \(\varepsilon \). Thus, by summing over every verification query, one has

$$ \Pr \left[ X_\mathrm{id}\in \varTheta _3 \right] \le \varepsilon q_v. $$

The Lemma follows by an union bound over all conditions.    \(\square \)

4.3 Analysis of Good Transcripts

We now analyze good transcripts and prove the following lemma.

Lemma 3

Assume that \(q_m^{3/2}\le 2^n/4\) and \(q_v\le 2^n/4\). Then, for any good transcript \(\tau \), one has

$$ \frac{\Pr \left[ X_\mathrm{re}=\tau \right] }{\Pr \left[ X_\mathrm{id}=\tau \right] } \ge 1-\frac{4q_m^{3/2}}{2^n} - \frac{6q_v}{2^n}. $$

Let \(\tau =(\tau _m,\tau _v,K_h)\) be a good transcript. Since in the ideal world the MAC oracle is perfectly random and the verification always rejects, one simply has

$$\begin{aligned} \Pr [X_\mathrm{id}=\tau ]=\frac{1}{|\mathcal {K}_h|\cdot (2^n)^{q_m}}. \end{aligned}$$
(6)

We must now lower bound the probability of getting \(\tau \) in the real world. We say that a pair of permutations \((P,P')\) is compatible with \(\tau _m\) if

$$ \forall i\in \{1,\ldots ,q_m\},\ \varPi [P,P',H_{K_h}](N_i,M_i)=T_i, $$

and we say that it is compatible with \(\tau _v\) if

$$ \forall i\in \{1,\ldots ,q_v\},\ \varPi [P,P',H_{K_h}](N'_i,M'_i)\ne T'_i. $$

We simply say that \((P,P')\) is compatible with \(\tau \) if it is compatible with \(\tau _m\) and \(\tau _v\). We denote \(\mathsf {Comp}(\tau _m)\), \(\mathsf {Comp}(\tau _v)\), and \(\mathsf {Comp}(\tau )\) the set of pairs of permutations that are compatible with respectively \(\tau _m\), \(\tau _v\), and \(\tau \). Then one can easily check (see for example [CS14] for a detailed explanation) that

$$\begin{aligned} \Pr [X_\mathrm{re}=\tau ]=\frac{1}{|\mathcal {K}_h|}\cdot \Pr \left[ (P,P')\leftarrow _{\$}\mathsf {Perm}(n)^2:(P,P') \in \mathsf {Comp}(\tau ) \right] . \end{aligned}$$
(7)

MAC Queries Transcript. We will first consider the probability that a random pair \((P,P')\) is compatible with the MAC queries transcript \(\tau _m\). To ease the notation, we reorder the transcript as follows. Let r be the number of distinct tags T appearing in MAC queries. Then we rewrite the transcript so that all queries with the same tag are consecutive, so that the MAC queries transcript (that we still denote \(\tau _m\)) is now

$$\begin{aligned} \tau _m=\big (&(N_{1,1},M_{1,1},T_1),\ldots ,(N_{1,q_1},M_{1,q_1},T_1),\\&(N_{2,1},M_{2,1},T_2),\ldots ,(N_{2,q_2},M_{2,q_2},T_2),\\&\ldots ,\\&(N_{r,1},M_{r,1},T_r),\ldots ,(N_{r,q_r},M_{r,q_r},T_r)\big ), \end{aligned}$$

where \(T_1,\ldots ,T_r\) are distinct and \(\sum _{i=1}^r q_i=q_m\).

Our goal is now to lower bound the probability that a random pair of permutations \((P,P')\) satisfies

$$ \forall i\in \{1,\ldots ,r\},\forall j\in \{1,\ldots ,q_i\},\ P'\big (P(N_{i,j})\oplus N_{i,j}\oplus H_{K_h}(M_{i,j})\big )=T_i. $$

For this, we will consider the possible “internal” values \(Z_i=(P')^{-1}(T_i).\) We say that a tuple \(\mathbf {Z}=(Z_1,\ldots ,Z_r)\) of distinct values in \(\{0,1\}^n\) is good if

  1. (a)

    all \(q_m\) values \(Z_i\oplus N_{i,j}\oplus H_{K_h}(M_{i,j})\) for \(i\in \{1,\ldots ,r\}\), \(j\in \{1,\ldots ,q_i\}\) are distinct;

  2. (b)

    for every verification query \((N',M',T',b)\in \tau _v\) such that \(N'=N_{i,j}\) and \(T'=T_k\) for some \(i\in \{1,\ldots ,r\}\), \(j\in \{1,\ldots ,q_i\}\), and \(k\in \{1,\ldots ,r\}\) with \(k\ne i\), one has

    $$ Z_i\oplus H_{K_h}(M_{i,j})\oplus H_{K_h}(M')\ne Z_k. $$

Property (a) is needed since the values \(Z_i\oplus N_{i,j}\oplus H_{K_h}(M_{i,j})\) are the images by P of the (distinct) nonces \(N_{i,j}\). Property (b) will be needed later when lower bounding the probability that \((P,P')\) is compatible with the verification transcript \(\tau _v\).

Given a good tuple \(\mathbf {Z}\), the probability, for a randomly drawn pair \((P,P')\), that

$$\begin{aligned} \left\{ \begin{array}{l} \forall i\in \{1,\ldots ,r\},\forall j\in \{1,\ldots ,q_i\},\ P(N_{i,j})=Z_i\oplus N_{i,j}\oplus H_{K_h}(M_{i,j}),\\ \forall i\in \{1,\ldots ,r\},\ P'(Z_i)=T_i \end{array} \right. \end{aligned}$$
(8)

is exactly

$$\begin{aligned} \frac{1}{(2^n)_{q_m} (2^n)_r}. \end{aligned}$$
(9)

(This is simply the probability that P satisfies \(q_1+\ldots +q_r=q_m\) equations and \(P'\) satisfies r equations.)

It remains to lower bound the number \(N_{\mathbf {Z}}\) of good tuples \(\mathbf {Z}\), which can be done as follows. First, note that by definition of a good transcript, for any \(i\in \{1,\ldots ,r\}\), the values \(Z_i\oplus N_{i,j}\oplus H_{K_h}(M_{i,j})\) for \(1\le j\le q_i\) are distinct since otherwise condition (ii) defining a bad transcript would be fulfilled (without that, good tuples \(\mathbf {Z}\) would not exist). In the following, for \(i,k\in \{1,\ldots ,r\}\) with \(k<i\), we denote \(q'_{i,k}\) the number of verification queries \((N',M',T',b)\in \tau _v\) such that either \(N'=N_{i,j}\) for some \(j\in \{1,\ldots ,q_i\}\) and \(T'=T_k\), or \(N'=N_{k,j}\) for some \(j\in \{1,\ldots ,q_k\}\) and \(T'=T_i\). Note that since a verification query can count for at most one pair (ik), one has

$$\begin{aligned} \sum _{i=2}^r\sum _{k=1}^{i-1} q'_{i,k}\le q_v. \end{aligned}$$
(10)

Then,

  • there are at least \(2^n\) possibilities for \(Z_1\);

  • once \(Z_1\) is fixed, there are at least \(2^n-1-q_2q_1-q'_{2,1}\) possibilities for \(Z_2\) since \(Z_2\) must be different from the following values:

    • \(\bullet \) \(Z_1\),

    • \(\bullet \) \(Z_1 \oplus N_{1,j}\oplus H_{K_h}(M_{1,j})\oplus N_{2,j'}\oplus H_{K_h}(M_{2,j'})\) for all \(j\in \{1,\ldots ,q_1\}\) and all \(j'\in \{1,\ldots , q_2\}\) (in order for property (a) to be fulfilled),

    • \(\bullet \) \(Z_1\oplus H_{K_h}(M_{1,j})\oplus H_{K_h}(M')\) for every verification query \((N',M',T',b)\in \tau _v\) such that \(N'=N_{1,j}\) for some \(j\in \{1,\ldots ,q_1\}\) and \(T'=T_2\), and \(Z_1\oplus H_{K_h}(M_{2,j})\oplus H_{K_h}(M')\) for every verification query \((N',M',T',b)\in \tau _v\) such that \(N'=N_{2,j}\) for some \(j\in \{1,\ldots ,q_2\}\) and \(T'=T_1\), which amounts to at most \(q'_{2,1}\) values (in order for property (b) to be fulfilled);

  • once \(Z_1,\ldots ,Z_i\) are fixed, there are at least \(2^n-i-q_{i+1}\sum _{k=1}^i q_k -\sum _{k=1}^i q'_{i+1,k}\) possibilities for \(Z_{i+1}\) since \(Z_{i+1}\) must be different from the following values:

    • \(\bullet \) \(Z_1,\ldots ,Z_i\),

    • \(\bullet \) \(Z_k \oplus N_{k,j}\oplus H_{K_h}(M_{k,j})\oplus N_{i+1,j'}\oplus H_{K_h}(M_{i+1,j'})\) for all \(k\in \{1,\ldots ,i\}\), all \(j\in \{1,\ldots ,q_k\}\), and all \(j'\in \{1,\ldots , q_{i+1}\}\),

    • \(\bullet \) \(Z_k\oplus H_{K_h}(M_{k,j})\oplus H_{K_h}(M')\) for every verification query \((N',M',T',b)\in \tau _v\) such that \(N'=N_{k,j}\) for some \(k\in \{1,\ldots ,i\}\), \(j\in \{1,\ldots ,q_k\}\) and \(T'=T_{i+1}\), and \(Z_k\oplus H_{K_h}(M_{i+1,j})\oplus H_{K_h}(M')\) for every verification query \((N',M',T',b)\in \tau _v\) such that \(N'=N_{i+1,j}\) for some \(j\in \{1,\ldots ,q_{i+1}\}\) and \(T'=T_k\) for some \(k\in \{1,\ldots ,i\}\), which amounts to at most \(\sum _{k=1}^i q'_{i+1,k}\) values.

Hence, the number of good tuples \(\mathbf {Z}=(Z_1,\ldots ,Z_r)\) is at least

$$\begin{aligned} N_{\mathbf {Z}} \ge \prod _{i=0}^{r-1}\left( 2^n-i-q_{i+1}\sum _{k=1}^i q_k-\sum _{k=1}^i q'_{i+1,k}\right) . \end{aligned}$$
(11)

Verification Queries Transcript. From now on, we fix a good tuple \(\mathbf {Z}\). We will now lower bound the probability that a random pair \((P,P')\) is compatible with the verification transcript \(\tau _v\), conditioned on \((P,P')\) satisfying the set of Eq. (8). (Recall that P is then fixed on \(q_m\) values and \(P'\) is fixed on r values.) For this, it will be easier to upper bound the probability that \((P,P')\) is not compatible with \(\tau _v\), i.e., that there exists \((N',M',T',b)\in \tau _v\) such that

$$\begin{aligned} P'\big (P(N')\oplus N' \oplus H_{K_h}(M')\big )=T'. \end{aligned}$$
(12)

Fix any verification query \((N',M',T',b)\in \tau _v\). We say that it is nonce-fresh, resp. tag-fresh, if \(N'\), resp. \(T'\) does not appear in the MAC queries transcript \(\tau _m\).Footnote 8 We consider four possible cases.

  • Case 1: the verification query is both nonce-fresh and tag-fresh. Then \(P(N')\) is random and two sub-cases can occur: if \(P(N')\oplus N' \oplus H_{K_h}(M')\in \mathbf {Z}\), Eq. (12) cannot be satisfied since the query is tag-fresh; on the other hand, if \(P(N')\oplus N' \oplus H_{K_h}(M')\notin \mathbf {Z}\), Eq. (12) is satisfied with probability \(1/(2^n-r)\) over the choice of \(P'\). Hence, over the choice of \((P,P')\), Eq. (12) is satisfied with probability at most

    $$ \frac{1}{2^n-r}\le \frac{1}{2^n-q_m}. $$
  • Case 2: the verification query is nonce-fresh, but not tag-fresh. Then there exists \((N,M,T)\in \tau _m\) such that \(T=T'\). Let \(Z=(P')^{-1}(T)\) (this value is well defined since we assume Eq. (8) are satisfied). Then Eq. (12) is satisfied iff

    $$ P(N')=Z\oplus N'\oplus H_{K_h}(M'), $$

    hence with probability exactly \(1/(2^n-q_m)\) since the query is nonce-fresh and \(N'\) does not appear in Eq. (8).

  • Case 3: the verification query is tag-fresh, but not nonce-fresh. Then there exists a unique \((N,M,T)\in \tau _m\) such that \(N'=N\), so that \(P(N')\) is fixed by Eq. (8). If \(P(N')\oplus N' \oplus H_{K_h}(M')\in \mathbf {Z}\), then Eq. (12) cannot be satisfied since the query is tag-fresh. If \(P(N')\oplus N' \oplus H_{K_h}(M')\notin \mathbf {Z}\), then Eq. (12) is satisfied with probability

    $$ \frac{1}{2^n-r}\le \frac{1}{2^n-q_m}. $$
  • Case 4: the verification query is neither nonce-fresh nor tag-fresh. Then there exists a unique \((N_{i,j},M_{i,j},T_i)\in \tau _m\) such that \(N'=N_{i,j}\) and \((N_k,M_k,T_k)\in \tau _m\) (with possibly \(k=i\)) such that \(T'=T_k\). If \(k=i\), then Eq. (12) cannot be satisfied since otherwise one would have

    $$ P(N')\oplus N'\oplus H_{K_h}(M')=(P')^{-1}(T_i)=P(N_{i,j})\oplus N_{i,j}\oplus H_{K_h}(M_{i,j}), $$

    which implies \(H_{K_h}(M')=H_{K_h}(M_{i,j})\) and condition (iii) defining a bad transcript would be fulfilled. On the other hand, if \(k\ne i\), then Eq. (12) being satisfied would imply

    $$\begin{aligned}&P(N')\oplus N'\oplus H_{K_h}(M')=(P')^{-1}(T_k)=Z_k\\ \Rightarrow&P(N_{i,j}) \oplus N_{i,j} \oplus H_{K_h}(M')=Z_k\\ \Rightarrow&Z_i\oplus H_{K_h}(M_{i,j}) \oplus H_{K_h}(M') = Z_k, \end{aligned}$$

    and this would contradict property (b) of a good tuple \(\mathbf {Z}\). Hence, by definition of a good transcript and a good tuple \(\mathbf {Z}\), we see that Eq. (12) cannot be satisfied in that case.

Summarizing, we see that for any verification query, Eq. (12) is satisfied with probability at most \(1/(2^n-q_m)\). By a union bound over the \(q_v\) verification queries, we obtain that

(13)

Summing Up. We can now lower bound the probability that a random pair \((P,P')\) is compatible with \(\tau \), that we denote

$$ \mathsf {p}(\tau )\mathrel {\mathop =^\mathrm{def}}\Pr \left[ (P,P')\leftarrow _{\$}\mathsf {Perm}(n)^2:(P,P') \in \mathsf {Comp}(\tau ) \right] . $$

Namely, summing over all good tuples \(\mathbf {Z}\), and using (9), (11), and (13), we have

This, in turn, allows us to lower bound the ratio of the probabilities to obtain \(\tau \) in the real and the ideal world, namely combining (6) and (7) with the equation above, we have

$$\begin{aligned} {\frac{\Pr \left[ X_\mathrm{re}=\tau \right] }{\Pr \left[ X_\mathrm{id}=\tau \right] }}&\ge {\underbrace{\frac{(2^n)^{q_m}\prod _{i=0}^{r-1}\left( 2^n-i-q_{i+1}\sum _{k=1}^i q_k-\sum _{k=1}^i q'_{i+1,k}\right) }{(2^n)_{q_m}(2^n)_r}}}_{A} \nonumber \\&\,\,\,\, \times \left( 1-{\frac{q_v}{2^n-q_m}}\right) . \end{aligned}$$
(14)

We focus on term A, that we can rewrite

$$\begin{aligned} A =\prod _{i=0}^{q_m-1}\left( 1+\frac{i}{2^n-i}\right) \prod _{i=0}^{r-1}\left( 1-\underbrace{\frac{q_{i+1}\sum _{k=1}^i q_k}{2^n-i}}_{a_i}-\underbrace{\frac{\sum _{k=1}^i q'_{i+1,k}}{2^n-i}}_{b_i}\right) . \end{aligned}$$
(15)

The following “Bonferroni-type” inequality will be useful to further lower bound A.

Lemma 4

Let \(r\ge 1\) be an integer and \((a_i)_{0\le i\le r-1}\) and \((b_i)_{0\le i\le r-1}\) be positive reals such that \(a_i\le 1/2\) and \(b_i\le 1/2\) for all \(i\in \{0,\ldots ,r-1\}\). Then

$$ \prod _{i=0}^{r-1} (1-a_i-b_i) \ge \prod _{i=0}^{r-1}(1-a_i) \prod _{i=0}^{r-1}(1-2b_i). $$

Proof

The proof is by induction. We first prove it for \(r=1\). One has

$$ (1-a_0)(1-2b_0)=1-a_0-2b_0+2a_0b_0=1-a_0-b_0-\underbrace{b_0(1-2a_0)}_{\ge 0} \le 1-a_0-b_0. $$

Assume that the result holds for \(r\ge 1\). Then

$$\begin{aligned} \prod _{i=0}^r(1-a_i) \prod _{i=0}^r(1-2b_i)&=\prod _{i=0}^{r-1}(1-a_i) \prod _{i=0}^{r-1}(1-2b_i)\times \underbrace{(1-a_r)(1-2b_r)}_{\ge 0}\\&\le \prod _{i=0}^{r-1} (1-a_i-b_i) \times (1-a_r-b_r-b_r(1-2a_r))\\&= \prod _{i=0}^r (1-a_i-b_i) -\underbrace{b_r(1-2a_r)\prod _{i=0}^{r-1} (1-a_i-b_i)}_{\ge 0}\\&\le \prod _{i=0}^r (1-a_i-b_i). \end{aligned}$$

The result holds for \(r+1\) and the lemma follows.    \(\square \)

We can apply this lemma to the r.h.s. of (15). Indeed, for any \(i\in \{0,\ldots ,r-1\}\), one has \(q_{i+1}\le \sqrt{q_m}\) (as otherwise condition (i) of a bad transcript would be met), and \(q_m^{3/2}\le 2^n/4\) by assumption, so that

$$ a_i\mathrel {\mathop =^\mathrm{def}}\frac{q_{i+1}\sum _{k=1}^i q_k}{2^n-i}\le \frac{q_{i+1}\sum _{k=1}^i q_k}{2^n-q_m}\le \frac{2 q_m^{3/2}}{2^n}\le \frac{1}{2}, $$

Moreover, by (10) and the assumption that \(q_v\le 2^n/4\), one has

$$ b_i \mathrel {\mathop =^\mathrm{def}}\frac{\sum _{k=1}^i q'_{i+1,k}}{2^n-i} \le \frac{\sum _{k=1}^i q'_{i+1,k}}{2^n-q_m} \le \frac{2q_v}{2^n} \le \frac{1}{2}. $$

Hence,

$$\begin{aligned} A&\ge \prod _{i=0}^{q_m-1}\left( 1+\frac{i}{2^n-i}\right) \prod _{i=0}^{r-1}\left( 1-\frac{q_{i+1}\sum _{k=1}^i q_k}{2^n-i}\right) \prod _{i=0}^{r-1}\left( 1-\frac{2\sum _{k=1}^i q'_{i+1,k}}{2^n-i}\right) \nonumber \\&\ge \prod _{i=0}^{q_m-1}\left( 1+\frac{i}{2^n-i}\right) \prod _{i=0}^{r-1}\left( 1-\frac{q_{i+1}\sum _{k=1}^i q_k}{2^n-i}\right) \left( 1-\frac{2\sum _{i=0}^{r-1}\sum _{k=1}^i q'_{i+1,k}}{2^n-q_m}\right) \nonumber \\&\ge \underbrace{\prod _{i=0}^{q_m-1}\left( 1+\frac{i}{2^n-i}\right) \prod _{i=0}^{r-1}\left( 1-\frac{q_{i+1}\sum _{k=1}^i q_k}{2^n-i}\right) }_{A'}\left( 1-\frac{2q_v}{2^n-q_m}\right) , \end{aligned}$$
(16)

where for the last inequality we used (10).

In order to further lower bound \(A'\), we need to distinguish collisioning MAC queries from non-collisioning ones. Up to reordering the MAC queries transcript, we assume that non-collisioning queries come first, and we let \(s\in \{0,\ldots ,r\}\) be the integer such that \(q_i=1\) for \(i\in \{1,\ldots ,s\}\), and \(q_i>1\) for \(i\in \{s+1,\ldots , r\}\). Note that since the transcript is good, one has

$$\begin{aligned} \sum _{i=s+1}^r q_i\le \sqrt{q_m} \end{aligned}$$
(17)

as otherwise condition (i) of a bad transcript would be fulfilled. Then

$$\begin{aligned} A'&\ge \prod _{i=0}^{q_m-1}\left( 1+\frac{i}{2^n-i}\right) \prod _{i=0}^{s-1}\left( 1-\frac{q_{i+1}\sum _{k=1}^i q_k}{2^n-i}\right) \prod _{i=s}^{r-1}\left( 1-\frac{q_{i+1}\sum _{k=1}^i q_k}{2^n-i}\right) \nonumber \\&= \prod _{i=0}^{q_m-1}\left( 1+\frac{i}{2^n-i}\right) \prod _{i=0}^{s-1}\left( 1-\frac{i}{2^n-i}\right) \prod _{i=s}^{r-1}\left( 1-\frac{q_{i+1}\sum _{k=1}^i q_k}{2^n-i}\right) \nonumber \\&\ge \prod _{i=0}^{q_m-1}\left( 1-\frac{i^2}{(2^n-i)^2}\right) \prod _{i=s}^{r-1}\left( 1-\frac{q_{i+1}q_m}{2^n-i}\right) \nonumber \\&\ge \prod _{i=0}^{q_m-1}\left( 1-\frac{i^2}{(2^n-q_m)^2}\right) \prod _{i=s}^{r-1}\left( 1-\frac{q_{i+1}q_m}{2^n-q_m}\right) \nonumber \\&\ge \left( 1-\frac{q_m^3}{3(2^n-q_m)^2}\right) \left( 1-\frac{q_m\sum _{i=s+1}^r q_i}{2^n-q_m}\right) \nonumber \\&\ge \left( 1-\frac{4q_m^3}{3\cdot 2^{2n}}\right) \left( 1-\frac{2q_m^{3/2}}{2^n}\right) , \end{aligned}$$
(18)

where for the last inequality we used (17) and \(q_m\le 2^n/2\).

Combining (14), (16), and (18), we finally obtain (using \(q_m\le 2^n/2\) once again)

$$ \frac{\Pr \left[ X_\mathrm{re}=\tau \right] }{\Pr \left[ X_\mathrm{id}=\tau \right] } \ge 1-\frac{4q_m^3}{3\cdot 2^{2n}} -\frac{2q_m^{3/2}}{2^n} - \frac{6q_v}{2^n}. $$

Lemma 3 follows using \(q_m^3/2^{2n}\le q_m^{3/2}/2^n\) by our assumption that \(q_m^{3/2}\le 2^n/4\).

5 Nonce-Misuse Security of EWCDM

In this section, we consider the security of the EWCDM construction when the adversary is allowed to repeat nonces. In this setting, PRF-security implies MAC-security, hence we can simply consider the EWCDM construction as a function with domain \(\mathcal {N}\times \mathcal {M}\) and study its pseudorandomness. Our result on the PRF-security of the EWCDM construction is as follows.

Lemma 5

Let \(\mathcal {M}\), \(\mathcal {K}\) and \(\mathcal {K}_h\) be non-empty sets. Let \(E:\mathcal {K}\times \{0,1\}^n\rightarrow \{0,1\}^n\) be a block cipher and \(H:\mathcal {K}_h\times \mathcal {M}\rightarrow \{0,1\}^n\) be an \(\varepsilon \)-AXU hash function. Then for any (qt)-(nonce-misusing) adversary \(\mathsf {A}\) against the PRF-security of \(\varPi [E,H]\), there exists a \((q,t')\)-adversary \(\mathsf {A}'\) against the PRP-security of E, where \(t'=O(t+qt_H)\), such that

$$ \mathbf{Adv }^{\mathrm {PRF}}_{\varPi [E,H]}(\mathsf {A})\le 2 \mathbf{Adv }^{\mathrm {PRP}}_{E}(\mathsf {A}')+\frac{q^2}{2^n}+\frac{q^2\varepsilon }{2}. $$

The corresponding MAC-security can easily be deduced from Lemma 5 using the following generic result of Bellare et al. [BGM04, Proposition 7.3].

Lemma 6

Let F be a keyed function with output length n. Then for any \((q_m,q_v,t)\)-adversary \(\mathsf {A}\) against the MAC-security of F, there exists a \((q_m+q_v,t')\)-adversary \(\mathsf {A}'\) against the PRF-security of F, where \(t'=O(t)\), such that

$$ \mathbf{Adv }^{\mathrm {MAC}}_F(\mathsf {A})\le \mathbf{Adv }^{\mathrm {PRF}}_F(\mathsf {A}') +\frac{q_v}{2^n}. $$

Combining Lemmas 5 and 6, we obtain the following theorem (absorbing the \(q_v/2^n\) term into \((q_m+q_v)^2/2^n\)).

Theorem 4

Let \(\mathcal {M}\), \(\mathcal {K}\) and \(\mathcal {K}_h\) be non-empty sets. Let \(E:\mathcal {K}\times \{0,1\}^n\rightarrow \{0,1\}^n\) be a block cipher and \(H:\mathcal {K}_h\times \mathcal {M}\rightarrow \{0,1\}^n\) be an \(\varepsilon \)-AXU hash function. Then for any \((q_m,q_v,t)\)-nonce-misusing adversary \(\mathsf {A}\) against the MAC-security of \(\varPi [E,H]\), there exists a \((q_m+q_v,t')\)-adversary \(\mathsf {A}'\) against the PRP-security of E, where \(t'=O(t+(q_m+q_v)t_H)\), such that

$$ \mathbf{Adv }^{\mathrm {MAC}}_{\varPi [E,H]}(\mathsf {A})\le 2 \mathbf{Adv }^{\mathrm {PRP}}_{E}(\mathsf {A}')+\frac{2(q_m+q_v)^2}{2^n}+\frac{(q_m+q_v)^2\varepsilon }{2}. $$

The proof of Lemma 5 is standard (indeed, the construction, seen as a keyed function with domain \(\mathcal {N}\times \mathcal {M}\), follows the classical “hash-then-PRF” paradigm). We include it below for completeness.

Proof of Lemma  5. Fix a (qt)-adversary \(\mathsf {A}\) against the \(\mathrm {PRF}\)-security of \(\varPi [E,H]\). The first step of the proof consists in replacing \(E_K\) and \(E_{K'}\) by two uniformly random and independent permutations P and \(P'\). It is easy to show that there is an adversary \(\mathsf {A}'\) making at most q queries and running in time at most \(t'=O(t+qt_H)\) such that

$$\begin{aligned} \mathbf{Adv }^{\mathrm {PRF}}_{\varPi [E,H]}(\mathsf {A})\le 2 \mathbf{Adv }^{\mathrm {PRP}}_{E}(\mathsf {A}') + \mathbf{Adv }^\mathrm {PRF}_{\varPi [E^*,H]}(\mathsf {A}), \end{aligned}$$
(19)

where \(E^*\) denotes the perfect cipher on \(\{0,1\}^n\). Then, we use the \(\mathrm {PRP}/\mathrm {PRF}\) switching lemma [BR06] to replace the random permutations P and \(P'\) by two independent and uniformly random functions R and \(R'\), obtaining

$$\begin{aligned} \mathbf{Adv }^{\mathrm {PRF}}_{\varPi [E^*,H]}(\mathsf {A}')\le \frac{q^2}{2^n}+ \mathbf{Adv }^{\mathrm {PRF}}_{\varPi [F^*,H]}(\mathsf {A}), \end{aligned}$$
(20)

where \(F^*\) denotes the perfect keyed function from \(\{0,1\}^n\) to \(\{0,1\}^n\) (i.e., the keyed function with key space \(\mathsf {Func}(n)\)).

It remains to upper bound the PRF-advantage of \(\mathsf {A}\) against \(\varPi [F^*,H]\). For this, we use the H-coefficients technique. The adversary must distinguish between two worlds:

  • the real world in which it interacts with \(\varPi [R,R',H]\) where R and \(R'\) are two uniformly and independently drawn functions from \(\{0,1\}^n\) to \(\{0,1\}^n\);

  • the ideal world in which it receives independent and uniformly random answers.

Let \(\tau _m=((N_1,M_1,T_1),\ldots ,(N_q,M_q,T_q))\) be the list of all queries of \(\mathsf {A}\) and the corresponding answers. In order to have a simple description of bad transcripts, we reveal to the adversary at the end of the experiment the key \(K_h\) and the function R if we are in the real world, while in the ideal world we simply draw a dummy key \(K_h \leftarrow _{\$}\mathcal {K}_h\) and a function R independently from the answers of the oracle. All in all, the transcript of the interaction of \(\mathsf {A}\) with its oracle is a tuple \(\tau =(\tau _m,K_h,R)\) and, in this case, a transcript is said attainable (with respect to an adversary \(\mathsf {A}\)) if the probability to obtain it in the ideal world is non-zero. We denote \(\varTheta \) the set of attainable transcripts. We also denote \(X_\mathrm{re}\), resp. \(X_\mathrm{id}\), the probability distribution of the transcript \(\tau \) induced by the real world, resp. the ideal world.

We start by defining the set of bad transcripts.

Definition 4

We say that an attainable transcript \(\tau =(\tau _m,K_h,R)\) is bad if there exists distinct queries \((N,M,T),(N',M',T')\in \tau _m\) such that

$$ R(N)\oplus N \oplus H_{K_h}(M)=R(N')\oplus N' \oplus H_{K_h}(M'). $$

Otherwise we say that \(\tau \) is good. We denote \(\varTheta _\mathrm{bad}\), resp. \(\varTheta _\mathrm{good}\), the set of bad, resp. good transcripts.

We first upper bound the probability to get a bad transcript in the ideal world.

Lemma 7

$$ \Pr \left[ X_\mathrm{id}\in \varTheta _\mathrm{bad} \right] \le \frac{q^2\varepsilon }{2}. $$

Proof

Let \(\tau _m\) be any attainable query transcript. Recall that, in the ideal world, the key \(K_h\) and the function R are drawn uniformly at random and independently from the query transcript \(\tau _m\). Fix any pair of distinct queries \((N,M,T),(N',M',T')\). Two cases can occur:

  • \(M \ne M'\): then the probability, over the random draw of \(K_h\) and R, that \(R(N)\oplus N \oplus H_{K_h}(M)=R(N')\oplus N' \oplus H_{K_h}(M')\) is lower than \(\varepsilon \) by the \(\varepsilon \)-AXU property of H;

  • \(M=M'\): then, since we assume that the adversary never makes redundant queries, \(N \ne N'\) and the probability that \(R(N)\oplus N=R(N')\oplus N'\) is \(1/2^n\le \varepsilon \).

By summing over every possible pair of queries, one gets the result.    \(\square \)

We then analyze good transcripts.

Lemma 8

For any good transcript \(\tau \), one has

$$ \frac{\Pr \left[ X_\mathrm{re}=\tau \right] }{\Pr \left[ X_\mathrm{id}=\tau \right] }= 1. $$

Proof

Let \(\tau =(\tau _m,K_h,R)\) be a good transcript. One has

$$ \Pr \left[ X_\mathrm{id}=\tau \right] =\frac{1}{|\mathcal {K}_h|}\cdot \frac{1}{|\mathsf {Func}(n)|}\cdot \frac{1}{(2^n)^q} $$

since, in the ideal world, the oracle is perfectly random and the key \(K_h\) and the function R are chosen uniformly at random and independently from the query transcript.

We say that a function \(R'\in \mathsf {Func}(n)\) is compatible with the transcript \(\tau \) if \(R'(R(N_i)\oplus N_i \oplus H_{K_h}(M_i))=T_i\) for all \(i\in \{1,\ldots ,q\}\). Let \(\mathsf {Comp}(\tau )\) be the set of all compatible functions \(R'\). Then it is easy to see that

$$ \Pr \left[ X_\mathrm{re}=\tau \right] =\frac{1}{|\mathcal {K}_h|}\cdot \frac{1}{|\mathsf {Func}(n)|} \cdot \Pr \left[ R'\leftarrow _{\$}\mathsf {Func}(n): R'\in \mathsf {Comp}(\tau ) \right] . $$

Since \(\tau \) is a good transcript, the values \(R(N_i)\oplus N_i \oplus H_{K_h}(M_i)\) are distinct. Hence

$$ \Pr \left[ R'\leftarrow _{\$}\mathsf {Func}(n): R'\in \mathsf {Comp}(\tau ) \right] =\frac{1}{(2^n)^q} $$

and therefore \(\Pr \left[ X_\mathrm{re}=\tau \right] =\Pr \left[ X_\mathrm{id}=\tau \right] \).    \(\square \)

Combining Lemmas 17, and 8, one obtains

$$\begin{aligned} \mathbf{Adv }^{\mathrm {PRF}}_{\varPi [F^*,H]}(\mathsf {A})\le \frac{q^2\varepsilon }{2}. \end{aligned}$$
(21)

Lemma 5 finally follows from Eqs. (19), (20), and (21).