Keywords

1 Introduction

A classic question in cryptography is how to deal with adversaries that adaptively compromise particular parties, thereby learning their secrets. Consider a setting where parties use keys \(k_1,\ldots ,k_n\) to encrypt messages \(m_1,\ldots ,m_n\), deriving ciphertexts \(\textsf {Enc}(k_1,m_1),\ldots ,\textsf {Enc}(k_n,m_n)\). An adversary obtains the ciphertexts and compromises a chosen subset of the parties to learn their keys. What can we say about the security of the messages encrypted under the keys that remain secret? Surprisingly, traditional approaches to formal security analysis, such as using encryption schemes that provide semantic security  [19], do not suffice for proving these messages’ confidentiality. This problem was first discussed in the context of secure multiparty computation  [10], and it arises in a variety of important cryptographic applications, as we explain below.

In this work, we introduce a new framework for formal analyses when security in the face of adaptive compromise is desired. Our approach provides a modular route towards analysis using idealized primitives (such as random oracles or ideal ciphers) for practical and in-use schemes. This modularity helps us sidestep the pitfalls of prior ideal-model analyses that either invented new (less satisfying) ideal primitives, omitted proofs, or gave detailed but incorrect proofs. We exercise our framework across applications including searchable symmetric encryption (SSE), revocable cloud storage, and asymmetric password-authenticated key exchange (aPAKE). In particular, we provide full, correct proofs of security against adaptive adversaries for the Cash et al.  [12] searchable symmetric encryption scheme that is used often in practice and the BurnBox system  [33] for dealing with compelled-access searches. We show that our new definitions imply the notion of equivocable encryption introduced to prove security of the OPAQUE  [24] asymmetric password-authenticated key exchange protocol. More broadly, our framework can be applied to a wide variety of constructions  [1, 2, 9, 13, 17, 20, 21, 25,26,27,28,29, 34].

Current approaches to the “commitment problem”. Our motivating applications have at their core an adaptive simulation-based model of security. Roughly speaking, they ask that no computationally bounded adversary can distinguish between two worlds. In the first world, the adversary interacts with the scheme whose security is being measured. In the second world, the “ideal” world, the adversary’s queries are instead handled by a simulator that must make do with only limited information representing allowable “leakage” about the queries the adversary has made so far. The common unifying factor across the varying applications we consider is that the adversary can make queries that result in it being given ciphertexts encrypting messages of its choosing, and can then, with future queries, adaptively choose to expose the secret keys underlying some of those ciphertexts. The leakage given to the simulator will not include the messages encrypted unless a query has been made to expose the corresponding key.

Proving security in this model, however, cannot be done from standard security assumptions on the underlying encryption scheme. The problem is that the simulator must commit to ciphertexts, revealing them to the adversary, before knowing the messages associated with them. Hence the commitment problem. Several prior approaches for proving positive security results exist.

One natural approach attempts to build special non-committing encryption schemes  [10] that can be proven (in the standard model) to allow opening some a priori fixed ciphertext to a message. But these schemes are not practical, as they require key material at least as long as the underlying message. Another unsatisfying approach considers only non-adaptive security, in which an attacker specifies all of its queries at the beginning of the game. This is one of the two approaches that were simultaneously taken by Cash et al.  [12]. Here the simulator is given the leakage for all of these queries at once and generates a transcript of all of its responses. This is unsatisfying because more is lost when switching from adaptive to non-adaptive security than just avoiding the commitment problem. It is an easy exercise to construct encryption schemes which are secure when all queries to them must be chosen ahead of time but are not secure even against key-recovery attacks when an adversary may adaptively choose its queries.

The primary approach used to avoid this is to use idealized models, which we can again split into two versions. The first is to use an idealized model of encryption. Examples of this include indifferentiable authenticated encryption  [3] (IAE) or the ideal encryption model (IEM) of Tyagi et al.  [33]. Security analyses in these models might not say much when one uses real encryption schemes, even when one is willing to use more established idealized models such as the ideal cipher model (ICM) or the random oracle model (ROM). One hope would be to use approaches such as indifferentiability  [30] to modularly show that symmetric schemes sufficiently “behave like” ideal encryption, but this approach is unlikely to work for most encryption schemes used in practice  [3].

Fig. 1. Old state of affairs. Red dashed lines correspond to implications proved through programming in an ideal model proof. A different programming proof is needed to prove an application secure for each pair of PRF and symmetric encryption mode. (Color figure online)

Fig. 2. New state of affairs. Red dashed lines correspond to implications proved through programming in an ideal model proof. New definitions are in bold boxes. Programming proofs are only needed to show each low-level PRF construction meets SIM-AC-PRF. (Color figure online)

The final approach, which is by far the most common in searchable symmetric encryption  [1, 2, 9, 12, 13, 17, 20, 21, 25,26,27,28,29, 34], is to fix a particular encryption scheme and prove security with respect to it in the ICM or ROM. Typically encryption schemes are built as modes of operations of an underlying pseudorandom function (PRF) and this function (or its constituent parts) is what is modeled as an ideal function. The downside of this is represented in Fig. 1. On the top, we have the applications one would like to prove secure, and on the bottom, we have the different modes of operation and PRFs that one might use. Using this approach means that for each application, we have to provide a separate ideal model proof for each different choice of a mode of operation and a PRF (represented by dotted red arrows in Fig. 1). If there are A applications, P PRFs, and M modes of operation one might consider using, then this requires \(A\cdot P\cdot M\) ideal model proofs in total, an unsatisfying state of affairs.

Moreover, the required ideal analysis can be tedious (see Footnote 1) and error-prone. This is presumably why only a few of the papers we found actually attempt to provide the full details of the ROM proof. We have identified bugs in all of the proofs that did provide the details. The lack of a full, valid proof among the fifteen papers we considered indicates the need for a more modular framework to capture this use of the random oracle. Our work provides such a framework, allowing the random oracle details to be abstracted away as a proof that only needs to be provided once. This framework provides definitions for use by other cryptographers that are simple to use, apply to practical encryption schemes, and allow showing adaptive security in well-studied models.

Examples of the “commitment problem”. We proceed by discussing the example applications where we will apply our framework.

Revocable cloud storage and the compelled access setting. We start with the recently introduced compelled access setting (CAS)  [33]. Here one wants encryption tools that provide privacy in the face of an authority searching digital devices, e.g., government searches of mobile phones or laptops at border crossings. To protect against compelled access searches, the BurnBox tool  [33] uses what they call revocable encryption. At its core, this has the system encrypt a user’s files \(m_1,\ldots ,m_n\) with independent keys \(k_1,\ldots ,k_n\). Ciphertexts are stored on (adversarially visible) cloud storage. Before a search occurs, the user instructs the application to delete the keys corresponding to files that the user wishes to hide from the authority, thereby revoking their own access to them. The other keys and file contents are disclosed to the authority.
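
The per-file-key mechanism can be sketched in a few lines of Python. Everything here is illustrative and ours, not BurnBox's code: the toy XOR-with-SHA-256-keystream cipher is a stand-in for the AES-GCM that BurnBox actually uses, and `RevocableStore` is a hypothetical name.

```python
import hashlib
import secrets

def toy_encrypt(key, msg):
    """Toy stream cipher (XOR with a SHA-256-derived keystream).
    A stand-in for real authenticated encryption; NOT secure."""
    stream, ctr = b"", 0
    while len(stream) < len(msg):
        stream += hashlib.sha256(key + ctr.to_bytes(8, "big")).digest()
        ctr += 1
    return bytes(a ^ b for a, b in zip(msg, stream))

toy_decrypt = toy_encrypt   # XOR stream encryption is its own inverse

class RevocableStore:
    """Each file is encrypted under an independent key; revoking a file
    deletes its key locally, leaving only the (now useless) ciphertext
    on adversarially visible cloud storage."""
    def __init__(self):
        self.keys = {}      # local, deletable key material
        self.cloud = {}     # ciphertexts visible to the authority
    def put(self, name, contents):
        k = secrets.token_bytes(32)
        self.keys[name] = k
        self.cloud[name] = toy_encrypt(k, contents)
    def revoke(self, name):
        del self.keys[name]             # the ciphertext remains in the cloud
    def get(self, name):
        return toy_decrypt(self.keys[name], self.cloud[name])
```

After `revoke("diary")`, the user can disclose every remaining key while the revoked file's ciphertext reveals nothing, which is exactly the confidentiality property the CAS definition must capture.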

The formal security definition introduced by Tyagi et al. captures confidentiality for revoked files even in the face of adversarial choice of which files to revoke, meaning they want security in the face of adaptive compromises. This very naturally results in the commitment problem: the simulator can be forced to provide ciphertexts for files and only later, at the time of key revelation, learn the contents of those files, at which point it is expected to produce keys that properly decrypt these ciphertexts.

To address the commitment problem they introduced the IEM. This models symmetric encryption as perfect: every encryption query is associated to a freshly chosen random string as ciphertext, and decryption is only allowed on previously returned ciphertexts. Analyses in the IEM can commit to ciphertexts (when the adversary doesn’t know the key) and later open them to arbitrary messages. In their implementation, they used AES-GCM for encryption, which cannot be thought of as indifferentiable from the IEM. Hence their proof ultimately provides only heuristic evidence for the security of their implementation.

Searchable symmetric encryption. Our second motivating setting is searchable symmetric encryption (SSE), which raises similar issues to those discussed above for BurnBox, but with added complexity. SSE handles the following problem: a client wants to offload storage of a database of documents to an untrusted server while maintaining the ability to perform keyword searches on the database. The keyword searches should not reveal the contents of the documents to the server. To enable efficient solutions, we allow queries to leak some partial information about documents. Security is formalized via a simulation-based definition  [15], in which a simulator given only the allowed leakage must trick an adversary into believing it is interacting with the actual SSE construction. An adaptive adversary can submit keyword searches as a function of prior returned results. Proving security here establishes that the scheme only leaks what is allowed and nothing more. While the leakage itself has been shown to be damaging in various contexts  [11, 22], our focus here is on the formal analyses showing that leakage-abuse attacks are the best possible ones.

A common approach for SSE can be summarized at a high level as follows. The client generates a sequence of key pairs \((k_1,k'_1),\ldots ,(k_n,k'_n)\) for keywords \(w \in \{1,\ldots ,n\}\) represented as integers for simplicity. The first key \(k_w\) in each pair is used to encrypt the identifiers of documents containing w. The latter key \(k'_w\) is used as a pseudorandom function (PRF) key to derive pseudorandom locations to store the encryption of the document identifiers. When the client later wants to search for documents containing w it sends the associated \((k_w,k'_w)\) keys to the server. The server then uses \(k'_w\) to re-derive the pseudorandom locations of the ciphertexts and uses \(k_w\) to decrypt them.
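
A minimal sketch of this index layout, assuming HMAC-SHA256 as the PRF and a toy PRF-pad in place of a real encryption scheme; all function names here are ours for illustration, not from the Cash et al. construction.

```python
import hashlib
import hmac
import secrets

def prf(key, data):
    """HMAC-SHA256 as the PRF (an illustrative instantiation)."""
    return hmac.new(key, data, hashlib.sha256).digest()

def build_index(db):
    """db maps each keyword w (an int) to the ids of documents containing w.
    For each w, k'_w derives pseudorandom storage locations and k_w
    encrypts the 8-byte document ids (here with a toy PRF-derived pad)."""
    keys, index = {}, {}
    for w, doc_ids in db.items():
        kw, kpw = secrets.token_bytes(32), secrets.token_bytes(32)
        keys[w] = (kw, kpw)
        for i, doc_id in enumerate(doc_ids):
            loc = prf(kpw, b"loc" + i.to_bytes(4, "big"))
            pad = prf(kw, b"enc" + i.to_bytes(4, "big"))[:8]
            index[loc] = bytes(a ^ b for a, b in zip(doc_id.to_bytes(8, "big"), pad))
    return keys, index

def search(index, kw, kpw):
    """Server-side search: re-derive the pseudorandom locations from k'_w,
    then decrypt the stored document ids with k_w."""
    results, i = [], 0
    while (loc := prf(kpw, b"loc" + i.to_bytes(4, "big"))) in index:
        pad = prf(kw, b"enc" + i.to_bytes(4, "big"))[:8]
        results.append(int.from_bytes(bytes(a ^ b for a, b in zip(index[loc], pad)), "big"))
        i += 1
    return results
```

Handing \((k_w,k'_w)\) to the server is exactly the key compromise that the adaptive security analysis must simulate.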

To prove adaptive security, the simulator for such a protocol runs into the commitment problem because it must commit to ciphertexts of the document identifiers before knowing what the identifiers are. Perhaps less obviously, a simulator also runs into a commitment issue with the PRF. To ensure security the simulated locations of ciphertexts must be random, but then when responding to a search query the simulator is on the hook to find a key for the PRF that “explains” the simulated locations. Papers on SSE typically address these issues by modeling the PRF as a random oracle and fixing a specific construction of an encryption scheme based on a random oracle. As noted earlier, this has resulted in a need for many tedious and error-prone proofs.

Asymmetric password-authenticated key exchange and equivocable encryption. In independent and concurrent work, Jarecki et al. updated the ePrint version of  [24] to introduce the notion of equivocable encryption and use it to prove security of their asymmetric password-authenticated key exchange protocol OPAQUE. The definition of equivocable encryption is essentially a weakened version of our confidentiality definition, considering only single-user security and allowing only a single encryption query; whereas we consider multi-user security and arbitrarily many adaptively chosen queries. Since their definition is implied by ours, our results will make rigorous their claim that “common encryption modes are equivocable under some idealized assumption”.

A new approach. We introduce a new framework for analyzing security in adaptive compromise scenarios. Our framework has a simple, but powerful recipe: augment traditional simulation-based, property-based security definitions to empower adversaries with the ability to perform adaptive compromise of secret keys. For symmetric encryption, for example, we convert the standard simulation-based, multi-user indistinguishability under chosen plaintext attack (mu-IND-CPA)  [4] to a game that includes the same adversarial powers, but adds an additional oracle for adaptively compromising individual user secret keys. Critical to our approach is (1) the use of simulators, which allows handling corruptions gracefully, and (2) incorporating handling of idealized models (e.g., the ROM or ICM). The latter is requisite for analyzing practical constructions.

We offer new definitions for multi-user CPA and CCA security of symmetric encryption, called SIM-AC-CPA (simulation-based security under adaptive corruption, chosen plaintext attack) and SIM-AC-CCA (chosen ciphertext attack). By restricting the classes of allowed simulators we can obtain stronger definitions (e.g., SIM-AC-$ which requires that ciphertexts look like random strings).

Symmetric encryption under adaptive compromise. We then begin exercising our framework by first answering the question: Are practical, in-use symmetric encryption schemes secure in the face of adaptive compromises? We give positive results here, in idealized models. Taking an encrypt-then-MAC scheme such as AES in counter mode combined with HMAC  [5] as an example, we could directly show SIM-AC-CCA security while modeling AES as an ideal cipher and HMAC as a random oracle (cf.  [16]). But this would lead to a rather complex proof, and we would have to do similarly complex proofs for other encryption schemes.

Instead, we provide simple, modular proofs by lifting the underlying assumptions made about primitives (AES and HMAC) to hold in adaptive compromise scenarios. Specifically, we introduce a new security notion for pseudorandom functions under adaptive compromise attacks (SIM-AC-PRF). This adapts the standard multi-user PRF notion to also give adversaries the ability to adaptively compromise particular keys. Then we prove that AES and HMAC each achieve this notion in the ICM and ROM, respectively. The benefit is that these proofs encapsulate the complexity of ideal model programming proofs in the simpler context of SIM-AC-PRF (as opposed to SIM-AC-CCA).

The workflow when using our framework is represented by Fig. 2. Here PRFs are individually shown to achieve SIM-AC-PRF security in an ideal model. Then modes of operation are proven secure under the assumption that they use a SIM-AC-PRF secure PRF. Then each application is proven secure under the appropriate assumption on the encryption scheme used. This decreases the total number of proofs required to \(A+P+M\), significantly fewer than the \(A\cdot P\cdot M\) required previously. Moreover, the complicated ideal model programming analysis (represented by red dashed arrows) is restricted to appearing only in the simplest of these proofs (the analysis of PRFs); it can then simply be “passed along” to the higher-level proofs.

We can then show that for most CPA modes of operation (e.g., CBC mode or CTR mode), one can prove SIM-AC-CPA security assuming the underlying block cipher is SIM-AC-PRF. The core requirement is that the mode of operation enjoys a property that we call extraction security. This is a technical condition capturing the key security properties needed to prove that a mode of operation is SIM-AC-CPA assuming the underlying block cipher is SIM-AC-PRF. Moreover, we show that most existing (standard) proofs of IND-CPA security show, implicitly, the extraction security of the mode. Thus, we can easily establish adaptive compromise proofs given existing (standard) ones.

The above addresses only confidentiality. Luckily, integrity is inherited essentially for free from existing analysis. We generically show that SIM-AC-CPA security combined with the standard notion of ciphertext integrity implies SIM-AC-CCA security. Thus, one can prove encrypt-then-MAC is SIM-AC-CCA secure assuming the SIM-AC-CPA security of the encryption and the standard unforgeability security of the MAC. This is an easy adaptation of the standard proof  [8] of encrypt-then-MAC.
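
The encrypt-then-MAC composition referred to here can be sketched as follows, with a toy SHA-256-based stream cipher standing in for a real CPA-secure scheme; all names are illustrative, and the toy cipher is of course not a secure instantiation.

```python
import hashlib
import hmac
import secrets

def stream_enc(ke, m):
    """Toy CPA-style stream encryption (SHA-256 keystream); illustrative only."""
    nonce = secrets.token_bytes(16)
    pad, ctr = b"", 0
    while len(pad) < len(m):
        pad += hashlib.sha256(ke + nonce + ctr.to_bytes(4, "big")).digest()
        ctr += 1
    return nonce + bytes(a ^ b for a, b in zip(m, pad))

def stream_dec(ke, c):
    nonce, body = c[:16], c[16:]
    pad, ctr = b"", 0
    while len(pad) < len(body):
        pad += hashlib.sha256(ke + nonce + ctr.to_bytes(4, "big")).digest()
        ctr += 1
    return bytes(a ^ b for a, b in zip(body, pad))

def etm_enc(ke, km, m):
    """Encrypt-then-MAC: MAC the ciphertext with HMAC, append the tag."""
    c = stream_enc(ke, m)
    return c + hmac.new(km, c, hashlib.sha256).digest()

def etm_dec(ke, km, ct):
    """Verify the tag first; reject (None, standing for bottom) on mismatch."""
    c, tag = ct[:-32], ct[-32:]
    if not hmac.compare_digest(tag, hmac.new(km, c, hashlib.sha256).digest()):
        return None
    return stream_dec(ke, c)
```

The point of the generic result above is that the inner scheme only needs SIM-AC-CPA security and the MAC only standard unforgeability for the composition to be SIM-AC-CCA secure.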

Applying the framework to high-level protocols. Equipped with our new SIM-AC-CCA and SIM-AC-PRF security notions, we can return to our motivating task: providing positive security analyses of BurnBox and the Cash et al. SSE scheme.

We give a proof of BurnBox’s CAS security assuming the SIM-AC-CPA security of the underlying symmetric encryption scheme. Our proof is significantly simpler than the original analysis, avoiding specifically the nuanced programming logic that led to the bug in the original analysis. For the Cash et al. scheme we apply our SIM-AC-PRF definition and a key-private version of our SIM-AC-CPA definition. Their adaptive security claim was accompanied only by a brief proof sketch which fails to identify an important detail that needs to be considered in the ROM analysis (see the full version of this paper [23]). Our proof handles this detail cleanly while being ultimately of comparable complexity to their non-adaptive security proof.

Unfortunately, these settings and constructions are inherently complicated. So even with the simplification provided by our analysis techniques there is not space to fit their analysis in the body of our paper; it has instead been relegated to the full version of this work. We choose this organization because our main contribution is the definition abstraction which we believe will be of use for future work, rather than the particular applications we chose to exhibit its use.

Treatment of symmetric encryption. In this work, we focus on randomized encryption over more modern nonce-based variants, because this was the form of encryption used by the applications we identified. In the full version of this paper [23], we extend our definitions to nonce-based encryption. The techniques we introduce for analyzing randomized symmetric encryption schemes should extend to nonce-based encryption schemes.

Related works. A related line of work is that of selective-opening attacks  [7] which studies the security of asymmetric encryption schemes against compromises in a multi-sender setting (where coins underlying encryption may be compromised) or multi-receiver setting (where secret decryption keys may be compromised). Selective-opening definitions are typically formulated to aim for standard model (or non-programmable random oracle model) achievability and hence do not suffice for the applications we consider in this work.

The full version of this paper is available on ePrint  [23].

2 Preliminaries

A list T of length \(n\in {{\mathbb N}}\) specifies an ordered sequence of elements T[1], T[2], \(\dots \), T[n]. The operation \(T{.}\mathsf {add}(x)\) appends x to this list by setting \(T[n+1]\leftarrow x\), making T a list of length \(n+1\). We let |T| denote the length of T. The operation \(x\leftarrow T{.}\mathsf {dq}()\) sets x equal to the last element of T and removes this element from T. In pseudocode lists are assumed to be initialized empty (i.e. have length 0). An empty list or table is denoted by \([\cdot ]\). We sometimes use set notation with a list, e.g. \(x\in T\) is \(\mathsf {true}\) if \(x=T[i]\) for any \(1\le i \le |T|\).
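
For concreteness, these list conventions might be mirrored in Python as follows (the class name and 1-indexed accessor are ours):

```python
class PseudocodeList:
    """The list conventions from the preliminaries: lists start empty,
    indexing is 1-based, add appends, and dq removes and returns the
    last element; membership is set-style."""
    def __init__(self):
        self.items = []
    def add(self, x):
        self.items.append(x)
    def dq(self):
        return self.items.pop()     # last element, removed from the list
    def __getitem__(self, i):
        return self.items[i - 1]    # T[1] is the first element
    def __len__(self):
        return len(self.items)      # |T|
    def __contains__(self, x):      # x in T
        return x in self.items
```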

Let S and \(S'\) be two sets with \(|S|\le |S'|\). Then \(\mathsf {Inj}(S,S')\) is the set of all injections from S to \(S'\). We will sometimes abuse terminology and refer to functions with co-domain \(\{\,S \, :\,S\subseteq \{0,1\}^*\,\}\) as sets. For \(n\in {{\mathbb N}}\) we define \([n]=\{1,\dots ,n\}\).

The notation \(y\leftarrow \$\,A(x_1,x_2,\dots :\sigma )\) denotes the (randomized) execution of A with state \(\sigma \). Changes that A makes to its input variable \(\sigma \) are maintained after A’s execution. For given \(x_1,x_2,\dots \) and \(\sigma \) we let \([A(x_1,x_2,\dots : \sigma )]\) denote the set of possible outputs of A given these inputs.

We define security notions using pseudocode-based games. The pseudocode “Require \(\mathsf {bool}\)” is shorthand for “If not \(\mathsf {bool}\) then return \(\bot \)”. We will sometimes use infinite loops defining variable \(x_{\textit{u}}\) for all \(\textit{u}\in \{0,1\}^*\). Such code is executed lazily; the code is initially skipped, then later if a variable \(x_{\textit{u}}\) would be used, the necessary code to define it is first run. The pseudocode “\(\exists x\in {X}\) s.t. p(x)” for some predicate p evaluates to the boolean value \(\bigvee _{x\in {X}} p(x)\). If this is \(\mathsf {true}\), the variable x is set equal to the lexicographically first \(x\in {X}\) for which p(x) is \(\mathsf {true}\).

We use an asymptotic formalism. The security parameter is denoted \(\lambda \). Our work is generally written in a way to allow concrete security bounds to be extracted easily. In security proofs we typically explicitly state how we will bound the advantage of an adversary by the advantages of reduction adversaries we build (and possibly other terms). Reduction adversaries and simulators are explicitly given in code (from which concrete statements about their efficiency can be obtained by observation).

Let \(f:{{\mathbb N}}\rightarrow {{\mathbb N}}\). We say f is negligible if for all polynomials p there exists a \(\lambda _p\in {{\mathbb N}}\) such that \(f(\lambda )\le 1/p(\lambda )\) for all \(\lambda \ge \lambda _p\). We say f is super-polynomial if 1/f is negligible. We say f is super-logarithmic if \(2^f\) is super-polynomial.

Ideal primitives. We will make liberal use of ideal primitives such as random oracles or ideal ciphers. An ideal primitive \(\mathsf {P}\) specifies algorithms \(\mathsf {\mathsf {P}{.}Init}\) and \(\mathsf {\mathsf {P}{.}Prim}\). The initialization algorithm has syntax \(\sigma _\mathsf {P}\leftarrow \$\,\mathsf {\mathsf {P}{.}Init}(1^\lambda )\). The stateful evaluation algorithm has syntax \(y\leftarrow \$\,\mathsf {\mathsf {P}{.}Prim}(1^\lambda ,x:\sigma _\mathsf {P})\). We sometimes use \(A^\mathsf {P}\) as shorthand for giving algorithm A oracle access to \(\mathsf {\mathsf {P}{.}Prim}(1^\lambda ,\cdot :\sigma _\mathsf {P})\). Adversaries are often given access to \(\mathsf {P}\) via an oracle \(\textsc {Prim}\).

Ideal primitives should be stateless. By this we mean that after \(\sigma _\mathsf {P}\) is output by \(\mathsf {\mathsf {P}{.}Init}\), it is never modified by \(\mathsf {\mathsf {P}{.}Prim}\) (so \(\sigma _\mathsf {P}\) could have been treated as a read-only input). However, when written this way, ideal primitives are typically inefficient, e.g., for the random oracle model \(\sigma _\mathsf {P}\) would store a huge random table. Our security results will necessitate that \(\mathsf {P}\) be efficiently instantiated, so we have adopted the stateful syntax to allow ideal primitives to be written in their efficient “lazily sampled” form. Despite this notational convenience, we will assume that any ideal primitive we reference is essentially stateless. By this, we mean that it could have been equivalently written to be stateless, if inefficient (see Footnote 2).

The standard model is captured by the primitive \(\mathsf {P}_{\mathsf {sm}}\) for which \(\mathsf {\mathsf {P}_{\mathsf {sm}}{.}Init}(1^\lambda )\) and \(\mathsf {\mathsf {P}_{\mathsf {sm}}{.}Prim}(1^\lambda ,x:\sigma _\mathsf {P})\) always return the empty string \(\varepsilon \).

We define a random oracle that takes arbitrary input and produces variable-length outputs. It is captured by the primitive \(\mathsf {P}_{\mathsf {rom}}\) defined as follows.

figure a
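
A Python sketch of such a lazily sampled random oracle, assuming for simplicity that each query supplies its desired output length explicitly (the exact query encoding used by \(\mathsf {P}_{\mathsf {rom}}\) is in the figure above):

```python
import secrets

class LazyRandomOracle:
    """Lazily sampled random oracle with variable-length output. Each
    (input, output-length) pair is assigned a fresh uniform string on its
    first query and answered consistently afterwards; the table plays the
    role of the state sigma_P, which only ever grows."""
    def __init__(self):
        self.table = {}
    def prim(self, x, out_len):
        if (x, out_len) not in self.table:
            self.table[(x, out_len)] = secrets.token_bytes(out_len)
        return self.table[(x, out_len)]
```

Written statelessly, the whole (infinite) table would have to be sampled up front; lazy sampling is what makes the primitive efficiently instantiable.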

The ideal-cipher model is parameterized by a block-length \(n:{{\mathbb N}}\rightarrow {{\mathbb N}}\) and captured by \(\mathsf {P}_{\mathsf {icm}}^n\) defined as follows (see Footnote 3).

figure b

It stores tables E and D which it uses to lazily sample a random permutation for each \(K\), with \(E[K,\cdot ]\) representing the forward evaluation and \(D[K,\cdot ]\) its inverse. It parses its input as a tuple \((\mathrm {op},K,y)\) where \(\mathrm {op}\in \{\mathrm {+},\mathrm {-}\}\) specifies the direction of evaluation and \(K\in \{0,1\}^{*}\) and \(y\in \{0,1\}^{n(\lambda )}\) specify the input.
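
A Python sketch of this lazy sampling, with the block length measured in bytes for simplicity (the resampling loop is what keeps each \(E[K,\cdot ]\) a permutation):

```python
import secrets

class LazyIdealCipher:
    """Lazily sampled ideal cipher. E holds the forward direction and D the
    inverse of an independent random permutation for each key K; op '+'
    selects forward evaluation and op '-' the inverse."""
    def __init__(self, n):
        self.n, self.E, self.D = n, {}, {}
    def prim(self, op, K, y):
        T, Tinv = (self.E, self.D) if op == "+" else (self.D, self.E)
        if (K, y) not in T:
            z = secrets.token_bytes(self.n)
            while (K, z) in Tinv:       # resample so E[K,.] stays a permutation
                z = secrets.token_bytes(self.n)
            T[(K, y)] = z
            Tinv[(K, z)] = y
        return T[(K, y)]
```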

Sometimes we construct a cryptographic primitive from multiple underlying cryptographic primitives which expect different ideal primitives. To capture this it will be useful to have a notion of combining ideal primitives. Let \(\mathsf {P}'\) and \(\mathsf {P}''\) be ideal primitives. We define their cross product \(\mathsf {P}=\mathsf {P}'\times \mathsf {P}''\) as follows.

figure c

By our earlier convention \(A^{\mathsf {P}'\times \mathsf {P}''}\) is shorthand for giving algorithm A oracle access to \(\mathsf {\mathsf {P}{.}Prim}(1^\lambda ,\cdot :\sigma _\mathsf {P})\). In A’s code, \(B^{\mathsf {P}'}\) denotes giving B oracle access to \(\mathsf {\mathsf {P}{.}Prim}(1^\lambda ,(1,\cdot ):\sigma _\mathsf {P})\) and \(B^{\mathsf {P}''}\) denotes giving B oracle access to \(\mathsf {\mathsf {P}{.}Prim}(1^\lambda ,(2,\cdot ):\sigma _\mathsf {P})\).
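
The routing convention can be sketched as follows, with trivial stand-in primitives whose `prim` just tags its input (all names here are illustrative):

```python
class Tagged:
    """Stand-in ideal primitive that merely labels its queries."""
    def __init__(self, name):
        self.name = name
    def prim(self, x):
        return (self.name, x)

class CrossPrimitive:
    """Cross product P' x P'': a query tagged (1, x) is routed to P' and a
    query tagged (2, x) to P'', with both primitives' states kept side
    by side."""
    def __init__(self, p1, p2):
        self.p1, self.p2 = p1, p2
    def prim(self, tagged_query):
        i, x = tagged_query
        return self.p1.prim(x) if i == 1 else self.p2.prim(x)
```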

2.1 Standard Cryptographic Definitions

We recall standard cryptographic syntax and security notions.

Symmetric encryption syntax. A symmetric encryption scheme \(\mathsf {SE}\) specifies algorithms \(\mathsf {\mathsf {SE}{.}Kg}\), \(\mathsf {\mathsf {SE}{.}Enc}\), and \(\mathsf {\mathsf {SE}{.}Dec}\) as well as sets \(\mathsf {\mathsf {SE}{.}M}\), \(\mathsf {\mathsf {SE}{.}Out}\), and \(\mathsf {\mathsf {SE}{.}K}\) representing the message, ciphertext, and key space respectively. The key generation algorithm has syntax \(K\leftarrow \$\,\mathsf {\mathsf {SE}{.}Kg}(1^\lambda )\). The encryption algorithm has syntax \(c\leftarrow \$\,\mathsf {\mathsf {SE}{.}Enc}^\mathsf {P}(1^\lambda ,K,m)\), where \(c\in \mathsf {\mathsf {SE}{.}Out}(\lambda ,\left| m\right| )\) is required. The deterministic decryption algorithm has syntax \(m\leftarrow \mathsf {\mathsf {SE}{.}Dec}^\mathsf {P}(1^\lambda ,K,c)\). Rejection of \(c\) is represented by returning \(m=\bot \). Informally, correctness requires that encryptions of messages in \(\mathsf {\mathsf {SE}{.}M}(\lambda )\) decrypt properly. We assume the boolean \((m\in \mathsf {\mathsf {SE}{.}M}(\lambda ))\) can be efficiently computed.

Integrity of ciphertexts. Integrity-of-ciphertexts security is defined by the game \(\mathrm {G}^{\mathsf {int}\hbox {-}\mathsf {ctxt}}_{\mathsf {SE},\mathsf {P},\mathcal {A}_{\mathsf {ctxt}}}\) shown in Fig. 3. In the game, the attacker interacts with one of two “worlds” (determined by the bit b) via its oracles \(\textsc {Enc}\), \(\textsc {Prim}\), \(\textsc {Exp}\), and \(\textsc {Dec}\). The attacker’s goal is to determine which world it is interacting with.

Fig. 3. Game defining multi-user CTXT security of \(\mathsf {SE}\) in the face of exposures.

The \(\textsc {Prim}\) oracle gives the attacker access to the ideal primitive \(\mathsf {P}\). The encryption oracle \(\textsc {Enc}\) takes as input a user \(\textit{u}\) and message \(m\), then returns the encryption of that message using the key of that user, \(K_\textit{u}\). Recall that by our convention each \(K_{\textit{u}}\) is not sampled until needed. The exposure oracle \(\textsc {Exp}\) takes in \(\textit{u}\) and then returns \(K_{\textit{u}}\) to the attacker. The decryption oracle \(\textsc {Dec}\) is the only oracle whose behavior depends on the bit b. It takes as input a user \(\textit{u}\) and ciphertext \(c\). When \(b=1\), it will return the decryption of \(c\) using \(K_\textit{u}\), while when \(b=0\) it will always return \(\bot \). Thus, the goal of the attacker is to forge a ciphertext which will decrypt to a non-\(\bot \) value.

To prevent trivial attacks, we disallow querying a ciphertext to \(\textsc {Dec}(\textit{u},\cdot )\) if it came from \(\textsc {Enc}(\textit{u},\cdot )\) or if \(\textit{u}\) was already exposed. This is captured by the “Require” statements in \(\textsc {Dec}\) using lists \(C_\textit{u}\) and \({X}\) (which store the ciphertexts returned by \(\textsc {Enc}(\textit{u},\cdot )\) and the users that have been exposed, respectively).

We define the advantage function \(\mathsf {Adv}^{\mathsf {\mathsf {int}\hbox {-}\mathsf {ctxt}}}_{\mathsf {SE},\mathsf {P},\mathcal {A}_{\mathsf {ctxt}}}(\lambda )=2\Pr [\mathrm {G}^{\mathsf {int}\hbox {-}\mathsf {ctxt}}_{\mathsf {SE},\mathsf {P},\mathcal {A}_{\mathsf {ctxt}}}(\lambda )]-1\). We say \(\mathsf {SE}\) is INT-CTXT secure with \(\mathsf {P}\) if for all PPT \(\mathcal {A}_{\mathsf {ctxt}}\), the advantage \(\mathsf {Adv}^{\mathsf {\mathsf {int}\hbox {-}\mathsf {ctxt}}}_{\mathsf {SE},\mathsf {P},\mathcal {A}_{\mathsf {ctxt}}}(\cdot )\) is negligible. INT-CTXT security is typically defined to only consider a single user and no exposures. Using a hybrid argument one can show that our definition of INT-CTXT security is implied by the more standard definition.

Function family. A family of functions \(\mathsf {F}\) specifies algorithms \(\mathsf {\mathsf {F}{.}Kg}\) and \(\mathsf {\mathsf {F}{.}Ev}\) together with sets \(\mathsf {\mathsf {F}{.}Inp}\) and \(\mathsf {\mathsf {F}{.}Out}\). The key generation algorithm has syntax \(K\leftarrow \$\,\mathsf {\mathsf {F}{.}Kg}(1^\lambda )\). The evaluation algorithm is deterministic and has the syntax \(y\leftarrow \mathsf {\mathsf {F}{.}Ev}(1^\lambda ,K,x)\). It is required that for all \(\lambda \in {{\mathbb N}}\) and \(K\in [\mathsf {\mathsf {F}{.}Kg}(1^\lambda )]\) that \(\mathsf {\mathsf {F}{.}Ev}(1^\lambda ,K,x)\in \mathsf {\mathsf {F}{.}Out}(\lambda )\) whenever \(x\in \mathsf {\mathsf {F}{.}Inp}(\lambda )\). It is assumed that random elements of \(\mathsf {\mathsf {F}{.}Out}(\lambda )\) can be efficiently sampled.

Fig. 4. Game defining one-wayness of \(\mathsf {F}\).

One-wayness. The one-wayness of a family of functions \(\mathsf {F}\) is given by the game \(\mathrm {G}^{\mathsf {ow}}\) shown in Fig. 4. The adversary is given a key \(K\) to \(\mathsf {F}\) and the image y of a random point x in the domain. Its goal is to find a point with the same image. We define the advantage function \(\mathsf {Adv}^{\mathsf {\mathsf {ow}}}_{\mathsf {F},\mathsf {P},\mathcal {A}}(\lambda )=\Pr [\mathrm {G}^{\mathsf {ow}}_{\mathsf {F},\mathsf {P},\mathcal {A}}(\lambda )]\) and say \(\mathsf {F}\) is OW secure with \(\mathsf {P}\) if \(\mathsf {Adv}^{\mathsf {\mathsf {ow}}}_{\mathsf {F},\mathsf {P},\mathcal {A}}(\cdot )\) is negligible for all PPT \(\mathcal {A}\).
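
A sketch of the OW game, instantiating \(\mathsf {F}\) with the purely illustrative choice \(F(K,x)=\mathrm {SHA\hbox {-}256}(K\,\Vert \,x)\):

```python
import hashlib
import secrets

def F(K, x):
    """Illustrative keyed function family: F(K, x) = SHA-256(K || x)."""
    return hashlib.sha256(K + x).digest()

def ow_game(adversary, domain_bytes=8):
    """OW game: sample a key K and a random domain point x, give the
    adversary (K, y) for y = F(K, x); it wins if its output has image y."""
    K = secrets.token_bytes(32)
    x = secrets.token_bytes(domain_bytes)
    y = F(K, x)
    x_prime = adversary(K, y)
    return F(K, x_prime) == y
```

A brute-force adversary wins whenever the domain is small enough to enumerate, which is one reason super-logarithmic input lengths matter in asymptotic statements.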

Security definitions. In the body of this paper we sometimes informally reference other security notions for symmetric encryption schemes (IND-CPA, IND-CCA, IND-KP, IND-$) and function families (PRF, UF-CMA). These definitions are recalled in the full version of this paper [23].

3 New Security Definitions for Symmetric Primitives

In this section we provide our definitions for the security of symmetric cryptographic primitives (namely randomized encryption and pseudorandom functions) against attackers able to adaptively compromise users’ keys.

3.1 Randomized Symmetric Encryption

We describe our security definitions for randomized symmetric encryption. We refer to them as SIM-AC-CPA and SIM-AC-CCA security. The definition of SIM-AC-CPA (resp. SIM-AC-CCA) security is a generalization of IND-CPA (IND-CCA) security to a multi-user setting in which some users’ keys may be compromised by an attacker.

Consider game \(\mathrm {G}^{\mathsf {sim}\hbox {-}\mathsf {ac}\hbox {-}\mathsf {cpa}}\) shown in Fig. 5. It is parameterized by a symmetric encryption scheme \(\mathsf {SE}\), simulator \(\mathsf {S}\), ideal primitive \(\mathsf {P}\), and attacker \(\mathcal {A}_{\mathsf {cpa}}\). The attacker interacts with one of two “worlds” via its oracles \(\textsc {Enc}\), \(\textsc {Exp}\), and \(\textsc {Prim}\). The attacker’s goal is to determine which world it is interacting with.

In the real world (\(b=1\)) the encryption oracle \(\textsc {Enc}\) takes \((\textit{u},m)\) as input and returns an encryption of \(m\) using \(\textit{u}\)’s key \(K_\textit{u}\). Oracle \(\textsc {Prim}\) returns the output of the ideal primitive on input x. Oracle \(\textsc {Exp}\) returns \(\textit{u}\)’s key \(K_\textit{u}\) to the attacker.

In the ideal world (\(b=0\)), the return values of each of these oracles are instead chosen by a simulator \(\mathsf {S}\). In \(\textsc {Prim}\) it is given the input provided to the oracle. In \(\textsc {Enc}\) it is given the name of the current user \(\textit{u}\) and some leakage \(\ell \) about the message \(m\). If \(\textit{u}\) has not yet been exposed \((\textit{u}\not \in {X})\) this leakage is just the length of the message. Otherwise the leakage is the message itself. The inputs and outputs of this oracle for a user \(\textit{u}\) are stored in the lists \(M_\textit{u}\) and \(C_\textit{u}\) so they can be leaked to the simulator when \(\textsc {Exp}(\textit{u})\) is called.

We define \(\mathsf {Adv}^{\mathsf {\mathsf {sim}\hbox {-}\mathsf {ac}\hbox {-}\mathsf {cpa}}}_{\mathsf {SE},\mathsf {S},\mathsf {P},\mathcal {A}_{\mathsf {cpa}}}(\lambda ) =2\Pr [\mathrm {G}^{\mathsf {sim}\hbox {-}\mathsf {ac}\hbox {-}\mathsf {cpa}}_{\mathsf {SE},\mathsf {S},\mathsf {P},\mathcal {A}_{\mathsf {cpa}}}(\lambda )]-1\). We say \(\mathsf {SE}\) is SIM-AC-CPA secure with \(\mathsf {P}\) if for all PPT \(\mathcal {A}_{\mathsf {cpa}}\) there exists a PPT \(\mathsf {S}\) such that \(\mathsf {Adv}^{\mathsf {\mathsf {sim}\hbox {-}\mathsf {ac}\hbox {-}\mathsf {cpa}}}_{\mathsf {SE},\mathsf {S},\mathsf {P},\mathcal {A}_{\mathsf {cpa}}}(\cdot )\) is negligible. Intuitively, this definition captures that ciphertexts reveal nothing about the messages encrypted other than their length unless the encryption key is known to the attacker. In the full version of this paper [23], we show that SIM-AC-CPA security is impossible in the standard model. The proof is a simple application of the ideas of Nielsen  [31].
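The ideal-world bookkeeping described above can be made concrete with a short sketch. This is an illustration of the game's mechanics only; the `Simulator` class (random-bit ciphertexts, 16-byte keys) is a hypothetical stand-in, not the paper's construction.

```python
import os

class Simulator:
    """Hypothetical simulator S: outputs random bits as ciphertexts and
    a random key on exposure (roughly, a random-ciphertext simulator)."""
    def Enc(self, u, leak):          # leak is len(m) or m itself
        n = leak if isinstance(leak, int) else len(leak)
        return os.urandom(n)
    def Exp(self, u, M_u, C_u):      # receives the transcript, outputs a key
        return os.urandom(16)

class IdealWorld:
    """The b = 0 world of G^{sim-ac-cpa}: leakage is length-only until
    Exp(u), after which the simulator sees messages themselves."""
    def __init__(self, S):
        self.S, self.X = S, set()    # X: set of exposed users
        self.M, self.C = {}, {}      # per-user message/ciphertext lists
    def Enc(self, u, m):
        leak = m if u in self.X else len(m)   # length-only pre-exposure
        c = self.S.Enc(u, leak)
        self.M.setdefault(u, []).append(m)    # stored for later Exp(u)
        self.C.setdefault(u, []).append(c)
        return c
    def Exp(self, u):
        self.X.add(u)
        return self.S.Exp(u, self.M.get(u, []), self.C.get(u, []))

world = IdealWorld(Simulator())
c1 = world.Enc("alice", b"secret")   # simulator saw only the length, 6
world.Exp("alice")                   # simulator now receives M_u and C_u
c2 = world.Enc("alice", b"secret")   # post-exposure: message itself leaks
assert len(c1) == 6 and len(c2) == 6
```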

Fig. 5. Games defining SIM-AC-CPA and SIM-AC-CCA security of \(\mathsf {SE}\).

SIM-AC-CCA security extends SIM-AC-CPA security by giving \(\mathcal {A}_{\mathsf {cca}}\) access to a decryption oracle which takes as input \((\textit{u},c)\). In the real world, it returns the decryption of \(c\) using \(K_\textit{u}\). In the ideal world, the simulator simulates this. To prevent trivial attacks, the attacker is disallowed from querying \((\textit{u},c)\) if \(c\) was returned from an earlier query \(\textsc {Enc}(\textit{u},m)\). We define \(\mathsf {Adv}^{\mathsf {\mathsf {sim}\hbox {-}\mathsf {ac}\hbox {-}\mathsf {cca}}}_{\mathsf {SE},\mathsf {S},\mathsf {P},\mathcal {A}_{\mathsf {cca}}}(\lambda ) = 2\Pr [\mathrm {G}^{\mathsf {sim}\hbox {-}\mathsf {ac}\hbox {-}\mathsf {cca}}_{\mathsf {SE},\mathsf {S},\mathsf {P},\mathcal {A}_{\mathsf {cca}}}(\lambda )]-1\). We say \(\mathsf {SE}\) is SIM-AC-CCA secure with \(\mathsf {P}\) if for all PPT \(\mathcal {A}_{\mathsf {cca}}\) there exists a PPT \(\mathsf {S}\) such that \(\mathsf {Adv}^{\mathsf {\mathsf {sim}\hbox {-}\mathsf {ac}\hbox {-}\mathsf {cca}}}_{\mathsf {SE},\mathsf {S},\mathsf {P},\mathcal {A}_{\mathsf {cca}}}(\cdot )\) is negligible.

Simplifications. It will be useful to keep in mind simplifications we can make to restrict the behavior of the adversary or simulator without loss of generality. They are applicable to all SIM-AC-style definitions we provide in this paper.

  • If an oracle is deterministic in the real world, then we can assume that the adversary never repeats a query to this oracle or that the simulator always provides the same output to repeated queries.

  • We can assume the adversary never makes a query to a user it has already exposed or that for such queries the simulator just runs the code of the real world (replacing calls to \(\mathsf {P}\) with calls to \(\mathsf {\mathsf {S}{.}Prim}\)).

  • We can assume the adversary always queries with \(\textit{u}\in [\textit{u}_{\lambda }]\) for some polynomial \(\textit{u}_{(\cdot )}\) or that the simulator is agnostic to the particular strings used to reference users.

  • We can assume that adversaries never make queries that fail “Require” statements. (All requirements of oracles we provide will be efficiently computable given the transcripts of queries the adversary has made.)

These simplifications are slightly more subtle to establish than analogous ones would be in non-simulation-based games, because of the order in which algorithms are quantified in our security definitions. They all follow the same pattern, though, so we sketch the second of these.

Suppose \(\mathsf {SE}\) is SIM-AC-CPA secure for all adversaries that never make a call \(\textsc {Enc}(\textit{u},m)\) after having made a call \(\textsc {Exp}(\textit{u})\); then we claim \(\mathsf {SE}\) is SIM-AC-CPA secure. Let \(\mathcal {A}\) be an arbitrary adversary. We build a wrapper adversary \(\mathcal {A}'\) that simply forwards all of \(\mathcal {A}\)’s queries except for encryption queries made for a user that has already been exposed. In these cases \(\mathcal {A}'\) responds with the output of \(\mathsf {\mathsf {SE}{.}Enc}^{\textsc {Prim}(\cdot )}(1^\lambda ,K_{\textit{u}},m)\) (or \(\bot \) if \(m\not \in \mathsf {\mathsf {SE}{.}M}(\lambda )\)), where \(K_\textit{u}\) is the key last returned from \(\textsc {Exp}(\textit{u})\). Let \(\mathsf {S}'\) be a simulator for \(\mathcal {A}'\). Then we construct \(\mathsf {S}\) for \(\mathcal {A}\) which responds exactly as \(\mathsf {S}'\) would, except in response to encryption queries made for a user that has already been exposed. In these cases \(\mathsf {S}\) responds with the output of \(\mathsf {\mathsf {SE}{.}Enc}^{\mathsf {\mathsf {S}'{.}Prim}(1^\lambda ,\cdot :\sigma )}(1^\lambda ,K_{\textit{u}},m)\), where \(K_\textit{u}\) is the key it last returned for \(\textsc {Exp}(\textit{u})\). It is clear that \(\mathsf {Adv}^{\mathsf {\mathsf {sim}\hbox {-}\mathsf {ac}\hbox {-}\mathsf {cpa}}}_{\mathsf {SE},\mathsf {S},\mathsf {P},\mathcal {A}}(\lambda )=\mathsf {Adv}^{\mathsf {\mathsf {sim}\hbox {-}\mathsf {ac}\hbox {-}\mathsf {cpa}}}_{\mathsf {SE},\mathsf {S}',\mathsf {P},\mathcal {A}'}(\lambda )\) because the view of \(\mathcal {A}\) is identical in the corresponding games.
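The wrapper pattern is mechanical enough to sketch in code. This is an illustration of the query-filtering idea only; `enc` is a hypothetical stand-in for \(\mathsf {\mathsf {SE}{.}Enc}\) (a toy keystream, not a secure scheme), and the game oracles are stubbed as plain functions.

```python
import hashlib

def enc(K, m):                         # hypothetical stand-in for SE.Enc
    pad = hashlib.sha256(K).digest()   # toy keystream (NOT secure)
    return bytes(a ^ b for a, b in zip(m, pad))

class Wrapper:
    """A': forwards the inner adversary's queries, except Enc(u, m) made
    after Exp(u), which it answers itself using the exposed key K_u, so
    the game never sees a post-exposure encryption query."""
    def __init__(self, game_enc, game_exp):
        self.game_enc, self.game_exp = game_enc, game_exp
        self.keys = {}                 # K_u values learned from Exp(u)
    def Enc(self, u, m):
        if u in self.keys:             # u already exposed: answer locally
            return enc(self.keys[u], m)
        return self.game_enc(u, m)     # otherwise forward to the game
    def Exp(self, u):
        self.keys[u] = self.game_exp(u)
        return self.keys[u]

# Stub game oracles, purely to exercise the two branches.
w = Wrapper(lambda u, m: b"G" * len(m), lambda u: b"k" * 32)
assert w.Enc("u", b"abc") == b"GGG"    # unexposed: query was forwarded
w.Exp("u")
assert len(w.Enc("u", b"abc")) == 3    # exposed: computed locally from K_u
```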

Stronger security notions. It is common in the study of symmetric encryption primitives to study stronger security definitions than IND-CPA security. Most schemes instead aim directly for their output to be indistinguishable from random bits (IND-$). This implies IND-CPA security and additional nice properties such as forms of key-privacy.

We can capture such notions by placing restrictions on the behavior of the simulator. Let \(\mathsf {S}\) be a simulator (for which we think of \(\mathsf {\mathsf {S}{.}Enc}\) as being undefined) which additionally defines algorithms \(\mathsf {\mathsf {S}{.}Enc}_1\) and \(\mathsf {\mathsf {S}{.}Enc}_2\) as well as set \(\mathsf {\mathsf {S}{.}Out}\). Then we define simulators \(\mathsf {S}_{\mathsf {k}}[\mathsf {S}]\) and \(\mathsf {S}_{\mathsf {\$}}[\mathsf {S}]\) to be identical to \(\mathsf {S}\) except for the following encryption simulation algorithms.

[Pseudocode figure omitted: encryption simulation algorithms of \(\mathsf {S}_{\mathsf {k}}[\mathsf {S}]\) and \(\mathsf {S}_{\mathsf {\$}}[\mathsf {S}]\).]

Checking \(\ell \in {{\mathbb N}}\) acts as a convenient way of verifying whether the user being queried has been exposed yet. Because \(\mathsf {\mathsf {S}{.}Enc}_1(1^\lambda ,\ell :\sigma )\) is not given \(\textit{u}\) in \(\mathsf {S}_{\mathsf {k}}\), the output of \(\mathsf {S}_{\mathsf {k}}\) is distributed identically for all unexposed users. The class of key-anonymous simulators \(\mathcal {S}_{k}\) is the set of all \(\mathsf {S}_{\mathsf {k}}[\mathsf {S}]\) for some \(\mathsf {S}\). Similarly, \(\mathsf {S}_{\mathsf {\$}}\) always outputs a random bitstring as the ciphertext for any unexposed user. The class of random-ciphertext simulators \(\mathcal {S}_{\$}\) is the set of all \(\mathsf {S}_{\mathsf {\$}}[\mathsf {S}]\) for some \(\mathsf {S}\). Note that \(\mathcal {S}_{\$}\subset \mathcal {S}_{k}\).

We say \(\mathsf {SE}\) is SIM-AC-KP secure with \(\mathsf {P}\) if for all PPT \(\mathcal {A}_{\mathsf {cpa}}\) there exists a PPT \(\mathsf {S}\in \mathcal {S}_{k}\) such that \(\mathsf {Adv}^{\mathsf {\mathsf {sim}\hbox {-}\mathsf {ac}\hbox {-}\mathsf {cpa}}}_{\mathsf {SE},\mathsf {S},\mathsf {P},\mathcal {A}_{\mathsf {cpa}}}(\cdot )\) is negligible. We say that \(\mathsf {SE}\) is SIM-AC-$ secure with \(\mathsf {P}\) if for all PPT \(\mathcal {A}_{\mathsf {cpa}}\) there exists a PPT \(\mathsf {S}\in \mathcal {S}_{\$}\) such that \(\mathsf {Adv}^{\mathsf {\mathsf {sim}\hbox {-}\mathsf {ac}\hbox {-}\mathsf {cpa}}}_{\mathsf {SE},\mathsf {S},\mathsf {P},\mathcal {A}_{\mathsf {cpa}}}(\cdot )\) is negligible. It is straightforward to see that SIM-AC-$ security implies SIM-AC-KP security, which in turn implies SIM-AC-CPA security. Standard counterexamples show that the reverse implications do not hold.
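The restriction on \(\mathcal {S}_{\$}\) simulators can be illustrated with a few lines of code. This is a sketch under our own assumptions (ciphertext length taken as message length plus a 16-byte overhead, and a stubbed exposed-user branch), not the paper's figure.

```python
import os

def S_dollar_enc(u, leak, tag_len=16):
    """Encryption simulation in the style of S_$: when leak is an integer
    (the user is unexposed), output uniformly random bits of a length
    determined only by |m|; when leak is the message (user exposed), the
    simulator may do anything -- stubbed here as zero bytes."""
    if isinstance(leak, int):               # unexposed: only |m| is known
        return os.urandom(leak + tag_len)   # uniform, independent of u
    return b"\x00" * (len(leak) + tag_len)  # exposed branch: stubbed S.Enc_2

c_a = S_dollar_enc("alice", 10)
c_b = S_dollar_enc("bob", 10)
# Pre-exposure outputs are identically distributed across users; this is
# exactly what membership in the key-anonymous class S_k requires, and
# why S_$ ⊂ S_k.
assert len(c_a) == len(c_b) == 26
```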

It is sometimes useful to define security in an all-in-one style, introduced by Rogaway and Shrimpton  [32], which simultaneously requires IND-$ security and INT-CTXT security. In our framework we can define \(\mathcal {S}_{\bot }\) as the class of IND-CCA simulators which always return \(\bot \) for decryption queries to unexposed users. Then we say \(\mathsf {SE}\) is SIM-AC-AE secure with \(\mathsf {P}\) if for all PPT \(\mathcal {A}_{\mathsf {cca}}\) there exists a PPT \(\mathsf {S}\in \mathcal {S}_{\$}\cap \mathcal {S}_{\bot }\) such that \(\mathsf {Adv}^{\mathsf {\mathsf {sim}\hbox {-}\mathsf {ac}\hbox {-}\mathsf {cca}}}_{\mathsf {SE},\mathsf {S},\mathsf {P},\mathcal {A}_{\mathsf {cca}}}(\cdot )\) is negligible.

3.2 Pseudorandom Functions

Typically a symmetric encryption scheme will use a PRF as one of its basic building blocks. For modularity, it will be useful to provide a simulation-based security definition for PRFs in the face of adaptive compromises. In Sect. 6, we show our PRF definition can be applied to construct a SIM-AC secure symmetric encryption scheme. Additionally, in the full version of this paper [23], we show that our definition is of independent use by using it to prove the adaptive security of a searchable symmetric encryption scheme introduced by Cash et al.  [12].

Fig. 6. Game defining multi-user PRF security of \(\mathsf {F}\) in the face of exposures.

The game \(\mathrm {G}^{\mathsf {sim}\hbox {-}\mathsf {ac}\hbox {-}\mathsf {prf}}_{\mathsf {F},\mathsf {S},\mathsf {P},\mathcal {A}_{\mathsf {prf}}}\) is shown in Fig. 6. In the real world, \(\textsc {Ev}\) gives adversary \(\mathcal {A}_{\mathsf {prf}}\) the real output of \(\mathsf {F}\). In the ideal world, \(\textsc {Ev}\)’s output is chosen at random (and stored in the table \(T_{\textit{u}}\)), unless \(\textit{u}\) has already been exposed in which case simulator \(\mathsf {S}\) chooses the output. The table \(T_{\textit{u}}\) is given to \(\mathsf {S}\) when an exposure of \(\textit{u}\) happens so it can output a key consistent with prior \(\textsc {Ev}\) queries; we assume it is easy to iterate over all \((x,T_{\textit{u}}[x])\) pairs for which \(T_{\textit{u}}[x]\) is not \(\bot \). We define \(\mathsf {Adv}^{\mathsf {\mathsf {sim}\hbox {-}\mathsf {ac}\hbox {-}\mathsf {prf}}}_{\mathsf {F},\mathsf {S},\mathsf {P},\mathcal {A}_{\mathsf {prf}}}(\lambda )=2\Pr [\mathrm {G}^{\mathsf {sim}\hbox {-}\mathsf {ac}\hbox {-}\mathsf {prf}}_{\mathsf {F},\mathsf {S},\mathsf {P},\mathcal {A}_{\mathsf {prf}}}(\lambda )]-1\). We say \(\mathsf {F}\) is SIM-AC-PRF secure with \(\mathsf {P}\) if for all PPT \(\mathcal {A}_{\mathsf {prf}}\) there exists a PPT \(\mathsf {S}\) such that \(\mathsf {Adv}^{\mathsf {\mathsf {sim}\hbox {-}\mathsf {ac}\hbox {-}\mathsf {prf}}}_{\mathsf {F},\mathsf {S},\mathsf {P},\mathcal {A}_{\mathsf {prf}}}(\cdot )\) is negligible.
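The lazy-sampling mechanics of the ideal world can be sketched as follows. This is an illustration of the table bookkeeping only; the simulator is stubbed, the post-exposure \(\textsc {Ev}\) branch (which the simulator handles) is left unimplemented, and the 16-byte output length is our assumption.

```python
import os

OUT_LEN = 16                          # illustrative output length

class IdealPRFWorld:
    """Ideal world of G^{sim-ac-prf}: Ev outputs are lazily sampled into
    a per-user table T_u; Exp(u) hands T_u to the simulator, which must
    return a key consistent with those prior answers."""
    def __init__(self, sim_exp):
        self.T = {}                   # T[u][x] = previously sampled output
        self.X = set()                # exposed users
        self.sim_exp = sim_exp        # simulator's Exp handler (stub)
    def Ev(self, u, x):
        if u in self.X:
            raise NotImplementedError("post-exposure Ev is the simulator's job")
        tab = self.T.setdefault(u, {})
        if x not in tab:              # lazy sampling, remembered for Exp
            tab[x] = os.urandom(OUT_LEN)
        return tab[x]
    def Exp(self, u):
        self.X.add(u)
        return self.sim_exp(u, dict(self.T.get(u, {})))

world = IdealPRFWorld(lambda u, T_u: os.urandom(16))  # stub simulator
y1 = world.Ev("u1", b"x")
y2 = world.Ev("u1", b"x")             # repeated query hits the same entry
assert y1 == y2 and len(y1) == OUT_LEN
```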

4 Applications

The value of our definitions stems from their usability in proving the security of protocols constructed from symmetric encryption and pseudorandom functions. In this section, we discuss the application of our definitions to simplify and modularize existing security results of Cash et al.  [12] and Tyagi et al.  [33], and how they imply the notion of equivocable encryption introduced by Jarecki et al.  [24].

4.1 Asymmetric Password-Authenticated Key Exchange: OPAQUE

Password-authenticated key exchange (PAKE) protocols allow a client and a server with a shared password to establish a shared key resistant to offline guessing attacks. Asymmetric PAKE (aPAKE) further considers security in the case of server compromise, meaning that the server must store some secure representation of the password, rather than the password itself.

OPAQUE  [24] is an aPAKE protocol currently being considered for standardization by the IETF. At a high level, OPAQUE is constructed from an oblivious pseudorandom function (OPRF) and an authenticated key exchange protocol (AKE). User key material for the AKE protocol is stored encrypted under a password-derived key from an OPRF. Key exchange proceeds in two steps: (1) the user rederives the encryption key by running the OPRF protocol with the server on their password, then (2) retrieves and decrypts the AKE keys from the server-held ciphertext and proceeds with the AKE protocol. The “commitment problem” arises when an adversary compromises the server state and then later compromises a user password.

Fig. 7. Game defining EQV security of \(\mathsf {SE}\).

Comparison to equivocable encryption. To prove security of their scheme, Jarecki et al. independently propose a weaker version of SIM-AC-CPA that they call equivocable encryption (EQV). Consider game \(\mathrm {G}^{\mathsf {eqv}}\) defined in Fig. 7. An encryption scheme \(\mathsf {SE}\) is equivocable if for any PPT adversary \(\mathcal {A}=(\mathcal {A}_1,\mathcal {A}_2)\), there exists a simulator \(\mathsf {S}\), such that the advantage function \(\mathsf {Adv}^{\mathsf {\mathsf {eqv}}}_{\mathsf {SE},\mathsf {S},\mathsf {P},\mathcal {A}}(\lambda )=2\Pr [\mathrm {G}^{\mathsf {eqv}}_{\mathsf {SE},\mathsf {S},\mathsf {P},\mathcal {A}}(\lambda )]-1\) is negligible. The definition of  [24] does not specify how to incorporate the ideal model, so we make a reasonable assumption.

Note that EQV is a weaker version of SIM-AC-CPA in that it allows for only one user and only one encryption query. Showing SIM-AC-CPA implies EQV can be done with a simple wrapper reduction in which the output of \(\mathcal {A}_1\) from EQV is forwarded to the encryption oracle of SIM-AC-CPA. Since EQV allows for only one encryption query, we can further show that EQV does not imply SIM-AC-CPA. Consider a scheme that uses a key \(K=(K_1,K_2)\) and constructs ciphertexts as \((K_1,\mathsf {Enc}_{K_2}(m))\) unless \(m=K_1\), in which case it is formed as \((K_2,\mathsf {Enc}_{K_2}(m))\). Such a scheme could be secure with respect to EQV but will not be secure in a game that allows multiple encryption queries. Interestingly, showing that our multi-user SIM-AC-CPA notion is implied by its single-user version through a hybrid argument is not straightforward due to managing inconsistencies in simulator state between hybrid steps. We have not been able to prove this result and leave it open for future work. Thus, even if EQV was extended to allow multiple encryption queries, it still may not be widely applicable to situations that require multiple users.
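The two-query attack on the contrived scheme above can be spelled out concretely. This sketch stubs the inner encryption \(\mathsf {Enc}_{K_2}(m)\) with a toy XOR purely for shape; the point is only that the first ciphertext reveals \(K_1\) and querying \(m=K_1\) then reveals \(K_2\), so the whole key leaks without any exposure query.

```python
import secrets

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def contrived_enc(K, m):
    """The counterexample scheme: ciphertexts carry K1 in the clear,
    except on the (adversarially reachable) input m = K1, where they
    carry K2 instead. The inner XOR stands in for Enc_{K2}(m)."""
    K1, K2 = K
    inner = xor(K2, m)                 # toy stand-in for Enc_{K2}(m)
    return (K2, inner) if m == K1 else (K1, inner)

K = (secrets.token_bytes(16), secrets.token_bytes(16))
leaked_K1, _ = contrived_enc(K, b"any 16-byte msg!")  # query 1 leaks K1
leaked_K2, _ = contrived_enc(K, leaked_K1)            # query 2 leaks K2
assert (leaked_K1, leaked_K2) == K   # full key recovered: not SIM-AC-CPA
```

With only a single encryption query allowed, as in EQV, the second step is unavailable and the leak of a random-looking \(K_1\) is simulatable.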

Ultimately, our work fills in the claim of Jarecki et al. that “common encryption modes are equivocable under some idealized assumption”.

4.2 Searchable Symmetric Encryption

In the full version of this paper [23], we show that our symmetric encryption and PRF security definitions are useful for proving the security of searchable symmetric encryption (SSE) schemes. An SSE scheme allows a client with a database of documents to store them in encrypted form on a server while still being able to perform keyword searches on these documents.

As a concrete example, we consider the scheme of Cash et al.  [12], for which they proved non-adaptive security when using a PRF and an IND-$ secure encryption scheme and claimed adaptive security when the PRF is replaced with a random oracle and the encryption scheme is replaced with a specific random-oracle-based scheme. We prove their adaptive result, this time assuming the family of functions is SIM-AC-PRF secure and the encryption scheme is SIM-AC-KP secure. This makes the result more modular because one is no longer restricted to their specific choices of a PRF and encryption scheme constructed from a random oracle. As a concrete benefit of this, their choice of encryption scheme does not provide INT-CTXT security. Replacing the scheme with one that does would otherwise require a separate proof, while our proof allows the user to choose their favorite INT-CTXT secure scheme without requiring any additional proofs (assuming that scheme is SIM-AC-CPA secure).

Our proof is roughly as complex as their non-adaptive proof; it consists of three similar reductions to the security of the underlying primitives. Without our definitions, a full adaptive proof would have been a technically detailed (though “standard” and not conceptually difficult) proof because it would have to deal with programming the random oracle. Perhaps because of this, the authors of  [12] only provided a sketch of the result, arguing that it follows from the same ideas as their non-adaptive proof plus programming the random oracle to maintain consistency. They claim, “[t]he only defects in the simulation occur when an adversary manages to query the random oracle with a key before it is revealed”. This is technically insufficient; a defect also occurs if the same key is sampled multiple times by the simulator (analogously to parts of our proofs for Theorem 3 and Theorem 4). In our SSE proof, we need not address these details because programming the ideal primitive is handled by the assumed simulation security of the underlying primitives.

A large number of other works on SSE have used analogous techniques of constructing a PRF and/or encryption scheme from a random oracle to achieve adaptive security  [1, 2, 9, 12, 13, 17, 20, 21, 25,26,27,28,29, 34]. As we discuss in the full version of this paper [23], these papers all similarly elided the details of the random oracle programming proof and/or made mistakes in writing these details. The mistakes are individually small and not difficult to fix, but their prevalence indicates the value our definitions can provide to modularize and simplify the proofs in these works. We chose to analyze the Cash et al. scheme to highlight the application of our definitions because it was the simplest construction requiring both SIM-AC-PRF and SIM-AC-KP secure and because their thorough non-adaptive proof served as a useful starting point from which to build our proof.

4.3 Self-revocable Encrypted Cloud Storage: BurnBox

In the full version of this paper [23], we consider the BurnBox construction of a self-revocable cloud storage scheme proposed by Tyagi et al.  [33]. Its goal is to help provide privacy in the face of an authority searching digital devices, e.g., searches of mobile phones or laptops at border crossings. In their proposed scheme a user stores encrypted versions of their files on cloud storage. At any point in time they are able to temporarily revoke their own access to these files. Thereby, an authority searching their device is unable to learn the content of these files despite possessing all the secrets stored on the user’s device.

Proving security of their scheme in their security model necessitates solving the “commitment problem.” A simulator is forced to simulate the attacker’s view by providing ciphertexts for files whose contents it does not know, then later produce a plausible-looking key which decrypts the files properly once told the contents. To resolve this issue in their security analysis, they modeled the symmetric encryption scheme in the ideal encryption model (which they introduced for this purpose). We are able to recover their result assuming the SIM-AC-CPA security of the encryption scheme. This provides rigorous justification for the use of practically-used encryption schemes which cannot necessarily be thought of as well modeled by the ideal encryption model (e.g., AES-GCM, which they used in their prototype implementation). Moreover, the proof we obtain is simpler than the original proof of Tyagi et al. because we do not have to reason about the programming of the ideal encryption model. The original proof has a bug in this programming which we discuss in the full version of this paper [23].

5 Symmetric Encryption Security Results

In this section, we show that important existing results about the security of symmetric encryption schemes “carry over” to our new definitions. These results (together with our results in the next section) form the foundation of our claim that encryption schemes used in practice can be considered to achieve SIM-AC-AE security when their underlying components are properly idealized. First, we show that SIM-AC-CPA and INT-CTXT security imply SIM-AC-CCA security. Then we show that the classic Encrypt-then-MAC scheme achieves SIM-AC-CCA security. Each of these results are, conceptually, a straightforward extension of their standard proof. Finally, we show that random oracles and ideal ciphers are SIM-AC-PRF secure and ideal encryption  [33] is SIM-AC-AE secure.

CPA and CTXT imply CCA. The following theorem captures that SIM-AC-CPA and INT-CTXT security imply SIM-AC-CCA security. Bellare and Namprempre  [8] showed the analogous result that IND-CPA and INT-CTXT security imply IND-CCA security.

Theorem 1

If \(\mathsf {SE}\) is SIM-AC-CPA and INT-CTXT secure with \(\mathsf {P}\), then \(\mathsf {SE}\) is SIM-AC-CCA secure with \(\mathsf {P}\).

Proof (Sketch)

Here we sketch the main ideas of the proof. The full details are provided in the full version of this paper [23].

The SIM-AC-CCA simulator we provide is parameterized by a SIM-AC-CPA simulator \(\mathsf {S}_{\mathsf {cpa}}\). As its state, it stores the state \(\sigma \) of \(\mathsf {S}_{\mathsf {cpa}}\) and keeps each \(K_{\textit{u}}\) that it has returned to exposure queries. For \(\textsc {Prim}\), \(\textsc {Enc}\), and \(\textsc {Exp}\) queries it simply runs \(\mathsf {S}_{\mathsf {cpa}}\). For \(\textsc {Dec}\) queries it does one of two things. If \(\textit{u}\) has already been exposed it uses the key it previously returned to run the actual decryption algorithm (with oracle access to \(\mathsf {S}_{\mathsf {cpa}}\)’s emulation of \(\mathsf {P}\)) and returns the result. Otherwise it assumes the adversary has failed at producing a forgery and simply returns \(\bot \). (Note this means we have SIM-AC-AE security if \(\mathsf {SE}\) is SIM-AC-$ secure.)

The SIM-AC-CPA security of \(\mathsf {SE}\) ensures that the adversary cannot differentiate between the real and ideal world queries to \(\textsc {Prim}\), \(\textsc {Enc}\), and \(\textsc {Exp}\). The INT-CTXT security of \(\mathsf {SE}\) does the same for the \(\textsc {Dec}\) queries. In the full proof we show that \(\mathsf {Adv}^{\mathsf {\mathsf {sim}\hbox {-}\mathsf {ac}\hbox {-}\mathsf {cca}}}_{\mathsf {SE},\mathsf {S}_{\mathsf {cca}},\mathsf {P},\mathcal {A}_{\mathsf {cca}}}(\lambda ) \le \mathsf {Adv}^{\mathsf {\mathsf {int}\hbox {-}\mathsf {ctxt}}}_{\mathsf {SE},\mathsf {P},\mathcal {A}_{\mathsf {ctxt}}}(\lambda ) + \mathsf {Adv}^{\mathsf {\mathsf {sim}\hbox {-}\mathsf {ac}\hbox {-}\mathsf {cpa}}}_{\mathsf {SE},\mathsf {S}_{\mathsf {cpa}},\mathsf {P},\mathcal {A}_{\mathsf {cpa}}}(\lambda ).\)    \(\square \)
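The shape of this simulator can be sketched as follows. This is an illustrative skeleton under our own assumptions: `StubCPA` and the one-byte XOR "decryption" are placeholders for \(\mathsf {S}_{\mathsf {cpa}}\) and \(\mathsf {\mathsf {SE}{.}Dec}\), and \(\bot \) is rendered as `None`.

```python
class CCASimulator:
    """SIM-AC-CCA simulator built from a SIM-AC-CPA simulator: Dec
    returns ⊥ (None) for unexposed users, betting on INT-CTXT, and runs
    real decryption with the remembered key for exposed users."""
    def __init__(self, s_cpa, se_dec):
        self.s_cpa = s_cpa            # handles Prim/Enc/Exp queries
        self.se_dec = se_dec          # stand-in for SE.Dec
        self.keys = {}                # each K_u returned to Exp queries
    def Exp(self, u):
        K = self.s_cpa.Exp(u)
        self.keys[u] = K              # remember the key we committed to
        return K
    def Dec(self, u, c):
        if u not in self.keys:        # unexposed: assume no valid forgery
            return None               # ⊥ — justified by INT-CTXT security
        return self.se_dec(self.keys[u], c)

class StubCPA:                        # placeholder SIM-AC-CPA simulator
    def Exp(self, u):
        return b"\x2a"

sim = CCASimulator(StubCPA(), lambda K, c: bytes(b ^ K[0] for b in c))
assert sim.Dec("u", b"\x2b") is None          # pre-exposure: ⊥
sim.Exp("u")
assert sim.Dec("u", b"\x2b") == b"\x01"       # post-exposure: real Dec
```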

Encrypt-then-MAC. Let \(\mathsf {SE}\) be an encryption scheme. Let \(\mathsf {F}\) be a family of functions for which \(\mathsf {\mathsf {F}{.}Inp}(\lambda )=\{0,1\}^*\). Then the Encrypt-then-MAC encryption scheme using \(\mathsf {SE}\) and \(\mathsf {F}\) is denoted \(\mathsf {EtM}[\mathsf {SE},\mathsf {F}]\). Its message space is defined as \(\mathsf {\mathsf {EtM}[\mathsf {SE},\mathsf {F}]{.}M}(\lambda )=\mathsf {\mathsf {SE}{.}M}(\lambda )\). If \(\mathsf {SE}\) expects access to ideal primitive \(\mathsf {P}_1\) and \(\mathsf {F}\) expects access to ideal primitive \(\mathsf {P}_2\), then \(\mathsf {EtM}[\mathsf {SE},\mathsf {F}]\) expects access to \(\mathsf {P}_1\times \mathsf {P}_2\). The key-generation algorithm \(\mathsf {\mathsf {EtM}[\mathsf {SE},\mathsf {F}]{.}Kg}\) returns \(K=(K_{\mathsf {SE}},K_{\mathsf {F}})\) where \(K_\mathsf {SE}\) was sampled with \(\mathsf {\mathsf {SE}{.}Kg}(1^\lambda )\) and \(K_{\mathsf {F}}\) was sampled with \(\mathsf {\mathsf {F}{.}Kg}(1^\lambda )\). Algorithms \(\mathsf {\mathsf {EtM}[\mathsf {SE},\mathsf {F}]{.}Enc}\) and \(\mathsf {\mathsf {EtM}[\mathsf {SE},\mathsf {F}]{.}Dec}\) are defined as follows.

[Pseudocode figure omitted: \(\mathsf {\mathsf {EtM}[\mathsf {SE},\mathsf {F}]{.}Enc}\) and \(\mathsf {\mathsf {EtM}[\mathsf {SE},\mathsf {F}]{.}Dec}\).]
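The Encrypt-then-MAC composition follows the standard pattern: encrypt first, MAC the ciphertext, reject on tag mismatch. The following sketch uses HMAC-SHA256 as a stand-in for a UF-CMA secure \(\mathsf {F}\); the inner "encryption" is a toy XOR stream included only to make the composition runnable, not a secure \(\mathsf {SE}\).

```python
import hashlib, hmac, os

def se_enc(K_se, m):                  # toy SE.Enc stand-in (NOT secure)
    pad = hashlib.sha256(K_se).digest()
    return bytes(a ^ b for a, b in zip(m, pad))

se_dec = se_enc                       # XOR toy: decryption = encryption

def etm_enc(K, m):
    K_se, K_f = K
    c = se_enc(K_se, m)               # 1. encrypt the message
    tag = hmac.new(K_f, c, hashlib.sha256).digest()  # 2. MAC the ciphertext
    return c + tag

def etm_dec(K, ct):
    K_se, K_f = K
    c, tag = ct[:-32], ct[-32:]
    if not hmac.compare_digest(tag, hmac.new(K_f, c, hashlib.sha256).digest()):
        return None                   # ⊥: tag check failed
    return se_dec(K_se, c)

K = (os.urandom(32), os.urandom(32))
ct = etm_enc(K, b"attack at dawn")
assert etm_dec(K, ct) == b"attack at dawn"
tampered = ct[:-1] + bytes([ct[-1] ^ 1])       # flip a bit of the tag
assert etm_dec(K, tampered) is None            # forgery rejected
```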

The following theorem establishes that the generic composition result of Bellare and Namprempre  [8] holds with our simulation-based definitions of security. We sketch its straightforward proof in the full version of this paper [23].

Theorem 2

Let \(\mathsf {SE}\) be an encryption scheme. Let \(\mathsf {F}\) be a family of functions for which \(\mathsf {\mathsf {F}{.}Inp}(\lambda )=\{0,1\}^*\). If \(\mathsf {SE}\) is SIM-AC-CPA secure with \(\mathsf {P}_1\) and \(\mathsf {F}\) is UF-CMA secure with \(\mathsf {P}_2\), then \(\mathsf {EtM}[\mathsf {SE},\mathsf {F}]\) is SIM-AC-CCA secure with \(\mathsf {P}_1\times \mathsf {P}_2\).

Random oracles are good PRFs. We show that a SIM-AC-PRF secure family of functions can be constructed simply in the random oracle model. Consider \(\mathsf {R}\) defined as follows. It is parameterized by a key-length function \(\mathsf {\mathsf {R}{.}kl}:{{\mathbb N}}\rightarrow {{\mathbb N}}\) and output length function \(\mathsf {\mathsf {R}{.}ol}:{{\mathbb N}}\rightarrow {{\mathbb N}}\). It has input set \(\mathsf {\mathsf {R}{.}Inp}(\lambda )=\{0,1\}^*\) and output set \(\mathsf {\mathsf {R}{.}Out}(\lambda )=\{0,1\}^{\mathsf {\mathsf {R}{.}ol}(\lambda )}\).

[Pseudocode figure omitted: algorithms of \(\mathsf {R}\).]
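A natural instantiation consistent with this description is to evaluate the family by querying the random oracle on an encoding of the key and input. The sketch below uses SHA-256 as the stand-in oracle; the length-prefixed encoding and the fixed \(\mathsf {\mathsf {R}{.}kl}\)/\(\mathsf {\mathsf {R}{.}ol}\) values are our assumptions, not necessarily the paper's exact figure.

```python
import hashlib, os

KL, OL = 32, 16                       # R.kl(lambda) and R.ol(lambda), fixed

def ro(data):                         # stand-in for the random oracle P_rom
    return hashlib.sha256(data).digest()

def r_kg():                           # K <-$ R.Kg(1^lambda)
    return os.urandom(KL)

def r_ev(K, x):
    # Length-prefix K so that (K, x) parses unambiguously from the
    # concatenation; otherwise distinct pairs could collide as strings.
    return ro(len(K).to_bytes(2, "big") + K + x)[:OL]

K = r_kg()
y = r_ev(K, b"query")
assert len(y) == OL and r_ev(K, b"query") == y   # deterministic in (K, x)
```

The super-logarithmic key length in Theorem 3 is what makes it unlikely an attacker ever queries the oracle on a key the simulator later samples.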

Theorem 3

\(\mathsf {R}\) is SIM-AC-PRF secure with \(\mathsf {P}_{\mathsf {rom}}\) if \(\mathsf {\mathsf {R}{.}kl}\) is super-logarithmic.

Concretely, in our proof we provide a simulator \(\mathsf {S}_{\mathsf {prf}}\) for which we show that,

$$\begin{aligned} \mathsf {Adv}^{\mathsf {\mathsf {sim}\hbox {-}\mathsf {ac}\hbox {-}\mathsf {prf}}}_{\mathsf {R},\mathsf {S}_{\mathsf {prf}},\mathsf {P}_{\mathsf {rom}},\mathcal {A}_{\mathsf {prf}}}(\lambda ) \le \frac{\textit{u}^2_\lambda + p_\lambda \textit{u}_\lambda }{2^{\mathsf {\mathsf {R}{.}kl}(\lambda )}} \end{aligned}$$

where \(\textit{u}_\lambda \) is an upper bound on the number of users that \(\mathcal {A}_{\mathsf {prf}}\) queries to and \(p_\lambda \) is an upper bound on the number of \(\textsc {Prim}\) queries that \(\mathcal {A}_{\mathsf {prf}}\) makes.

This theorem captures the random oracle programming implicit in the adaptive security claims of the numerous SSE papers we have identified that used a random oracle like a PRF to achieve adaptive security  [1, 2, 9, 12, 13, 17, 20, 21, 25,26,27,28,29, 34]. Of these works, most chose to elide the details of establishing that the adversary cannot detect the random oracle programming, likely considering them simple and/or standard. Despite this, we have identified bugs in all of the proofs that did provide more details. We discuss these bugs in more detail in the full version of this paper [23].

To be clear, we do not claim that any of the SSE schemes studied in these works are insecure. The prevalence of this issue speaks to the difficulty of properly accounting for the details in an ideal model programming proof. Our SIM-AC-PRF notion provides a convenient intermediate definition via which these higher-level protocols could have been proved secure without having to deal with the tedious details of a random oracle programming proof.

Proof (Sketch)

Here we sketch the main ideas of the proof. The full details are provided in the full version of this paper [23]. The SIM-AC-PRF simulator works as follows. For \(\textsc {Prim}\) queries it just emulates \(\mathsf {P}_{\mathsf {rom}}\) using a table T. For \(\textsc {Ev}\) queries, it just runs \(\mathsf {\mathsf {R}{.}Ev}\) honestly with the key it previously returned for the given user. For \(\textsc {Exp}\) queries (on an unexposed user) it picks a random key for this user and sets T to be consistent with the values in the table \(T_{\textit{u}}\) it is given. This simulation is only detectable by an attacker that queries the random oracle with some key that is later chosen by the simulator in response to an exposure, or if the simulator happens to choose the same key for two different users. These events happen with negligible probability.   \(\square \)
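The exposure step of this simulator is where the programming happens, and it can be sketched directly. This is an illustration under our assumptions (plain concatenation as the oracle-query encoding, 16-byte outputs); the "defect" events named in the proof correspond to the programmed entries already existing in the table.

```python
import os

class ROSimulator:
    """Sketch of the Theorem 3 simulator: Prim is a lazily sampled
    table T; Exp(u) samples a fresh key K_u and programs T so that
    oracle queries on (K_u, x) agree with the prior Ev answers T_u."""
    def __init__(self, out_len=16):
        self.T = {}                   # emulated random-oracle table
        self.out_len = out_len
    def Prim(self, q):                # lazy emulation of P_rom
        if q not in self.T:
            self.T[q] = os.urandom(self.out_len)
        return self.T[q]
    def Exp(self, u, T_u):
        K = os.urandom(32)            # fresh key; a defect occurs only if
                                      # K was already queried or was reused
        for x, y in T_u.items():
            self.T[K + x] = y         # program RO to match prior Ev answers
        return K

sim = ROSimulator()
T_u = {b"x1": os.urandom(16)}         # table handed over by the game
K = sim.Exp("u", T_u)
assert sim.Prim(K + b"x1") == T_u[b"x1"]   # consistent after programming
```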

Ideal ciphers are good PRFs. One of the most commonly used PRFs is AES, so it would be useful to think of it as being SIM-AC-PRF secure; however, due to its invertible nature we cannot realistically model it as a random oracle and refer to the above theorem. Instead, AES is often modeled as an ideal cipher. Let \(\mathsf {\mathsf {B}{.}kl}:{{\mathbb N}}\rightarrow {{\mathbb N}}\) and \(n:{{\mathbb N}}\rightarrow {{\mathbb N}}\) be given and consider \(\mathsf {B}\) defined as follows. It has input set \(\mathsf {\mathsf {B}{.}Inp}(\lambda )=\{0,1\}^{n(\lambda )}\) and output set \(\mathsf {\mathsf {B}{.}Out}(\lambda )=\{0,1\}^{n(\lambda )}\).

[Pseudocode figure omitted: algorithms of \(\mathsf {B}\).]

The following establishes that an ideal cipher is SIM-AC-PRF secure.

Theorem 4

\(\mathsf {B}\) is SIM-AC-PRF secure with \(\mathsf {P}_{\mathsf {icm}}^n\) if \(\mathsf {\mathsf {B}{.}kl}\), n are super-logarithmic.

Concretely, in our proof we provide a simulator \(\mathsf {S}_{\mathsf {prf}}\) for which we show that,

$$\begin{aligned} \mathsf {Adv}^{\mathsf {\mathsf {sim}\hbox {-}\mathsf {ac}\hbox {-}\mathsf {prf}}}_{\mathsf {B},\mathsf {S}_{\mathsf {prf}},\mathsf {P}_{\mathsf {icm}}^n,\mathcal {A}_{\mathsf {prf}}}(\lambda ) \le \frac{\textit{u}^2_\lambda + p_\lambda \textit{u}_\lambda }{2^{\mathsf {\mathsf {B}{.}kl}(\lambda )}} + \frac{q^2_{\lambda }}{2^{n(\lambda )+1}} \end{aligned}$$

where \(\textit{u}_\lambda \) is an upper bound on the number of users that \(\mathcal {A}_{\mathsf {prf}}\) queries to, \(p_\lambda \) is an upper bound on the number of \(\textsc {Prim}\) queries that \(\mathcal {A}_{\mathsf {prf}}\) makes, and \(q_{\lambda }\) is an upper bound on the number of \(\textsc {Ev}\) queries that \(\mathcal {A}_{\mathsf {prf}}\) makes.

The proof of this theorem follows the same general pattern as the proof that a random oracle is SIM-AC-PRF secure (Theorem 3). It only needs to extend the ideas of this prior result slightly to apply a birthday bound so that we can treat the values of \(\mathsf {P}_{\mathsf {icm}}^n\) as being sampled with replacement. It works best to process this step last so we do not have to consider the order in which queries are made. The proof is given in the full version of this paper [23].
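The with/without-replacement step can be made concrete with a lazy-sampling sketch. Under one key, an ideal cipher's outputs form a random permutation, so they are sampled without replacement; the birthday term \(q^2_\lambda /2^{n(\lambda )+1}\) bounds the gap to sampling with replacement. The byte-level sizes below are our own illustrative choices.

```python
import os

class LazyIdealCipher:
    """Lazy sampling of an ideal cipher's forward direction: per key,
    outputs are drawn uniformly but resampled on collision, so the
    partial function built up is always injective (a permutation)."""
    def __init__(self, n_bytes=16):
        self.fwd = {}                 # fwd[K][x] = E(K, x)
        self.used = {}                # range points already assigned per key
        self.n = n_bytes
    def E(self, K, x):
        tab = self.fwd.setdefault(K, {})
        if x not in tab:
            used = self.used.setdefault(K, set())
            y = os.urandom(self.n)
            while y in used:          # sample WITHOUT replacement
                y = os.urandom(self.n)
            used.add(y)
            tab[x] = y
        return tab[x]

ic = LazyIdealCipher()
K = b"k"
ys = {ic.E(K, i.to_bytes(2, "big")) for i in range(100)}
assert len(ys) == 100                 # outputs under one key never collide
```

Dropping the resampling loop yields sampling with replacement, and the statistical distance between the two views over \(q\) queries is exactly what the birthday term accounts for.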

Ideal encryption model. In the full version of this paper [23], we recall the ideal encryption model used in the analysis of Tyagi et al.  [33] and show that it gives a SIM-AC-AE secure encryption scheme. While doing so, we identify and show how to fix a bug in their proof which used this model.

6 Security of Modes of Operation

In the previous section, we showed that existing analysis of the integrity of a symmetric encryption scheme carries over to our simulation setting to lift SIM-AC-CPA security to SIM-AC-CCA security. It would be convenient to be able to similarly prove that existing IND-CPA security of an encryption scheme suffices to imply SIM-AC-CPA security. Unfortunately, we cannot possibly hope for this to be the case. We know that IND-CPA security can be achieved in the standard model (assuming one-way functions exist), but SIM-AC-CPA security necessarily requires the use of ideal models.

For any typical encryption scheme we could figure out the appropriate way to idealize its underlying components and then write a programming proof to establish security. This would likely be detail-intensive and prone to mistakes. We can improve on this by noting that typical symmetric encryption schemes are built as modes of operation using an underlying PRF. We can aim to prove security more modularly by assuming the SIM-AC-PRF security of the underlying family of functions. This alleviates the detail-intensiveness of the proof because the ideal model programming has already been handled in the assumption of SIM-AC-PRF security; it can simply be “passed” along to the new analysis.

In this section, we will show that we can do even better than that. We will restrict attention to modes of operation which are IND-$ secure when built from a PRF and satisfy a special extractability property we define in Sect. 6.1 (which standard examples of modes of operation do). Then, in Sect. 6.2, we establish a generic proof framework to elevate an existing IND-$ security proof to a SIM-AC-$ security proof, by showing that existing proofs of IND-$ security tend to (implicitly) prove that the scheme satisfies our extractability property. Finally, in Sect. 6.3 we discuss how the techniques of this section can be extended to other constructions not captured by our formalism, but also note the existence of a (contrived) mode of operation which is IND-$ secure with any secure PRF, but is never SIM-AC-$ secure.

6.1 Modes of Operation and Extractability

We first need to have a formalism capturing what a mode of operation is. Our formalism does not capture all possible modes of operation, but does seem to capture most constructions that are of practical interest and would not be hard to modify to capture other constructions.

A mode of operation \(\mathsf {SE}\) specifies efficient algorithms \(\mathsf {\mathsf {SE}{.}Kg}\), \(\mathsf {\mathsf {SE}{.}Enc}\), and \(\mathsf {\mathsf {SE}{.}Dec}\) as well as sets \(\mathsf {\mathsf {SE}{.}M}\), \(\mathsf {\mathsf {SE}{.}Out}\), \(\mathsf {\mathsf {SE}{.}FInp}\), and \(\mathsf {\mathsf {SE}{.}FOut}\). For any family of functions \(\mathsf {F}\) with \(\mathsf {\mathsf {F}{.}Inp}=\mathsf {\mathsf {SE}{.}FInp}\) and \(\mathsf {\mathsf {F}{.}Out}=\mathsf {\mathsf {SE}{.}FOut}\), it defines a symmetric encryption scheme \(\mathsf {SE}[\mathsf {F}]\) as follows.

figure h

The superscript \(\mathsf {F}_{{K_{\mathsf {F}}}}^{\mathsf {P}}\) is shorthand for oracle access to \(\mathsf {\mathsf {F}{.}Ev}^{\mathsf {P}}(1^\lambda ,{K_{\mathsf {F}}},\cdot )\). It is required that \(\mathsf {\mathsf {SE}[\mathsf {F}]{.}M}=\mathsf {\mathsf {SE}{.}M}\). Moreover, for a given \(\lambda \in {{\mathbb N}}\) the encryption of a message \(m\in \mathsf {\mathsf {SE}{.}M}(\lambda )\) must always be in \(\mathsf {\mathsf {SE}{.}Out}(\lambda ,\left| m\right| )\).
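Schematically, \(\mathsf {SE}[\mathsf {F}]\) samples a key for \(\mathsf {F}\) alongside the mode's own key and gives the mode's algorithms oracle access to \(\mathsf {F}\) under that key. A minimal Python sketch of this composition, assuming a toy one-block mode and HMAC as a stand-in PRF (all concrete names here are illustrative, not the paper's):

```python
import hmac, hashlib, os

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def F(k, x):
    # Stand-in PRF for illustration: HMAC-SHA256 truncated to 16 bytes.
    return hmac.new(k, x, hashlib.sha256).digest()[:16]

# A toy one-block "mode": Enc picks an IV r and masks m with f(r).
toy_mode = {
    "Kg": lambda: b"",  # like CTR, this mode's own key is empty
    "Enc": lambda f, K_SE, m: (lambda r: r + xor(m, f(r)))(os.urandom(16)),
    "Dec": lambda f, K_SE, c: xor(c[16:], f(c[:16])),
}

def compose(mode, F, kl=16):
    """SE[F]: sample K_F alongside the mode's key and hand the mode's
    algorithms oracle access to x -> F(K_F, x)."""
    def kg():
        return (mode["Kg"](), os.urandom(kl))  # (K_SE, K_F)
    def enc(key, m):
        K_SE, K_F = key
        return mode["Enc"](lambda x: F(K_F, x), K_SE, m)
    def dec(key, c):
        K_SE, K_F = key
        return mode["Dec"](lambda x: F(K_F, x), K_SE, c)
    return {"Kg": kg, "Enc": enc, "Dec": dec}

SE = compose(toy_mode, F)
key = SE["Kg"]()
assert SE["Dec"](key, SE["Enc"](key, b"sixteen byte msg")) == b"sixteen byte msg"
```

The point is only the plumbing: the mode never touches \({K_{\mathsf {F}}}\) directly, it only queries the oracle, which is what later lets a simulator substitute \(\mathsf {\mathsf {S}_{\mathsf {F}}{.}Ev}\) for it.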

Suppose we want to prove that \(\mathsf {SE}\) is SIM-AC-$ whenever \(\mathsf {F}\) is SIM-AC-PRF. The natural way to do so is to build our simulator \(\mathsf {S}\) for the encryption scheme from the given simulator \(\mathsf {S}_{\mathsf {F}}\) for \(\mathsf {F}\). In \(\textsc {Prim}\) we can simply have \(\mathsf {\mathsf {S}{.}Prim}\) run \(\mathsf {\mathsf {S}_{\mathsf {F}}{.}Prim}\). In \(\textsc {Enc}\) the ciphertext is chosen at random if the user has not been exposed; otherwise we can simply run \(\mathsf {\mathsf {SE}{.}Enc}\) but use \(\mathsf {\mathsf {S}_{\mathsf {F}}{.}Ev}\) in place of \(\mathsf {F}_{{K_{\mathsf {F}}}}\). This just leaves \(\textsc {Exp}\): here we are given a list of ciphertexts for the user and need to output a key that “explains” them. A natural approach is to randomly pick our own \({K_{\mathsf {SE}}}\) and use \(\mathsf {\mathsf {S}_\mathsf {F}{.}Exp }\) to choose \({K_{\mathsf {F}}}\). Doing so requires giving \(\mathsf {S}_\mathsf {F}\) a list of inputs and outputs of the function family. Intuitively, then, we want to be able to “extract” a list of input-output pairs for \(\mathsf {F}\) that explain our ciphertexts.

Extractability. A mode of operation is extractable if it additionally specifies an efficient extraction algorithm \(\mathsf {\mathsf {SE}{.}Ext}\) satisfying a correctness and uniformity property we now define. The extraction algorithm \(\mathsf {\mathsf {SE}{.}Ext}\) has syntax . The goal of this algorithm is to “extract” a sequence of responses \(\textit{\textbf{y}}\) by \(\mathsf {F}\) and a string of randomness r that explains how message \(m\) could be encrypted to ciphertext \(c\) when using key \({K_{\mathsf {SE}}}\). We formally define correctness by the following game. It is assumed that \(\mathsf {\mathsf {SE}{.}Ext}\) provides outputs of the appropriate lengths to make this code well-defined. Extraction correctness of \(\mathsf {SE}\) requires that \(\Pr [\mathrm {G}^{\mathsf {corr}}_{\mathsf {SE},m}(1^\lambda )]=1\) for all \(\lambda \in {{\mathbb N}}\) and \(m\in \mathsf {\mathsf {SE}{.}M}(\lambda )\).

figure i

We will also require a uniformity property of \(\mathsf {\mathsf {SE}{.}Ext}\). Specifically we require that its output be uniformly random whenever \(c\) is. Formally, there must exist \(q,l:{{\mathbb N}}\times {{\mathbb N}}\rightarrow {{\mathbb N}}\) such that the two distributions on the right above are equivalent for all \(\lambda \in {{\mathbb N}}\), \(m\in \mathsf {\mathsf {SE}{.}M}(\lambda )\), and \({K_{\mathsf {SE}}}\in [\mathsf {\mathsf {SE}{.}Kg}(1^\lambda )]\).Footnote 5

Extraction security. A core step in our proof will require an additional property of \(\mathsf {SE}\) which we will now define. Roughly, the desired property is that if \(\mathsf {\mathsf {SE}{.}Ext}\) is repeatedly used to explain randomly chosen ciphertexts an adversary cannot notice if it causes inconsistent values to be returned to \(\mathsf {\mathsf {SE}{.}Enc}\).

Fig. 8.
figure 8

Game defining IND-AC-EXT security of \(\mathsf {SE}\). Note that the adversary is not given oracle access to the “private” oracle \(\textsc {Rf}\).

Formally, consider the game \(\mathrm {G}^{\mathsf {ind}\hbox {-}\mathsf {ac}\hbox {-}\mathsf {ext}}\) shown in Fig. 8. In it, a key is chosen for each user and then the adversary is given access to an encryption oracle. In this oracle a random ciphertext is sampled. Then \(\mathsf {\mathsf {SE}{.}Ext}\) is run to provide vector \(\textit{\textbf{y}}\) and coins r which explain this ciphertext with respect to the queried message. Finally, \(\mathsf {\mathsf {SE}{.}Enc}\) is run with coins r and access to an oracle \(\textsc {Rf}\) whose behavior depends on the chosen \(\textit{\textbf{y}}\). The ciphertext it outputs is returned to the adversary.

When \(b=0\), this oracle simply returns the entries of \(\textit{\textbf{y}}\), one at a time. The value returned for an input x is stored as \(T_{\textit{u}}[x]\). The behavior when \(b=1\) is similar except that if an input x to \(\textsc {Rf}\) is ever repeated for a user \(\textit{u}\), then the value stored in \(T_{\textit{u}}[x]\) is used instead of the corresponding entry of \(\textit{\textbf{y}}\). The attacker’s goal is to distinguish between these two cases.

The adversary may choose to expose any user \(\textit{u}\), learning \({K_{\mathsf {SE},\textit{u}}}\) and \(T_{\textit{u}}\). After doing so it is no longer able to make \(\textsc {Enc}\) queries to that user (as captured by the second “Require” statement in \(\textsc {Enc}\)). Note that by the uniformity of \(\mathsf {\mathsf {SE}{.}Ext}\) we could instead think of \(\textit{\textbf{y}}\) and r as simply being picked at random without \(\mathsf {\mathsf {SE}{.}Ext}\) being run, but we believe the current framing is conceptually more clear.
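The only difference between the two worlds is whether a repeated \(\textsc {Rf}\) input is answered from the stored table. A small Python sketch of the oracle's behavior for a single \(\textsc {Enc}\) query (illustrative, not the game's actual pseudocode):

```python
def make_rf(b, y, T):
    """Rf for one Enc query: y is the extracted response vector and T is
    the per-user table T_u. With b=0 the next entry of y is always
    returned; with b=1 a repeated input x is answered with T[x] instead."""
    it = iter(y)
    def rf(x):
        v = next(it)          # consume the next entry of y
        if b == 1 and x in T:
            v = T[x]          # b=1: stay consistent on repeated inputs
        T[x] = v
        return v
    return rf
```

With \(b=0\), a repeated input may thus receive two different answers, which is exactly the inconsistency the adversary tries to detect.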

We define \(\mathsf {Adv}^{\mathsf {\mathsf {ind}\hbox {-}\mathsf {ac}\hbox {-}\mathsf {ext}}}_{\mathsf {SE},\mathcal {A}}(\lambda )=2\Pr [\mathrm {G}^{\mathsf {ind}\hbox {-}\mathsf {ac}\hbox {-}\mathsf {ext}}_{\mathsf {SE},\mathcal {A}}(1^\lambda )]-1\) and say that \(\mathsf {SE}\) is IND-AC-EXT secure if \(\mathsf {Adv}^{\mathsf {\mathsf {ind}\hbox {-}\mathsf {ac}\hbox {-}\mathsf {ext}}}_{\mathsf {SE},\mathcal {A}}(\cdot )\) is negligible for all PPT \(\mathcal {A}\). This notion will be used for an important step of the coming security proof. Of the properties required from an extraction algorithm it is typically the most difficult to verify.

Example Modes. As a simple example, we can consider counter-mode encryption. Let \(\mathsf {\mathsf {CTR}{.}ol},\mathsf {\mathsf {CTR}{.}il}:{{\mathbb N}}\rightarrow {{\mathbb N}}\) be fixed and the latter be super-logarithmic. Then \(\mathsf {CTR}\) is defined as follows. Its key generation algorithm, \(\mathsf {\mathsf {CTR}{.}Kg}\), always returns \(\varepsilon \). Its sets are defined by

$$\begin{aligned} \mathsf {\mathsf {CTR}{.}M}(\lambda )=(\{0,1\}^{\mathsf {\mathsf {CTR}{.}ol}(\lambda )})^*,&\ \ \mathsf {\mathsf {CTR}{.}Out}(\lambda ,l)=\{0,1\}^{l+\mathsf {\mathsf {CTR}{.}il}(\lambda )}\\ \mathsf {\mathsf {CTR}{.}FInp}(\lambda )=\{0,1\}^{\mathsf {\mathsf {CTR}{.}il}(\lambda )},&\ \ \mathsf {\mathsf {CTR}{.}FOut}(\lambda )=\{0,1\}^{\mathsf {\mathsf {CTR}{.}ol}(\lambda )}. \end{aligned}$$

Algorithms \(\mathsf {\mathsf {CTR}{.}Enc}\), \(\mathsf {\mathsf {CTR}{.}Dec}\), and \(\mathsf {\mathsf {CTR}{.}Ext}\) are defined below where \(+\) is addition modulo \(2^{\mathsf {\mathsf {CTR}{.}il}(\lambda )}\) with elements of \(\{0,1\}^{\mathsf {\mathsf {CTR}{.}il}(\lambda )}\) interpreted as integers.

figure j

It is clear that \(\mathsf {\mathsf {CTR}{.}Ext}\) is correct and that its outputs are distributed uniformly when \(c\) is picked at random. The IND-AC-EXT security of \(\mathsf {CTR}\) follows from the probabilistic analysis done in existing proofs of security for \(\mathsf {CTR}\), such as the proof of Bellare, Desai, Jokipii, and Rogaway  [6]. The standard analysis simply bounds the probability that any of the values \(r_1+1,\dots ,r_1+l_1,r_2+1,\dots ,r_2+l_2,\dots ,r_q+1,\dots ,r_q+l_q\) collide when the \(r_i\)’s are picked uniformly and the \(l_i\)’s are adaptively chosen (before the corresponding \(r_i\) is chosen).
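To make the extraction-correctness condition concrete, here is a toy byte-oriented version of \(\mathsf {CTR}\) (block and IV length fixed at 16 bytes; details simplified from the bit-level formalism above): re-encrypting \(m\) with the extracted coins against an oracle that replays \(\textit{\textbf{y}}\) reproduces the ciphertext exactly, even a randomly chosen one.

```python
import os

BLK = 16  # CTR.ol and CTR.il, in bytes for this toy version

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def ctr_enc(f, m, coins=None):
    """CTR.Enc: coins is the random IV r; keystream block i is f(r+i+1)."""
    r = coins if coins is not None else os.urandom(BLK)
    ctr = int.from_bytes(r, "big")
    blocks = [m[i:i + BLK] for i in range(0, len(m), BLK)]
    ks = [f(((ctr + i + 1) % 2 ** (8 * BLK)).to_bytes(BLK, "big"))
          for i in range(len(blocks))]
    return r + b"".join(xor(b, k) for b, k in zip(blocks, ks))

def ctr_ext(m, c):
    """CTR.Ext: recover coins r and PRF outputs y that explain c.
    (The key is unused: CTR.Kg returns the empty key.)"""
    r, body = c[:BLK], c[BLK:]
    blocks = [m[i:i + BLK] for i in range(0, len(m), BLK)]
    y = [xor(body[i * BLK:(i + 1) * BLK], b) for i, b in enumerate(blocks)]
    return y, r

# Extraction correctness: a (here random) ciphertext is reproduced exactly
# when CTR.Enc is re-run with coins r and an oracle replaying y.
m = bytes(32)
c = os.urandom(BLK + len(m))
y, r = ctr_ext(m, c)
replay = iter(y)
assert ctr_enc(lambda _x: next(replay), m, coins=r) == c
```

Uniformity is equally visible here: when \(c\) is uniform, so are the IV \(r\) and each \(y_i = c_i \oplus m_i\).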

Other IND-AC-EXT secure modes of operation include cipher-block chaining (CBC), cipher feedback (CFB), and output feedback (OFB).

6.2 Extractability Implies SIM-AC-$ Security

Finally, we can state the main result of this section, that IND-AC-EXT security of an extractable mode of operation implies SIM-AC-$ security.

Theorem 5

Let \(\mathsf {SE}\) be an extractable mode of operation which is IND-AC-EXT secure. Then \(\mathsf {SE}[\mathsf {F}]\) is SIM-AC-$ secure with \(\mathsf {P}\) whenever \(\mathsf {F}\) is SIM-AC-PRF secure with \(\mathsf {P}\) and satisfies \(\mathsf {\mathsf {F}{.}Inp}=\mathsf {\mathsf {SE}{.}FInp}\) and \(\mathsf {\mathsf {F}{.}Out}=\mathsf {\mathsf {SE}{.}FOut}\).

The full proof is given in the full version of this paper [23]. It considers a sequence of games which transition from the real world of \(\mathrm {G}^{\mathsf {sim}\hbox {-}\mathsf {ac}\hbox {-}\mathsf {cpa}}\) to the ideal world (using a simulator we specify). In the first transition we use the security of \(\mathsf {F}\) to replace \(\mathsf {SE}\)’s oracle access to it with oracle access to a lazily-sampled random function (or simulation by a given simulator \(\mathsf {S}_{\mathsf {prf}}\) if the corresponding user has been exposed). Next we modify the game so that (for unexposed users) ciphertexts are chosen at random and then explained by \(\mathsf {\mathsf {SE}{.}Ext}\). Then \(\mathsf {\mathsf {SE}{.}Enc}\) is run with the chosen random coins and oracle access to this explanation (except when a repeated query is made) to produce a modified ciphertext which is returned. The uniformity of \(\mathsf {\mathsf {SE}{.}Ext}\) ensures this game is identical to the prior game. Then we apply the IND-AC-EXT security of \(\mathsf {SE}\) so that the oracle given to \(\mathsf {\mathsf {SE}{.}Enc}\) is not kept consistent on repeated queries. The correctness of \(\mathsf {\mathsf {SE}{.}Ext}\) gives that the output of \(\mathsf {\mathsf {SE}{.}Enc}\) is equal to the \(c\) that was sampled at random. We provide a simulator \(\mathsf {S}_{\$}\) that simulates this game perfectly. It runs \(\mathsf {S}_{\mathsf {prf}}\) whenever the game would. On an exposure it generates the table \(T_{\textit{u}}\) for \(\mathsf {S}_{\mathsf {prf}}\) by running \(\mathsf {\mathsf {SE}{.}Ext}\) on ciphertexts to obtain explanatory outputs of the PRF.

Concretely, in the proof, for any adversary \(\mathcal {A}_{\mathsf {cpa}}\) we construct adversaries \(\mathcal {A}_{\mathsf {prf}}\) and \(\mathcal {A}_{\mathsf {ext}}\) along with a simulator \(\mathsf {S}_{\mathsf {cpa}}\) for which we show

$$\begin{aligned} \mathsf {Adv}^{\mathsf {\mathsf {sim}\hbox {-}\mathsf {ac}\hbox {-}\mathsf {cpa}}}_{\mathsf {SE}[\mathsf {F}],\mathsf {S}_{\mathsf {\$}}[\mathsf {S}_{\mathsf {cpa}}],\mathsf {P},\mathcal {A}_{\mathsf {cpa}}}(\lambda ) \le \mathsf {Adv}^{\mathsf {\mathsf {sim}\hbox {-}\mathsf {ac}\hbox {-}\mathsf {prf}}}_{\mathsf {F},\mathsf {S}_{\mathsf {prf}},\mathsf {P},\mathcal {A}_{\mathsf {prf}}}(\lambda ) + \mathsf {Adv}^{\mathsf {\mathsf {ind}\hbox {-}\mathsf {ac}\hbox {-}\mathsf {ext}}}_{\mathsf {SE},\mathcal {A}_{\mathsf {ext}}}(\lambda ). \end{aligned}$$

In the full version of this paper [23], we show that a variant of IND-AC-EXT security without exposures (which we call IND-EXT) necessarily holds if \(\mathsf {SE}[\mathsf {F}]\) is single-user IND-$ secure for all single-user PRF secure \(\mathsf {F}\)’s. Moreover, we identify that the typical way that IND-EXT security is shown in security proofs for \(\mathsf {SE}\) is by proving a slightly stronger property which suffices to imply IND-AC-EXT security. Thereby, one can obtain a SIM-AC-$ security proof from an IND-$ security proof by using the information-theoretic core of the existing proof.

6.3 Extensions and a Counter-Example Construction

Simple extensions. For encryption schemes not covered by our formalism, it will often be easy to extend the underlying ideas to cover the scheme. Suppose \(\mathsf {SE}\) uses two distinct function families as PRFs; one could extend our mode-of-operation syntax to cover this by giving two separate PRF oracles to the encryption and decryption algorithms. Then security would follow if there is an extraction algorithm satisfying analogous properties which explains outputs for both of the oracles. The proof would just require an additional step in which the second SIM-AC-PRF family is replaced with simulation, as in our transition between games \(\mathrm {G}_0\) and \(\mathrm {G}_1\).

One can analogously prove the SIM-AC-$ security of the Encrypt-then-PRF construction, where instead of a second SIM-AC-PRF function family we have a SIM-AC-$ encryption scheme. From random ciphertexts it is straightforward to extract the required output of the function family and encryption scheme.
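For instance, under the natural formatting assumption that an Encrypt-then-PRF ciphertext has the shape \(c_{\mathsf {enc}}\,\Vert\,tag\) with \(tag=\mathsf {F}(K_{\mathsf {F}},c_{\mathsf {enc}})\), extraction is a single split (a hypothetical sketch; the names and the 16-byte tag length are ours):

```python
def etp_ext(c, tag_len=16):
    """Split a random Encrypt-then-PRF ciphertext c = c_enc || tag into
    the inner ciphertext (to be explained by the inner scheme) and the
    one (input, output) pair claimed for the PRF: F(K_F, c_enc) = tag."""
    c_enc, tag = c[:-tag_len], c[-tag_len:]
    return c_enc, (c_enc, tag)
```

A uniform \(c\) yields a uniform inner ciphertext and a uniform claimed PRF output, matching the uniformity requirement on extraction.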

We can also extend the analysis to cover \(\mathsf {GCM}\) when its nonces are chosen uniformly at random. It is not captured by our current syntax because the encryption algorithm always applies the PRF to the all-zero string to derive a sub-key for a hash function. It is straightforward to extend our extraction ideas to allow consistency on this PRF query while maintaining our general proof technique.

Non-extractable counterexample. We showed our general security result for extractable modes of operation and described how to extend it for some simple variants. One might optimistically hope that SIM-AC-$ security would hold for any IND-$ secure mode of operation (when a SIM-AC-PRF secure function family is used). Unfortunately, we can show that this is not the case. We provide an example mode of operation which is IND-$ secure when using a PRF, but not SIM-AC-CPA secure for any choice of function family. It will be clear that this mode of operation is not extractable, as required by our earlier theorem.

Fix \(n:{{\mathbb N}}\rightarrow {{\mathbb N}}\). Let \(\mathsf {G}\) be a function family that is OW secure with \(\mathsf {P}_{\mathsf {sm}}\) and for which \(\mathsf {\mathsf {G}{.}Kg}(1^\lambda )\) always returns \(\varepsilon \) and \(\mathsf {\mathsf {G}{.}Ev}(1^\lambda ,\varepsilon ,\cdot )\) is always a permutation on \(\{0,1\}^{n(\lambda )}\). Such a \(\mathsf {G}\) is a one-way permutation on n-bits. From \(\mathsf {G}\) we construct our counterexample \(\mathsf {CX}\). It has sets \(\mathsf {\mathsf {CX}{.}Out}(\lambda ,l)=\{0,1\}^{l+n(\lambda )}\) and \(\mathsf {\mathsf {CX}{.}M}(\lambda )=\mathsf {\mathsf {CX}{.}FInp}(\lambda )=\mathsf {\mathsf {CX}{.}FOut}(\lambda )=\{0,1\}^{n(\lambda )}\). Key generation is given by \(\mathsf {\mathsf {CX}{.}Kg}=\mathsf {\mathsf {G}{.}Kg}\). Encryption and decryption are given as follows.

figure k

Above, the superscript \(\varepsilon \) is used as shorthand for the oracle that always returns \(\varepsilon \). Note that this is exactly the behavior of \(\mathsf {G}\)’s expected ideal primitive \(\mathsf {P}_{\mathsf {sm}}\). This counterexample uses the ideas originally introduced by Fischlin et al.  [18] to construct non-programmable random oracles by exploiting a one-way permutation. The construction is not extractable because extraction would require inverting the one-way permutation. The following theorem formally establishes that this is a counterexample.

Theorem 6

Fix \(n:{{\mathbb N}}\rightarrow {{\mathbb N}}\). Let \(\mathsf {G}\) be a one-way permutation on n-bits. Let \(\mathsf {F}\) be a family of functions with \(\mathsf {\mathsf {F}{.}Out}(\lambda )=\mathsf {\mathsf {F}{.}Inp}(\lambda )=\{0,1\}^{n({\lambda })}\) and \(\mathsf {P}\) be an ideal primitive. Then \(\mathsf {CX}[\mathsf {F}]\) is IND-$ secure with \(\mathsf {P}\) if \(\mathsf {F}\) is PRF secure with \(\mathsf {P}\). However, \(\mathsf {CX}[\mathsf {F}]\) is not SIM-AC-CPA secure with \(\mathsf {P}\).

Proof (Sketch)

That \(\mathsf {CX}[\mathsf {F}]\) is IND-$ secure when \(\mathsf {F}\) is PRF secure follows from, e.g., the standard security proof for \(\mathsf {CTR}\) plus the observation that a permutation applied to a PRF is still a PRF. For the negative result, let \(\mathsf {S}\) be any simulator and consider the following SIM-AC-CPA adversary \(\mathcal {A}_{\mathsf {cpa}}\) and OW adversary \(\mathcal {A}\).

figure l

Adversary \(\mathcal {A}_{\mathsf {cpa}}\) queries for the encryption of a random message. Then it exposes the corresponding user and uses the given key to calculate the input-output pair this claims for \(\mathsf {G}\). If this is indeed a valid pair it returns 1; otherwise it returns 0. When \(b=1\), note that \(\mathcal {A}_{\mathsf {cpa}}\) will always return 1. Intuitively, when \(b=0\), adversary \(\mathcal {A}_{\mathsf {cpa}}\) should almost never return 1 because from the perspective of the simulator \(\mathsf {S}\) it looks like y was chosen at random, so finding a pre-image for it requires breaking the security of \(\mathsf {G}\).

This intuition is captured by the adversary \(\mathcal {A}\). It simulates the view \(\mathsf {S}\) would see when run for \(\mathcal {A}_{\mathsf {cpa}}\), except instead of picking \(m\) at random it waits until after running \(\mathsf {\mathsf {S}{.}Enc}\) and sets \(m\leftarrow c_1\,{\oplus }\,y\) where y is the \(\mathsf {G}\) image it was given as input. Note that y is a uniformly random string because \(\mathsf {G}\) is a permutation and \(\mathsf {S}\) is only given the length of the message at this point. Thus, this re-ordering of the calculation of \(m\) does not change the view of \(\mathsf {S}\). By asking \(\mathsf {S}\) for the appropriate key and running \(\mathsf {\mathsf {F}{.}Ev}\), the adversary obtains a potential pre-image for y.

Simple calculations give \(\mathsf {Adv}^{\mathsf {\mathsf {sim}\hbox {-}\mathsf {ac}\hbox {-}\mathsf {cpa}}}_{\mathsf {SE},\mathsf {S},\mathsf {P},\mathcal {A}_{\mathsf {cpa}}}(\lambda )=1-\mathsf {Adv}^{\mathsf {\mathsf {ow}}}_{\mathsf {G},\mathsf {P},\mathcal {A}}(\lambda )\). The latter advantage is negligible from the security of \(\mathsf {G}\), so the former is non-negligible.   \(\square \)

Extensions to PRFs. It is often useful to construct a PRF \(\mathsf {H}\) with large input domains from a PRF \(\mathsf {F}\) with smaller input domains. The smaller PRF \(\mathsf {F}\) is often thought of as being reasonably modeled by a random oracle or ideal cipher. If the larger construction \(\mathsf {H}\) is an indifferentiable construction of a random oracle  [14, 30], then we can apply Theorem 3 to obtain the SIM-AC-PRF security of \(\mathsf {H}\).

In the case that \(\mathsf {H}\) is not indifferentiable, one can often use techniques similar to the above to lift a PRF security proof for \(\mathsf {H}\) to a SIM-AC-PRF security proof for \(\mathsf {H}\) whenever \(\mathsf {F}\) is SIM-AC-PRF secure. Implicit in the existing security proof there will often be a way of “explaining” a random output of \(\mathsf {H}\) with random outputs by \(\mathsf {F}\). On exposure queries, the simulator for \(\mathsf {H}\) would extract these explanations and feed them to the existing simulator for \(\mathsf {F}\) to obtain the key to output. For primitive queries, it would just run the \(\mathsf {F}\) simulator and for evaluation queries after exposure it would just run \(\mathsf {H}\) using the \(\mathsf {F}\) simulator in place of \(\mathsf {F}\).