1 Introduction

Passwords constitute the most ubiquitous form of authentication on the Internet, from the most mundane to the most sensitive applications. The almost universal password authentication method in practice relies on TLS/SSL and consists of the user sending its password to the server under the protection of a client-to-server confidential TLS channel. At the server, the password is decrypted and verified against a one-way image typically computed via hash iterations applied to the password and a random “salt” value. Both the password image and salt are stored for each user in a so-called “password file.” In this way, an attacker who succeeds in stealing the password file is forced to run an exhaustive offline dictionary attack to find users’ passwords given a set (“dictionary”) of candidate passwords. The two obvious disadvantages of this approach are: (i) the password appears in cleartext at the server during login; and (ii) security breaks if the TLS channel is established with a compromised server’s public key (a major concern given today’s too-common PKI failures; see Footnote 1).

Password protocols have been extensively studied in the crypto literature – including in the above client-server setting where the user is assumed to possess an authentic copy of the server’s public key [19, 20], but the main focus has been on password-only protocols where the user does not need to rely on any outside keying material (such as public keys). The basic setting considers two parties that share the same low-entropy password with the goal of establishing shared session keys secure against offline dictionary attacks, namely, against an active attacker that possesses a small dictionary from which the password has been chosen. The only viable option for the attacker should be the inevitable online impersonation attack with guessed passwords. This model, known as password-authenticated key exchange (PAKE), was first studied by Bellovin and Merritt [5] and later formalized by Bellare et al. [4] in the game-based indistinguishability approach. Canetti et al. [12] formalized PAKE in the Universally Composable (UC) framework [11], which better captures PAKE security issues such as the use of arbitrary password distributions, the inputting of wrong passwords by the user, and the common use in practice of related passwords for different services.

Whereas the cryptographic literature on PAKE focuses on the above basic setting, in practice the much more common application of password protocols is in the client-server setting. However, sharing the same password between user and server would mean that a breach of the server leaks plaintext passwords for all its users. Thus, what is needed is that upon a server compromise, and the stealing of the password file, an attacker is forced to perform an exhaustive offline dictionary attack as in the above TLS scenario. No other attack, except for an inevitable online guessing attack, should be feasible. In particular, the two main shortcomings of password-over-TLS mentioned earlier - reliance on public keys and exposure of the password to the server - need to be eliminated. This setting, known as aPAKE, for asymmetric PAKE (also called augmented or verifier-based), was introduced by Bellovin and Merritt [6], later formalized in the simulation-based approach by Boyko et al. [10], and in the UC framework by Gentry et al. [18]. Early protocols proven in the simulation-based model include [10, 28, 29]. Later, Gentry et al. [18] presented a compiler that transforms any UC-PAKE protocol into a UC-aPAKE (adding an extra round of communication and a client’s signature). This was followed by [24], which shows the first simultaneous one-round adaptive UC-aPAKE protocol. In addition, several aPAKE protocols targeting practicality have been proposed, most with ad-hoc security arguments, and some have been (and are being) considered for standardization (see below).

A common unfortunate property of all these aPAKE protocols, including those being proposed for practical use and regardless of their underlying formalism, is that they are all vulnerable to pre-computation attacks. Namely, the attacker \(\mathcal {A}\) can pre-compute a table of values based on a password dictionary D, so that as soon as \(\mathcal {A}\) succeeds in compromising a server it can instantly find a user’s password. This significantly weakens the benefits of security against server compromise that motivate the aPAKE notion in the first place. Moreover, while current definitions require that the attacker cannot exploit a server compromise without incurring a workload proportional to the dictionary size |D|, these definitions allow all this workload to be spent before the actual server compromise happens. Indeed, this weakening in the existing aPAKE security definition [18] is needed to accommodate aPAKE protocols that store a one-way deterministic mapping of the user’s password at the server, say \(H(\mathsf {pw})\). Such protocols trivially fall to a pre-computation attack as the attacker \(\mathcal {A}\) can build a table of \((H(\mathsf {pw}),\mathsf {pw})\) pairs for all \(\mathsf {pw}\in D\), and once it compromises the server, it finds the value \(H(\mathsf {pw})\) associated with a user and immediately, in \(\log (|D|)\) time, finds that user’s password. Such a devastating attack can be mitigated by “personalizing” the password map, e.g., hashing the password together with the user id. This forces \(\mathcal {A}\) to pre-compute separate tables for individual users, yet all this effort can still be spent before the actual server compromise. Note that in the case of passwords transmitted over TLS, pre-computation is prevented since passwords are hashed with a random salt visible to the server only. In contrast, existing aPAKE protocols that do not rely on PKI either do not use salt or, if they do, transmit the salt from server to user in the clear during login (see Footnote 2). Given that password stealing via server compromise is the main avenue for collecting billions of passwords by attackers, the above vulnerability of existing aPAKE protocols to pre-computation attacks is a serious flaw, and in this aspect password-over-TLS is more secure than all known aPAKE schemes.
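To make the pre-computation attack concrete, the following is a minimal sketch, assuming SHA-256 as a stand-in for the map \(H\) and a four-word dictionary; the names `h`, `table`, and `dictionary_attack` are hypothetical and do not come from any of the protocols discussed. It only illustrates why a deterministic public map permits a table lookup at compromise time while a secret random salt forces the dictionary work to happen after the compromise.

```python
import hashlib
import os


def h(data: bytes) -> bytes:
    """Stand-in for the one-way map H (assumption: plain SHA-256)."""
    return hashlib.sha256(data).digest()


dictionary = [b"123456", b"password", b"letmein", b"hunter2"]

# Before any compromise: the attacker tabulates (H(pw), pw) for all pw in D.
table = {h(pw): pw for pw in dictionary}

# Server-side state for one user, in two variants.
user_pw = b"hunter2"
unsalted_record = h(user_pw)          # deterministic, publicly computable
salt = os.urandom(16)                 # secret until the file is stolen
salted_record = h(salt + user_pw)

# After the compromise: a single lookup breaks the unsalted record ...
recovered_fast = table.get(unsalted_record)
# ... while the pre-computed table is useless against the salted record.
recovered_salted = table.get(salted_record)


def dictionary_attack(stolen_salt: bytes, record: bytes):
    """Post-compromise work proportional to |D|, using the stolen salt."""
    for pw in dictionary:
        if h(stolen_salt + pw) == record:
            return pw
    return None
```

Here `recovered_fast` is the user's password, `recovered_salted` is `None`, and only `dictionary_attack(salt, salted_record)`, which can start only once the salt is known, recovers the password from the salted record.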

Our Contributions

We initiate the study of Strong aPAKE (SaPAKE) protocols that strengthen the aPAKE security notion by disallowing pre-computation attacks. We formalize this notion in the Universally Composable (UC) model by modifying the aPAKE functionality from [18] to eliminate an adversarial action which allowed such pre-computation attacks. As we explain above, allowing pre-computation attacks was indeed necessary to model the security of existing aPAKE protocols.

The next contribution is building Strong aPAKE (SaPAKE) protocols. For this we present two generic constructions. The first builds the SaPAKE protocol from any aPAKE protocol (namely one that satisfies the original definition from [18]) so that one can “salvage” existing aPAKE protocols. To do so we resort to Oblivious PRF (OPRF) functions [17, 22], namely, a PRF with an associated two-party protocol that in our case is run between a server \(\mathsf {S}\) that stores a PRF key k and a user \(\mathsf {U}\) with a password \(\mathsf {pw}\). At the end of the interaction, \(\mathsf {U}\) learns the PRF output \(F_k(\mathsf {pw})\) and \(\mathsf {S}\) learns nothing (in particular, nothing about \(\mathsf {pw}\)). We show that by preceding any aPAKE protocol with an OPRF interaction in which \(\mathsf {U}\) computes the value \(\mathsf {rw}=F_k(\mathsf {pw})\) with the help of \(\mathsf {S}\) and uses \(\mathsf {rw}\) as the password in the aPAKE protocol, one obtains a Strong aPAKE protocol. We show that if the OPRF and the given aPAKE protocol are, respectively, UC realizations of the OPRF functionality (defined in [22]) and the original aPAKE functionality from [18], the resultant scheme realizes our UC functionality \(\mathcal {F}_{\mathsf {SaPAKE}}\).

Our second transformation consists of the composition of an OPRF as above with a regular authenticated key exchange protocol AKE. We require UC security for the AKE protocol as well as a property known as resistance to KCI attacks. The latter means that an attacker that learns the secret keys of one party \(\mathsf {P}\), but does not actively control \(\mathsf {P}\), cannot use this information to impersonate another party \(\mathsf {P}'\) to \(\mathsf {P}\). KCI resistance is a common property of most AKE protocols. In our SaPAKE construction, \(\mathsf {U}\) first runs the OPRF with \(\mathsf {S}\) to compute \(\mathsf {rw}=F_k(\mathsf {pw})\); then it runs the AKE protocol with \(\mathsf {S}\) using a private key stored, encrypted using an authenticated encryption under \(\mathsf {rw}\), at \(\mathsf {S}\) who sends it to \(\mathsf {U}\). Crucial to the security of the protocol is the use of authenticated encryption with a “random-key robustness” property, which is achieved naturally by some schemes or otherwise can be easily ensured, e.g., by adding an HMAC to a symmetric encryption scheme. Under these conditions we show that the composed scheme realizes our UC functionality \(\mathcal {F}_{\mathsf {SaPAKE}}\).
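The idea of storing the user's private key at \(\mathsf {S}\) under authenticated encryption keyed by \(\mathsf {rw}\) can be sketched as follows. This is an illustration only, not the paper's specified scheme: it follows the "add an HMAC to a symmetric encryption scheme" remark with an encrypt-then-MAC layout, and it uses a toy SHA-256 counter-mode keystream as the cipher. The function names `seal` and `open_envelope`, the key-derivation labels, and the envelope layout are all assumptions.

```python
import hashlib
import hmac
import os


def _subkey(rw: bytes, label: bytes) -> bytes:
    """Derive independent encryption/MAC keys from rw (assumed derivation)."""
    return hmac.new(rw, label, hashlib.sha256).digest()


def _keystream(key: bytes, nonce: bytes, n: int) -> bytes:
    """Toy stream cipher: SHA-256 in counter mode. Illustration only."""
    out = b""
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:n]


def seal(rw: bytes, private_key: bytes) -> bytes:
    """Encrypt-then-MAC the user's private key under rw."""
    nonce = os.urandom(16)
    stream = _keystream(_subkey(rw, b"enc"), nonce, len(private_key))
    ct = bytes(a ^ b for a, b in zip(private_key, stream))
    tag = hmac.new(_subkey(rw, b"mac"), nonce + ct, hashlib.sha256).digest()
    return nonce + ct + tag


def open_envelope(rw: bytes, envelope: bytes):
    """Return the private key, or None if the tag does not verify under rw."""
    nonce, ct, tag = envelope[:16], envelope[16:-32], envelope[-32:]
    expected = hmac.new(_subkey(rw, b"mac"), nonce + ct, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return None
    stream = _keystream(_subkey(rw, b"enc"), nonce, len(ct))
    return bytes(a ^ b for a, b in zip(ct, stream))


rw_good = hashlib.sha256(b"stand-in for F_k(pw)").digest()
rw_bad = hashlib.sha256(b"stand-in for F_k(wrong guess)").digest()
envelope = seal(rw_good, b"user AKE private key bytes")
```

In this sketch, opening the envelope with a value of \(\mathsf {rw}\) derived from a wrong password returns `None` instead of a plausible-looking key, which is the behavior the MAC is meant to provide.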

Next, we use the above second transformation to instantiate a Strong aPAKE protocol with a very efficient OPRF and any efficient AKE with the KCI property. The OPRF scheme we use, essentially a Chaum-type blinded DH computation, has been proven UC-secure by Jarecki et al. [21, 22]. We show that this OPRF scheme, which we call DH-OPRF (called 2HashDH in [21, 22]), remains secure in spite of changes to the OPRF functionality that we introduce for supporting a stronger OPRF notion needed in our setting. We call the result of this instantiation the OPAQUE protocol.
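A blinded DH computation of this kind can be sketched in a few lines. The sketch below assumes the usual two-hash form \(F_k(x)=H_2(x, H_1(x)^k)\); the precise specification is the one in Appendix A, which is not reproduced here, so the formula, the function names, and the encoding choices should be read as assumptions. The group is a toy safe-prime group (\(P=1019\), \(Q=509\)) chosen only so the arithmetic is easy to follow; it offers no security.

```python
import hashlib
import secrets

# Toy safe-prime group: P = 2*Q + 1 with P and Q prime. Insecure by design.
P = 1019
Q = 509


def hash_to_group(x: bytes) -> int:
    """H1: map x to a non-identity square mod P, i.e. an element of order Q."""
    t = 2 + int.from_bytes(hashlib.sha256(b"H1" + x).digest(), "big") % (P - 3)
    return pow(t, 2, P)


def finalize(x: bytes, y: int) -> bytes:
    """H2: hash the input together with the group element H1(x)^k."""
    data = b"H2" + len(x).to_bytes(4, "big") + x + y.to_bytes(2, "big")
    return hashlib.sha256(data).digest()


def prf(k: int, x: bytes) -> bytes:
    """Direct evaluation by the key holder: F_k(x) = H2(x, H1(x)^k)."""
    return finalize(x, pow(hash_to_group(x), k, P))


def user_blind(x: bytes):
    """User picks a random exponent r and sends a = H1(x)^r."""
    r = 1 + secrets.randbelow(Q - 1)
    return r, pow(hash_to_group(x), r, P)


def server_evaluate(k: int, a: int) -> int:
    """Server raises the blinded element to its key; it never sees x."""
    return pow(a, k, P)


def user_unblind(x: bytes, r: int, b: int) -> bytes:
    """User removes the blinding exponent and applies H2."""
    r_inv = pow(r, -1, Q)  # modular inverse, Python 3.8+
    return finalize(x, pow(b, r_inv, P))


k = 1 + secrets.randbelow(Q - 1)      # server's OPRF key
r, a = user_blind(b"hunter2")         # user -> server: a
b = server_evaluate(k, a)             # server -> user: b
rw = user_unblind(b"hunter2", r, b)   # user's output
```

Correctness follows from \((H_1(x)^{r})^{k\cdot r^{-1}} = H_1(x)^k\) in a group of order \(Q\). A real implementation would use a cryptographically sized group and would have the user check that the server's response is a valid group element, both of which this sketch omits.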

OPAQUE combines the best properties of existing aPAKE protocols and of the standard password-over-TLS. As any aPAKE-secure protocol, it offers two fundamental advantages over the TLS-based solution: It does not rely on PKI and the plaintext password is never in the clear at the server. The only way for an attacker that observes (or actively controls) a session at a server to learn the password is via an exhaustive offline dictionary attack. Watching or participating in a session with the user does not help the attacker. At the same time, OPAQUE resolves the major flaw of existing aPAKE protocols relative to password-over-TLS, namely, their vulnerability to pre-computation attacks.

In addition to the above fundamental properties, OPAQUE enjoys important properties for use in practice. Its modularity allows for its use with different key-exchange schemes that can provide different features and performance tradeoffs. When implemented with a 2-message implicit-authentication KE protocol (e.g., HMQV [27]), OPAQUE takes only 2 messages (or 3 with mutual explicit authentication). The computational cost (using the DH-OPRF scheme from Appendix A) is one exponentiation for the server and two for the client (see Footnote 3), in addition to the KE protocol cost (with HMQV, this cost is 2.17 exponentiations per party). OPAQUE offers forward secrecy (a particularly crucial property for password protocols) if the KE does. OPAQUE further supports password hardening for increasing the cost of offline dictionary attacks (upon server compromise) through user-side iterated hashing without the need to transmit salt from \(\mathsf {S}\) to \(\mathsf {U}\). In Fig. 7 in Sect. 6 we show an instantiation of OPAQUE in the RO model with HMQV as the AKE.

Compared to the practical aPAKE protocols that have been and are being considered for standardization (cf. [1, 32]), OPAQUE fares clearly better on the security side as the only scheme that offers resistance to pre-computation attacks while all others are vulnerable. Performance-wise, OPAQUE is competitive with the more efficient among these protocols (see Sect. 6). Additional advantages of OPAQUE include its ability to store and retrieve a user’s secrets such as a bitcoin wallet, authentication credentials, encrypted backup keys, etc., and to support a user-transparent server-side threshold implementation [23] (where the only exposure of the user password - or any stored secrets - is in case a threshold of servers is compromised and even then a full dictionary attack is required). Finally, we comment that while OPAQUE can completely replace password authentication in TLS, it can also be used in conjunction with TLS, either for bootstrapping client authentication (via an OPAQUE-retrieved client signing key) or as a hedge against PKI failures. In other words, while we are accustomed to using TLS to protect passwords, OPAQUE can be used to protect TLS.

We stress that variants of OPAQUE have been studied in prior work in several settings but none of these works presents a formal analysis of the protocol as an aPAKE, let alone as a Strong aPAKE, a notion that we introduce here for the first time. While our treatment frames OPAQUE in the context of Oblivious PRFs [21, 22], its design can be seen as an instantiation of the Ford-Kaliski paradigm for password hardening and credential retrieval using Chaum’s blinded exponentiation. Boyen [9] specifies and studies the protocol (called HPAKE) in the setting of client-side halting KDF [8]. Jarecki et al. [21, 22] study a threshold version (also using the OPRF abstraction) in the context of password-protected secret sharing (PPSS) protocols. Because of the relation between PPSS and Threshold PAKE protocols [21], this analysis implies security of OPAQUE as a PAKE protocol in the BPR model [4] but not as an aPAKE (let alone as a strong aPAKE).

2 The Strong aPAKE Functionality

We present the ideal UC Strong aPAKE functionality, \(\mathcal {F}_{\mathsf {SaPAKE}}\), that will serve as our definition of Strong aPAKE security; namely, we call a protocol a secure Strong aPAKE if it realizes \(\mathcal {F}_{\mathsf {SaPAKE}}\). Functionality \(\mathcal {F}_{\mathsf {SaPAKE}}\) is a simple but significant variant of the UC aPAKE functionality \(\mathcal {F}_{\mathsf {aPAKE}}\) from [18] (it was denoted \(\mathcal {F}_{\mathsf {apwKE}}\) in [18]) which we recall in Fig. 1.

The aPAKE functionality of [18] is based on the UC PAKE functionality from [12], and it includes extensions needed for taking care of the asymmetric nature of the aPAKE setting. First, in an aPAKE scheme the server and the user run different programs: The user runs an aPAKE session on a password (via command \(\textsc {UsrSession}\)) while the server runs it on a “password file” \(\mathsf {file}[sid]\) that represents the server’s user-specific state corresponding to the user’s password, e.g., a password hash, which the server creates on input the user’s password during aPAKE initialization, via command \(\textsc {StorePwdFile}\). Furthermore, \(\mathcal {F}_{\mathsf {aPAKE}}\) models a possible compromise of a server, via command \(\textsc {StealPwdFile}\), from which the attacker obtains \(\mathsf {file}[sid]\). Such a compromise subsequently allows the attacker to (1) impersonate the server to the user, via command \(\textsc {Impersonate}\), and (2) find the password via an offline dictionary attack, via command \(\textsc {OfflineTestPwd}\). The way functionality \(\mathcal {F}_{\mathsf {aPAKE}}\) of [18] handles the offline dictionary attack is the focus of the Strong aPAKE functionality we propose, and we discuss it below.

Fig. 1. Functionalities \(\mathcal {F}_{\mathsf {aPAKE}}\) (full text) and \(\mathcal {F}_{\mathsf {SaPAKE}}\) (shadowed text omitted)

Strong aPAKE vs. aPAKE. Our functionality \(\mathcal {F}_{\mathsf {SaPAKE}}\) is almost identical to \(\mathcal {F}_{\mathsf {aPAKE}}\) except that the text with the gray background in Fig. 1 is omitted. That is, the only differences between \(\mathcal {F}_{\mathsf {SaPAKE}}\) and \(\mathcal {F}_{\mathsf {aPAKE}}\) are in the actions upon the stealing of the password file; specifically, \(\mathcal {F}_{\mathsf {SaPAKE}}\) omits recording the \((\textsc {offline},\mathsf {pw})\) pairs and does not allow for \(\textsc {OfflineTestPwd}\) queries made before the \(\textsc {StealPwdFile}\) query. To explain, consider first the definition of \(\mathcal {F}_{\mathsf {SaPAKE}}\), i.e., with the gray text omitted. In this case, the actions upon server compromise, i.e., \(\textsc {StealPwdFile}\), are simple. First, a flag is defined to mark that the password file has been compromised. Second, once this event happens, the adversary is allowed to submit password guesses and be informed if a guess was correct. Note that each guess “costs” the attacker one \(\textsc {OfflineTestPwd}\) query. This, together with the restriction that these queries can only be made after the password file is compromised, ensures that shortcuts in finding the password after such a compromise are not possible, namely that the attacker needs to pay with one \(\textsc {OfflineTestPwd}\) query for each password it wants to test. Thus, pre-computation attacks are made infeasible.

Now, consider the \(\mathcal {F}_{\mathsf {aPAKE}}\) functionality from [18] which includes the text in gray too. This functionality allows the attacker, via \((\textsc {offline},\mathsf {pw})\) records, to make guess queries against the password even before the password file is compromised. The restriction is that the responses to whether a guess was correct or not are provided to the attacker only after a \(\textsc {StealPwdFile}\) event. But note that if one of these guesses was correct, the attacker learns it immediately upon server compromise. This provision was necessary in [18] because the \(\mathsf {file}[sid]\) in their aPAKE construction contains a deterministic publicly-computable hash of the password, thus allowing for a pre-computation attack which lets the adversary instantaneously identify the password with a single table lookup upon server compromise. Indeed, one can think of the pairs \((\textsc {offline},\mathsf {pw})\) in the original \(\mathcal {F}_{\mathsf {aPAKE}}\) functionality as a pre-computed table that the attacker builds over time and which it can use to identify the password as soon as the server is compromised. By eliminating the ability to get guesses \((\textsc {offline},\mathsf {pw})\) answered before server compromise in our \(\mathcal {F}_{\mathsf {SaPAKE}}\) functionality, we make such pre-computation attacks infeasible in the case of a Strong aPAKE.
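The difference between the two functionalities in their handling of offline guesses can be mimicked by a toy state machine. The sketch below is a loose paraphrase of the prose above, not a transcription of Fig. 1: the class name, method names, and return conventions are hypothetical, and sessions, identifiers, and all other commands are left out.

```python
class ToyPasswordFileFunctionality:
    """Toy model of StealPwdFile / OfflineTestPwd handling only."""

    def __init__(self, pw: str, strong: bool):
        self._pw = pw
        self._strong = strong           # True: F_SaPAKE-like, False: F_aPAKE-like
        self._stolen = False            # flag set by StealPwdFile
        self._offline_records = set()   # (offline, pw) records; unused if strong

    def offline_test_pwd(self, guess: str):
        """After compromise: answer the guess. Before: never answer."""
        if self._stolen:
            return guess == self._pw
        if not self._strong:
            # aPAKE-like: remember the early guess, answer it at compromise.
            self._offline_records.add(guess)
        # Strong variant: early guesses are simply dropped.
        return None

    def steal_pwd_file(self):
        """Mark compromise; aPAKE-like variant reveals a matching early guess."""
        self._stolen = True
        if not self._strong and self._pw in self._offline_records:
            return self._pw
        return None


guesses = ["123456", "hunter2", "letmein"]

weak = ToyPasswordFileFunctionality("hunter2", strong=False)
strong = ToyPasswordFileFunctionality("hunter2", strong=True)
for g in guesses:
    weak.offline_test_pwd(g)
    strong.offline_test_pwd(g)

leaked_at_compromise_weak = weak.steal_pwd_file()
leaked_at_compromise_strong = strong.steal_pwd_file()
```

In the aPAKE-like variant the guesses submitted before the compromise act as the pre-computed table, so the password is returned at the moment of compromise; in the strong variant nothing is returned, and each guess must be resubmitted, and paid for, after the compromise.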

Modeling Server Compromise and Offline Dictionary Queries. As in [18], we specify that \(\textsc {StealPwdFile}\) and \(\textsc {OfflineTestPwd}\) messages from \(\mathcal {A^*}\) to \(\mathcal {F}_{\mathsf {SaPAKE}}\) are accounted for by the environment. This is consistent with the UC treatment of adaptive corruption queries and is crucial to our modeling. Note that if the environment does not observe adaptive corruption queries then the ideal model adversary, i.e., the simulator, could immediately corrupt all parties at the beginning of the protocol, learning their private inputs and thus making the work of simulation easier. By making the player-corruption queries, modeled by \(\textsc {StealPwdFile}\) command in our context, observable by the environment, we ensure that the environment’s view of both the ideal and the real execution includes the same player-corruption events. This way we keep the simulator “honest,” because it can only corrupt a party if the environment accounts for it.

The same concern pertains to offline dictionary queries \(\textsc {OfflineTestPwd}\), because if they were not observable by the environment, the ideal adversary could make such queries even if the real adversary does not. In particular, without environmental accounting for these queries the \(\mathcal {F}_{\mathsf {aPAKE}}\) and \(\mathcal {F}_{\mathsf {SaPAKE}}\) functionalities would be equivalent because the simulator could internally gather all the offline dictionary attack queries made by the real-world adversary before server corruption, and it would send them all via the \(\textsc {OfflineTestPwd}\) query to \(\mathcal {F}_{\mathsf {SaPAKE}}\) after server corruption via the \(\textsc {StealPwdFile}\) query. Such a simulator would make the ideal-world view indistinguishable from the real-world view to the environment if the environment does not observe the sequence of \(\textsc {OfflineTestPwd}\) and \(\textsc {StealPwdFile}\) queries.

Finally, we note that the functionality \(\mathcal {F}_{\mathsf {SaPAKE}}\), like \(\mathcal {F}_{\mathsf {aPAKE}}\), has effectively two separate notions of a server corruption. Formally, it considers a static adversarial model where all entities, including users and servers, are either honest or corrupt throughout the lifetime of the scheme. In addition, it allows for an adaptive server compromise of an honest server, via the \(\textsc {StealPwdFile}\) command, which leaks to the adversary the server’s private state corresponding to a particular password file, but it does not give the adversary full control over the server entity. In particular, the accounts on the same server for which the adversary does not explicitly issue the \(\textsc {StealPwdFile}\) command must remain unaffected. We adopt this convention from [18] and we call a server “corrupted” if it is (statically) corrupt and adversarially controlled, and we call an aPAKE instance “compromised” if the adversary steals its password file from the server.

Non-black-box Assumptions. Note that the aPAKE functionality requires the simulator, playing the role of the ideal-model adversary, to detect offline password guesses made by the real-world adversary. As pointed out by [18], this seems to require a non-black-box hardness assumption on some cryptographic primitive, e.g., the Random Oracle Model (ROM), which would allow the simulator to extract a password guess from adversary’s local computation, e.g., a local execution of aPAKE interaction on a password guess and a stolen password file.

Server Initialization. We note that while \(\mathcal {F}_{\mathsf {aPAKE}}\) defines password registration as an internal action of server \(\mathsf {S}\), with the user’s password as a local input, one can modify it to support an interactive procedure between user and server, e.g., to prevent \(\mathsf {S}\) from ever learning the plaintext password. To that end one needs to assume that during the Password Registration phase there is an authenticated channel from server to user, so the user can verify that it is registering the password with the correct server. (Functionality \(\mathcal {F}_{\mathsf {aPAKE}}\) effectively also assumes such authenticated channel because otherwise the user’s password cannot be safely transported to \(\mathsf {S}\).) In practice, the server also needs to verify the user’s identity, and the password file could be created by the user and transported to the server. However, this is beyond the scope of the formal aPAKE functionality.

Fig. 2. Functionality \(\mathcal {F}_{\mathsf {OPRF}}\) with adaptive compromise

3 Oblivious Pseudorandom Function

Oblivious Pseudorandom Functions (OPRF) are a central tool in all our constructions. An OPRF consists of a pseudorandom function family F with an associated two-party protocol run between a server that holds a key k for F and a user with an input x. At the end of the interaction, the user learns the PRF output \(F_k(x)\) and nothing else, and the server learns nothing (in particular, nothing about x). The notion of OPRF was introduced in [17]. The first UC formulation of it was given in [21], including a verifiability property that lets the user check the correct behavior of the server during the OPRF execution. Later [22] gave an alternative UC definition of OPRF which dispensed with the verifiability property, allowing for more efficient instantiations. The main idea in the OPRF formulations of [21, 22] is the use of a ticketing mechanism \(\mathsf {tx}(\cdot )\) that ensures that the number of input values on which anyone can compute the OPRF on a key held by an honest server \(\mathsf {S}\) is no more than the number of executions of the OPRF recorded by \(\mathsf {S}\). This mechanism dispenses with the need to extract users’ inputs as is typically needed in UC simulations and it leads to much more efficient OPRF instantiations.
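The counting argument behind the ticketing mechanism can be illustrated with a toy model. The sketch below is an assumption-laden simplification: it keeps one counter per key, adds a ticket each time the server completes a run, and spends a ticket for each evaluation, with outputs drawn from a lazily sampled random function. The class and method names are hypothetical, and the message names and exact bookkeeping of the actual functionality are not reproduced.

```python
import os


class ToyTicketedPRF:
    """Toy model: evaluations are capped by the number of server-side runs."""

    def __init__(self, out_len: int = 32):
        self.out_len = out_len
        self.tickets = 0   # plays the role of the counter tx
        self.table = {}    # lazily sampled random function

    def server_completes_run(self) -> None:
        """The honest server records one more execution."""
        self.tickets += 1

    def evaluate(self, x: bytes):
        """Spend one ticket to learn the function value on an input x."""
        if self.tickets == 0:
            return None    # no recorded execution left to pay for this input
        self.tickets -= 1
        if x not in self.table:
            self.table[x] = os.urandom(self.out_len)
        return self.table[x]


f = ToyTicketedPRF()
without_ticket = f.evaluate(b"guess-1")   # refused: the server ran nothing yet
f.server_completes_run()
first = f.evaluate(b"guess-1")            # allowed: one run, one evaluation
second = f.evaluate(b"guess-2")           # refused: the ticket is spent
```

The point of the model is that the cap holds without the functionality ever needing to know which input a given protocol run was for, which is why input extraction is not required.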

Here we adopt the formulation from [22] as the basis for our definition of functionality \(\mathcal {F}_{\mathsf {OPRF}}\) presented in Fig. 2. We refer to [22] for detailed rationale, but we note that it requires PRF outputs to be pseudorandom even to the owner of the PRF key k. This does not seem achievable without non-black-box assumptions, but it is achievable, indeed very efficiently, in the Random Oracle Model (ROM). Note that the reliance on non-black-box assumptions like ROM is called for in the aPAKE context, see Sect. 2.

Changes from OPRF Functionality of [22]. To use UC OPRF in our application(s) we need to make some changes to the way functionality \(\mathcal {F}_{\mathsf {OPRF}}\) was defined in [22], as described below. Changes (2), (3) and (4) are essentially syntactic and require only cosmetic changes in the security argument. Change (1) is the only one which influences the security argument in a more essential way. Fortunately, the DH-OPRF protocol that we use for OPRF instantiation in our protocols, shown in [22] to realize their version of the OPRF functionality \(\mathcal {F}_{\mathsf {OPRF}}\), also realizes our modified \(\mathcal {F}_{\mathsf {OPRF}}\) functionality. We recall the DH-OPRF protocol in Fig. 9 in Appendix A, adapting its syntax to our changes in \(\mathcal {F}_{\mathsf {OPRF}}\), and we argue that the security proof of [22] which shows that it realizes \(\mathcal {F}_{\mathsf {OPRF}}\) defined by [22] extends to the modified functionality \(\mathcal {F}_{\mathsf {OPRF}}\) presented here.

(1) We extend the OPRF functionality to allow the adaptive compromise of a server holding the PRF key via a \(\textsc {Compromise}\) message. Such action is needed in the aPAKE setting where the attacker \(\mathcal {A^*}\) can compromise a server’s password file that contains the server’s OPRF key. After the compromise, \(\mathcal {A^*}\) is allowed to compute that server’s PRF function by itself on any value of its choice using \(\textsc {OfflineEval}\) and without the restrictions of the ticketing mechanism. We note that functionality \(\mathcal {F}_{\mathsf {OPRF}}\) distinguishes between (statically) corrupted servers and (adaptively) compromised sessions (the latter representing different OPRF keys at the same server). This distinction allows for a granular separation between compromised and uncompromised OPRF keys held by the same server. We adopt this distinction for consistency with the aPAKE functionality from Fig. 1 that distinguishes between an entirely corrupted server and particular aPAKE instances that can be adaptively compromised by an adversary.

(2) We change the \(\textsc {SndrComplete}\) message such that it is sent from \(\mathsf {S}\) instead of \(\mathcal {A}\), thus restricting the number of OPRF invocations per \( ssid \) to one. This enforces a single password guess per aPAKE sub-session which is crucial for aPAKE security.

(3) We change the session-id syntax used in [22] to model the use of multiple OPRF keys by the same server. In the formulation of [22] each PRF key was identified with a server identity making a one-to-one correspondence between OPRF keys and servers. Here, we allow multiple OPRF keys to be associated with one server. Each such key is identified with a tag \( sid \) and a server can be associated with multiple such tags. In the context of our application to aPAKE protocols, each aPAKE session is associated with a unique OPRF key used by the server for a particular user, so the session-id \( sid \) corresponds to a user account at that server. Any \( sid \) can include sub-sessions, denoted by \( ssid \), corresponding to different runs of the OPRF protocol between a user and a server.

(4) We add an Initialization phase to the functionality, which models a server picking an OPRF key and, in addition, computing the OPRF value on any input. This interface simplifies the usage of OPRF in our applications to aPAKE, where the server will pick an OPRF key for a new user and evaluate the OPRF on the user’s password (for generating an encryption key). This modeling differs from [22] who framed OPRF initialization as an interactive procedure through an \(\textsc {Eval}\) call while here it is performed locally by the server.

4 A Compiler from aPAKE to Strong aPAKE via OPRF

In Fig. 3 we specify a compiler that transforms any OPRF and any aPAKE into a Strong aPAKE protocol. In UC terms the Strong aPAKE protocol is defined in the \((\mathcal {F}_{\mathsf {OPRF}},\mathcal {F}_{\mathsf {aPAKE}})\)-hybrid world, for \(\mathcal {F}_{\mathsf {OPRF}}\) with the output length parameter \(\ell =2\tau \). The compiler is simple. First, the user transforms its password \(\mathsf {pw}\) into a randomized value \(\mathsf {rw}\) by interacting with the server in an OPRF protocol where the user inputs \(\mathsf {pw}\) and the server inputs the OPRF key. Nothing is learned at the server about \(\mathsf {pw}\) (i.e., \(\mathsf {rw}\) is indistinguishable from random as long as the input \(\mathsf {pw}\) is not queried as input to the OPRF). Next, the user sets \(\mathsf {rw}\) as its password in the given aPAKE protocol. Note that since the password \(\mathsf {rw}\) is taken from a pseudorandom set, then even if the size of this set is the same as the original dictionary D from which \(\mathsf {pw}\) was taken, the pseudorandom set is unknown to the attacker (the attacker can only learn this set via OPRF queries which require an online dictionary attack). Thus, any previous ability to run a pre-computation attack against the aPAKE protocol based on dictionary D is now lost.

We assume that \(\mathcal {A}\) always simultaneously sends queries \((\textsc {Compromise}, sid )\) and \((\textsc {StealPwdFile}, sid )\) for the same \( sid \), to \(\mathcal {F}_{\mathsf {OPRF}}\) and \(\mathcal {F}_{\mathsf {aPAKE}}\) respectively, because in any instantiation of this scheme the server’s OPRF-related state and aPAKE-related state would be part of the same \(\mathsf {file}[sid]\). Consequently, for a single \( sid \), \(\mathsf {S}\)’s status (\(\textsc {compromised}\) or not) in \(\mathcal {F}_{\mathsf {OPRF}}\) and \(\mathcal {F}_{\mathsf {aPAKE}}\) is always the same.

Fig. 3. Strong aPAKE protocol in the \((\mathcal {F}_{\mathsf {OPRF}},\mathcal {F}_{\mathsf {aPAKE}})\)-hybrid world

4.1 Proof of Security

Theorem 1

The protocol in Fig. 3 UC-realizes the \(\mathcal {F}_{\mathsf {SaPAKE}}\) functionality assuming access to the OPRF functionality \(\mathcal {F}_{\mathsf {OPRF}}\) and aPAKE functionality \(\mathcal {F}_{\mathsf {aPAKE}}\).

Concretely, for any adversary \(\mathcal {A}\) against the protocol, there is a simulator \({\mathsf {SIM}}\) that produces a view in the simulated ideal world (henceforth simulated world) such that the advantage that an environment has in distinguishing between this view and the view in the \((\mathcal {F}_{\mathsf {OPRF}},\mathcal {F}_{\mathsf {aPAKE}})\)-hybrid real world (henceforth real world) is no more than \((q_F^2+2q_O+6)/2^{2\tau +1}\), where \(\tau \) is the security parameter, \(q_F\) is the number of \(\textsc {Eval}\) and \(\textsc {OfflineEval}\) messages aimed at \(\mathcal {F}_{\mathsf {OPRF}}\) from \(\mathcal {A}\), and \(q_O\) is the number of \(\textsc {OfflineTestPwd}\) messages aimed at \(\mathcal {F}_{\mathsf {aPAKE}}\) from \(\mathcal {A}\). (In the real world, \(\mathcal {A}\) sends the messages to \(\mathcal {F}_{\mathsf {OPRF}}\) and \(\mathcal {F}_{\mathsf {aPAKE}}\). In the simulated world, \(\mathcal {A}\) sends the messages to \({\mathsf {SIM}}\) acting as both \(\mathcal {F}_{\mathsf {OPRF}}\) and \(\mathcal {F}_{\mathsf {aPAKE}}\).)
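To get a feel for the size of this bound, the following sketch evaluates \((q_F^2+2q_O+6)/2^{2\tau +1}\) on a logarithmic scale. The parameter values \(\tau =128\), \(q_F=2^{40}\), and \(q_O=2^{60}\) are chosen here purely for illustration and do not come from the paper; the function name is likewise hypothetical.

```python
import math


def log2_distinguishing_bound(tau: int, q_f: int, q_o: int) -> float:
    """Return log2 of (q_F^2 + 2*q_O + 6) / 2^(2*tau + 1)."""
    numerator = q_f ** 2 + 2 * q_o + 6
    return math.log2(numerator) - (2 * tau + 1)


# Illustrative parameters (assumptions, not taken from the paper).
log2_bound = log2_distinguishing_bound(tau=128, q_f=2 ** 40, q_o=2 ** 60)
```

With these values the quadratic term \(q_F^2=2^{80}\) dominates the numerator, so the bound is roughly \(2^{80}/2^{257}=2^{-177}\).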

Due to lack of space, we leave the proof to the full version of this paper.

5 A Compiler from AKE-KCI to Strong aPAKE via OPRF

Our second transformation for building a Strong aPAKE protocol composes an OPRF with an Authenticated Key Exchange (AKE) protocol, “glued” together using authenticated encryption. We require the AKE to be secure in the UC model, namely, to realize the UC KE functionality of [14], but we also require it to be “KCI secure,” a property which we call here “security against reverse impersonation.” The notion of AKE-KCI security has been formalized with a game-based approach in [27], but to the best of our knowledge it has not been formalized in the UC setting, and we present such a formalization in Sect. 5.1.

5.1 UC Definition of AKE-KCI

The KCI notion for KE protocols, which stands for “key-compromise impersonation,” captures the property we call “security against reverse impersonation,” which concerns an attacker \(\mathcal {A}\) who learns party \(\mathsf {P}\)’s long-term keys but otherwise does not actively control \(\mathsf {P}\). Resistance to KCI attacks, or “KCI security” for short, postulates that even though \(\mathcal {A}\) can impersonate \(\mathsf {P}\) to other parties, sessions which \(\mathsf {P}\) itself runs with honest peers need to remain secure. A game-based definition of this notion appears in [27], and here we formalize it in the UC model through functionality \(\mathcal {F}_{\mathsf {AKE-KCI}}\) presented in Fig. 4. We specialize functionality \(\mathcal {F}_{\mathsf {AKE-KCI}}\) to our user-server setting where only servers can be compromised, but it can be extended to allow for compromise of any protocol party.

Fig. 4. Functionality \(\mathcal {F}_{\mathsf {AKE-KCI}}\)

Functionality \(\mathcal {F}_{\mathsf {AKE-KCI}}\) extends the standard KE functionality of [14] with two adversarial actions. The first, \(\textsc {Compromise}\), is targeted at a server and captures the compromise of the server’s keys. The second is \(\textsc {Impersonate}\), which is borrowed from the aPAKE functionality of [18] shown in Fig. 1. This action can only be targeted at users’ sessions, and only for sessions with servers compromised via the \(\textsc {Compromise}\) action, and it marks such a session as \(\textsc {compromised}\), which implies that the attacker can determine the session key this session outputs, via the \(\textsc {NewKey}\) action. This models the fact that a user’s sessions with a compromised \(\mathsf {S}\) as a peer cannot be assumed to be secure since they could have been run with the adversary who has stolen \(\mathsf {S}\)’s keys. However, sessions at \(\mathsf {S}\) itself must not be affected by the \(\textsc {Impersonate}\) action, and they remain secure. All other elements in \(\mathcal {F}_{\mathsf {AKE-KCI}}\) are the same as in the basic UC KE functionality, except for some syntactic specialization to the user-server setting.
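The access rules just described for \(\textsc {Compromise}\), \(\textsc {Impersonate}\) and \(\textsc {NewKey}\) can be pictured with a small state machine. This is only a simplified sketch, not the functionality of Fig. 4: the class `AkeKciSketch`, its method names, and the use of separate session labels for the user and server sides are hypothetical, and the sketch deliberately omits the rule that two honest matching sessions receive the same key.

```python
import secrets


class AkeKciSketch:
    """Toy model of the two adversarial actions described for F_AKE-KCI."""

    def __init__(self, tau_bytes: int = 16):
        self.tau_bytes = tau_bytes
        self.compromised_servers = set()  # servers hit by Compromise
        self.sessions = {}                # label -> role, server, status

    def new_session(self, label, role, server):
        # role is "user" or "server"; every session starts out fresh.
        self.sessions[label] = {"role": role, "server": server, "status": "fresh"}

    def compromise(self, server):
        # Models the theft of the server's long-term keys.
        self.compromised_servers.add(server)

    def impersonate(self, label):
        s = self.sessions[label]
        # Only user sessions whose peer server was compromised can be marked.
        if s["role"] == "user" and s["server"] in self.compromised_servers:
            s["status"] = "compromised"
            return True
        return False

    def new_key(self, label, adversary_key):
        s = self.sessions[label]
        # The adversary dictates the key only for compromised sessions;
        # otherwise the session outputs a fresh random key.
        if s["status"] == "compromised":
            return adversary_key
        return secrets.token_bytes(self.tau_bytes)
```

In this sketch a call to `impersonate` on a server-side session always fails, which mirrors the requirement that sessions at \(\mathsf {S}\) itself remain secure after a compromise.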

AKE-KCI Security of HMQV. Protocol OPAQUE, shown in Fig. 7 in Sect. 6, is a concrete instantiation of the generic Strong aPAKE protocol shown in Sect. 5.2 below, using HMQV [27] as the AKE-KCI protocol. The KCI property of HMQV was proved in [27] in the game-based Canetti-Krawczyk model [13] extended to include KCI security. Here we require UC security, namely, a protocol that realizes functionality \(\mathcal {F}_{\mathsf {AKE-KCI}}\). Fortunately, [14] proves the equivalence of the game-based definition of [13] and their UC AKE formulation. Thanks to this equivalence, HMQV, as a basic KE, is secure in the UC model. More precisely, this applies to the three-message HMQV with client authentication (which satisfies the “ACK” property required for the equivalence in [14]). For the 2-message version of HMQV, the equivalence still holds using the notion of non-information oracle [14], which holds for HMQV under the Computational Diffie-Hellman (CDH) assumption in the RO model. For our purposes, however, we need HMQV to realize the extended AKE-KCI functionality of Fig. 4. Luckily, the equivalence with the game-based definition extends to this case. Indeed, since the original equivalence from [14] holds even in the case of adaptive party corruptions, the \(\textsc {Compromise}\) and \(\textsc {Impersonate}\) actions introduced here – which constitute a limited form of adaptive corruptions – follow as a special case. Finally, we note that the equivalence between the above models also preserves forward secrecy, so this property (proved in the game-based Canetti-Krawczyk model in [27]) holds in the UC model too. We note that by the results in [27], the 3-message HMQV enjoys full PFS while the 2-message version enjoys only weak PFS (against passive attackers only). The above security of HMQV (without including security against the leakage of ephemeral exponents) is based on the CDH assumption in the RO model [27].

Fig. 5. Strong aPAKE based on AKE-KCI in the \(\mathcal {F}_{\mathsf {OPRF}}\)-hybrid world

5.2 Strong aPAKE Construction from OPRF and AKE-KCI

Our Strong aPAKE protocol based on OPRF and AKE-KCI is shown in Fig. 5. The protocol uses the same OPRF tool as the Strong aPAKE construction of Sect. 4, for length parameter \(\ell =2\tau \), which defines the “randomized password” value \(\mathsf {rw}=F_k(\mathsf {pw})\) for user \(\mathsf {U}\)’s password \(\mathsf {pw}\) and OPRF key k held by server \(\mathsf {S}\). We assume that in the AKE-KCI protocol \(\varPi \) each party holds a (private, public) key pair, and that each party runs the Login subprotocol using its key pair and the public key of the counterparty as inputs. In the Password Registration phase, server \(\mathsf {S}\) generates the user \(\mathsf {U}\)’s keys, and \(\mathsf {S}\)’s password file contains \(\mathsf {S}\)’s key pair \(p_s,P_s\); \(\mathsf {U}\)’s public key \(P_u\); and a ciphertext c of \(\mathsf {U}\)’s private key \(p_u\) and the public keys \(P_u\) and \(P_s\), created using an Authenticated Encryption scheme with \(\mathsf {rw}=F_k(\mathsf {pw})\) as the key. After creating the password file, value \(p_u\) is erased at \(\mathsf {S}\). In the Login phase, \(\mathsf {S}\) runs the OPRF with \(\mathsf {U}\), which lets \(\mathsf {U}\) compute \(\mathsf {rw}=F_k(\mathsf {pw})\), and sends c to \(\mathsf {U}\), who decrypts it under \(\mathsf {rw}\) and retrieves its key pair \(p_u,P_u\) together with the server’s key \(P_s\). At that point both parties have appropriate inputs to the AKE-KCI protocol \(\varPi \) to compute the session key.
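The registration and login data flow described above can be sketched as follows. This is a toy illustration under loud assumptions, not the protocol of Fig. 5: `F` is a non-oblivious HMAC stand-in for the OPRF (a real OPRF never shows \(\mathsf {pw}\) to \(\mathsf {S}\)), the group parameters `Q` and `G` are unvetted placeholders, `auth_enc`/`auth_dec` are a toy encrypt-then-MAC, and the AKE-KCI protocol \(\varPi \) itself is not run. All function names are hypothetical.

```python
import hashlib
import hmac
import secrets

# Toy parameters (assumptions for illustration only, not a vetted group).
Q = 2**255 - 19  # prime modulus
G = 2            # base element


def F(k: bytes, pw: str) -> bytes:
    # Non-oblivious stand-in for F_k(pw); a real OPRF hides pw from S.
    return hmac.new(k, pw.encode(), hashlib.sha256).digest()


def keygen():
    priv = secrets.randbelow(Q - 2) + 1
    return priv, pow(G, priv, Q)


def _stream(key: bytes, nonce: bytes, n: int) -> bytes:
    blocks = (hashlib.sha256(key + nonce + i.to_bytes(4, "big")).digest()
              for i in range((n + 31) // 32))
    return b"".join(blocks)[:n]


def _subkeys(rw: bytes):
    return (hmac.new(rw, b"enc", hashlib.sha256).digest(),
            hmac.new(rw, b"mac", hashlib.sha256).digest())


def auth_enc(rw: bytes, m: bytes) -> bytes:
    # Toy encrypt-then-MAC keyed by rw.
    k_enc, k_mac = _subkeys(rw)
    nonce = secrets.token_bytes(16)
    body = bytes(a ^ b for a, b in zip(m, _stream(k_enc, nonce, len(m))))
    return nonce + body + hmac.new(k_mac, nonce + body, hashlib.sha256).digest()


def auth_dec(rw: bytes, c: bytes):
    k_enc, k_mac = _subkeys(rw)
    nonce, body, tag = c[:16], c[16:-32], c[-32:]
    good = hmac.new(k_mac, nonce + body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, good):
        return None  # plays the role of the failure symbol
    return bytes(a ^ b for a, b in zip(body, _stream(k_enc, nonce, len(body))))


def register(pw: str) -> dict:
    k = secrets.token_bytes(32)          # OPRF key held by S
    p_s, P_s = keygen()
    p_u, P_u = keygen()                  # S generates U's key pair
    m = b"".join(x.to_bytes(32, "big") for x in (p_u, P_u, P_s))
    c = auth_enc(F(k, pw), m)
    # p_u is not stored, mirroring its erasure at S.
    return {"k": k, "p_s": p_s, "P_s": P_s, "P_u": P_u, "c": c}


def login_user(file: dict, pw_attempt: str):
    rw = F(file["k"], pw_attempt)        # in the protocol, U learns rw via the OPRF
    m = auth_dec(rw, file["c"])
    if m is None:
        return None                      # U aborts
    return tuple(int.from_bytes(m[i:i + 32], "big") for i in (0, 32, 64))


file = register("correct horse")
creds = login_user(file, "correct horse")  # (p_u, P_u, P_s): U's inputs to the AKE
```

After a successful `login_user`, \(\mathsf {U}\) holds \((p_u,P_u,P_s)\) and \(\mathsf {S}\) holds \((p_s,P_s,P_u)\), which are the inputs each side would pass to \(\varPi \).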

Role of Authenticated Encryption. The Strong aPAKE protocol of Fig. 5 utilizes an Authenticated Encryption scheme \(\mathsf {AE}=(\mathsf {AuthEnc},\mathsf {AuthDec})\) to encrypt and authenticate \(\mathsf {U}\)’s AKE “credential” \(m=(p_u,P_u,P_s)\). We encrypt the whole payload m for simplicity, even though, unlike \(\mathsf {U}\)’s private key \(p_u\), the values \(P_u,P_s\) could be public and need only be authenticated, not encrypted. However, the authentication property of \(\mathsf {AE}\) must apply to the whole payload. Intuitively, \(\mathsf {U}\) must authenticate \(\mathsf {S}\)’s public key \(P_s\), but even if \(\mathsf {U}\) derived only its own key pair \((p_u,P_u)\) using just the secrecy of \(\mathsf {rw}=F_k(\mathsf {pw})\), e.g., using \(\mathsf {rw}\) as randomness in key generation, and \(\mathsf {U}\) then executed AKE on such a \((p_u,P_u)\) pair, the resulting protocol would already be insecure. For example, if an AKE leaks \(\mathsf {U}\)’s public key input \(P_u\) (note that AKE does not guarantee privacy of the public key), then an adversary \(\mathcal {A}\) who engages \(\mathsf {U}\) in a single protocol instance can find \(\mathsf {U}\)’s password \(\mathsf {pw}\) via an offline dictionary attack: \(\mathcal {A}\) runs the OPRF with \(\mathsf {U}\) on some key \(k^*\), and then, given \(P_u\) leaked in the subsequent AKE, it finds \(\mathsf {pw}\) s.t. the key generation outputs \(P_u\) as a public key on randomness \(\mathsf {rw}=F_{k^*}(\mathsf {pw})\).
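The offline dictionary attack in this counterexample can be made concrete with a short sketch. Everything here is an illustrative assumption: `F` is an HMAC stand-in for the OPRF, `keygen_from_rw` is a hypothetical deterministic key generation that uses \(\mathsf {rw}\) as its only randomness, and `Q`, `G` are toy group parameters.

```python
import hashlib
import hmac

Q = 2**255 - 19  # toy prime modulus (assumption)
G = 2            # toy base element (assumption)


def F(k: bytes, pw: str) -> bytes:
    # Stand-in for the OPRF output F_k(pw).
    return hmac.new(k, pw.encode(), hashlib.sha256).digest()


def keygen_from_rw(rw: bytes):
    # The insecure variant: the key pair depends on rw alone.
    p_u = int.from_bytes(hashlib.sha256(b"keygen" + rw).digest(), "big") % Q
    return p_u, pow(G, p_u, Q)


def offline_attack(k_star: bytes, leaked_P_u: int, dictionary):
    # The adversary knows k_star (it ran the OPRF as the server) and tests
    # every candidate password offline against the leaked public key.
    for candidate in dictionary:
        _, P = keygen_from_rw(F(k_star, candidate))
        if P == leaked_P_u:
            return candidate
    return None


k_star = b"adversary-chosen OPRF key"
# What U would derive from its real password and leak during the AKE:
_, leaked = keygen_from_rw(F(k_star, "hunter2"))
print(offline_attack(k_star, leaked, ["123456", "password", "hunter2"]))
```

A single interaction with \(\mathsf {U}\) suffices here because the leaked \(P_u\) acts as a password-dependent value that can be recomputed for every dictionary entry.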

Thus the role of the authentication property in authenticated encryption is to commit \(\mathcal {A}\) to a single guess of \(\mathsf {rw}\) and consequently, given the OPRF key \(k^*\), to a single guess \(\mathsf {pw}\). (Note that our UC OPRF notion implies that F is collision-resistant.) To that end we need the authenticated encryption to satisfy the following property, which we call random-key robustness:Footnote 4 For any efficient algorithm \(\mathcal {A}\), the probability that \(\mathcal {A}\) on input \((k_1,k_2)\) for two random keys \(k_1,k_2\) outputs c s.t. \(\mathsf {AuthDec}_{k_1}(c)\,{\ne }\!\perp \) and \(\mathsf {AuthDec}_{k_2}(c) \,{\ne }\!\perp \) is negligible. In other words, it must be infeasible to create an authenticated ciphertext that successfully decrypts under two different randomly generated keys. This property can be achieved in the standard model using, e.g., encrypt-then-MAC with a MAC that is collision resistant with respect to the message and key, a property enjoyed by HMAC with full hash output. In the RO model used by our aPAKE application one can also enforce it for any authenticated encryption scheme with key k by attaching to its ciphertext c a hash \(H(k,c)\) for a RO hash H with \(2\tau \)-bit outputs.
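Written as an advantage, in the notation later used in Theorem 2, the random-key robustness property reads as follows. The key space symbol \(\mathcal {K}\) is introduced here only for illustration; in the protocol of Fig. 5 the key is \(\mathsf {rw}\), so \(\mathcal {K}=\{0,1\}^{2\tau }\).

\[
\mathbf {Adv}^{\mathsf {RK-RBST}}_{\mathsf {AE},\mathcal {A}}(\tau ) = \Pr \big [\, k_1,k_2 \leftarrow \mathcal {K};\ c \leftarrow \mathcal {A}(k_1,k_2) \;:\; \mathsf {AuthDec}_{k_1}(c) \ne \perp \ \wedge \ \mathsf {AuthDec}_{k_2}(c) \ne \perp \,\big ]
\]

The quantity \(\mathbf {Adv}^{\mathsf {RK-RBST}}_{\mathsf {AE},T}(\tau )\) is then the maximum of this probability over all algorithms \(\mathcal {A}\) with running time T.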

Note on Not Utilizing \(\mathcal {F}_{\mathsf {AKE-KCI}}\). In Fig. 5 we abstract the OPRF protocol as functionality \(\mathcal {F}_{\mathsf {OPRF}}\), but we use the real-world AKE-KCI protocol \(\varPi \), rather than functionality \(\mathcal {F}_{\mathsf {AKE-KCI}}\). The reason for this presentation is that in the KE functionality of [14], of which \(\mathcal {F}_{\mathsf {AKE-KCI}}\) is an extension, it is not clear how to support a usage of the KE protocol on keys which are computed via some mechanism other than the intended KE key generation. The KE functionality of [14] assumes that each entity keeps its private key as a permanent state and authenticates to a counterparty given its identity, and a KE party cannot specify an arbitrary bitstring as its own private key and a counterparty’s public key. This is not how we use AKE in our Strong aPAKE of Fig. 5, precisely because \(\mathsf {U}\) does not keep state and has to reconstruct its keys from a password (via OPRF). However, we can still use the real-world protocol \(\varPi \), which UC-realizes \(\mathcal {F}_{\mathsf {AKE-KCI}}\), giving it the OPRF-computed information as input. In the proof of security we utilize the simulator \(\mathsf {SIM}_{\mathsf {AKE}}\), which shows that \(\varPi \) UC-realizes \(\mathcal {F}_{\mathsf {AKE-KCI}}\), in our simulator construction, but we rely on its correctness only if \(\mathsf {U}\) runs \(\varPi \) on the correctly reconstructed \((p_u,P_u,P_s)\), and if the adversary causes \(\mathsf {U}\) to reconstruct a different string we interpret this as a successful attack on \(\mathsf {U}\)’s login session.

5.3 Proof of Security

In Theorem 2 below we state security of the Strong aPAKE protocol of Fig. 5.

Theorem 2

If protocol \(\varPi \) UC-realizes functionality \(\mathcal {F}_{\mathsf {AKE-KCI}}\) then the protocol in Fig. 5 UC-realizes functionality \(\mathcal {F}_{\mathsf {SaPAKE}}\) in the \(\mathcal {F}_{\mathsf {OPRF}}\)-hybrid model.

Concretely, suppose that there is a simulator \(\mathsf {SIM}_{\mathsf {AKE}}\) such that the distinguishing advantage of an environment \(\mathcal {Z}\) between the real execution of \(\varPi \) and \(\mathcal {Z}\)’s interaction with \(\mathsf {SIM}_{\mathsf {AKE}}\) is at most \(\mathbf {Adv}^{\mathsf {DIST}}_{\mathsf {SIM}_{\mathsf {AKE}},\mathcal {Z}}(\tau )\), where \(\tau \) is the security parameter. Then for any adversary \(\mathcal {A}\) with running time T against the protocol, there is a simulator \({\mathsf {SIM}}\) that produces a view in the simulated world such that the advantage that \(\mathcal {Z}\) has in distinguishing between this view and the view in the real world is no more than \(\mathbf {Adv}^{\mathsf {AUTH}}_{\mathsf {AE},T}(\tau )+q_F^2\cdot \mathbf {Adv}^{\mathsf {RK-RBST}}_{\mathsf {AE},T}(\tau )+2\mathbf {Adv}^{\mathsf {DIST}}_{\mathsf {SIM}_{\mathsf {AKE}},\mathcal {Z}}(\tau )\), where \(q_F\) is the number of \(\textsc {Eval}\) and \(\textsc {OfflineEval}\) messages aimed at \(\mathcal {F}_{\mathsf {OPRF}}\) from \(\mathcal {A}\), and \(\mathbf {Adv}^{\mathsf {AUTH}}_{\mathsf {AE},T}(\tau )\) and \(\mathbf {Adv}^{\mathsf {RK-RBST}}_{\mathsf {AE},T}(\tau )\) are the probabilities that any algorithm in running time T breaks the authenticity of \(\mathsf {AE}\) and the random-key robustness of \(\mathsf {AE}\), respectively.

Proof

For any adversary \(\mathcal {A}\), we construct a simulator \({\mathsf {SIM}}\) as in Fig. 6. While interacting with \(\mathsf {SIM}_{\mathsf {AKE}}\), \({\mathsf {SIM}}\) plays the role of both \(\mathcal {F}_{\mathsf {AKE-KCI}}\) and \(\mathcal {A}\).

Following [11], without loss of generality, we may assume that \(\mathcal {A}\) is a “dummy” adversary that merely passes all its messages and computations to the environment \(\mathcal {Z}\). We omit all interactions with corrupted \(\mathsf {U}\) and \(\mathsf {S}\) where \({\mathsf {SIM}}\) acts as \(\mathcal {F}_{\mathsf {OPRF}}\), since the simulation is trivial (\({\mathsf {SIM}}\) gains all information needed and simply follows the code of \(\mathcal {F}_{\mathsf {OPRF}}\)). To keep notation brief we denote functionality \(\mathcal {F}_{\mathsf {SaPAKE}}\) as \(\mathcal {F}\).

Fig. 6. The simulator \({\mathsf {SIM}}\)

In order to account for the advantage of the environment \(\mathcal {Z}\) in distinguishing between its views in the real world and the simulated world, we compare between these two settings in the different simulator actions and derive the distinguishing advantages in cases where the simulation is not perfect. Below we assume that \(\mathcal {Z}\) issues the \((\textsc {StorePwdFile}, sid ,\mathsf {U},\mathsf {pw})\) command to \(\mathsf {S}\) for some \(\mathsf {pw}\); otherwise any subsequent server-side commands of \(\mathcal {Z}\) will not have any effect.

  • \(\mathsf {file}[sid]=\langle p_s,P_s,P_u,c \rangle \) (from \(\mathcal {A}\)): In both worlds, \(\mathcal {Z}\) receives this message after \(\mathcal {A}\) sends \((\textsc {Compromise}, sid )\) aimed at \(\mathcal {F}_{\mathsf {OPRF}}\) and \((\textsc {StealPwdFile}, sid )\) to \(\mathsf {S}\), provided that \(\mathcal {Z}\) input \((\textsc {StorePwdFile}, sid ,\mathsf {U},\mathsf {pw})\) to \(\mathsf {S}\) previously.

    In both worlds, \(p_s\), \(P_s\) and \(P_u\) are generated in the same way, and c is computed as \(\mathsf {AuthEnc}_\mathsf {rw}(p_u,P_u,P_s)\). The only difference is that \(\mathsf {rw}\) is \(F_{ sid ,\mathsf {S}}(\mathsf {pw})\) in the real world, while it is chosen at random in the simulated world. There is no way for \(\mathcal {Z}\) to distinguish unless and until it queries \(F_{ sid ,\mathsf {S}}(\mathsf {pw})\) by letting \(\mathcal {A}\) send \((\textsc {OfflineEval}, sid ,\mathsf {S},\mathsf {pw})\) aimed at \(\mathcal {F}_{\mathsf {OPRF}}\). However, once \(\mathcal {A}\) sends such a message, \({\mathsf {SIM}}\) sets \(F_{ sid ,\mathsf {S}}(\mathsf {pw})\) to \(\mathsf {rw}\). Therefore, in both worlds, \(F_{ sid ,\mathsf {S}}(\mathsf {pw})=\mathsf {rw}\) and \(\mathcal {Z}\) cannot distinguish.

  • \((\textsc {OfflineEval}, sid ,\rho )\) (from \(\mathcal {A}\)): In both worlds, \(\mathcal {Z}\) receives this message after \(\mathcal {A}\) sends \((\textsc {OfflineEval}, sid ,\mathsf {S},x)\) to \(\mathcal {F}_{\mathsf {OPRF}}\), provided that \(\mathsf {S}\) is corrupted or marked \(\textsc {compromised}\). The selection of \(\rho \) is the same in the two worlds, except that in the simulated world, if \(x=\mathsf {pw}\), \(\rho \) is set to \(\mathsf {rw}\), which was chosen at random in advance, while in the real world, \(\rho \) is always chosen at random directly. There is no way to distinguish between these two cases.

  • \((\textsc {Eval}, sid , ssid ,\mathsf {U},\mathsf {S})\) (from \(\mathcal {A}\)): In both worlds, \(\mathcal {Z}\) receives this message after inputting \((\textsc {UsrSession}, sid , ssid ,\mathsf {S},\mathsf {pw}')\) to \(\mathsf {U}\).

  • c and \((\textsc {SndrComplete}, sid , ssid ,\mathsf {S})\) (from \(\mathcal {A}\)): In both worlds, \(\mathcal {Z}\) receives these two messages after inputting \((\textsc {SvrSession}, sid , ssid )\) to \(\mathsf {S}\). As argued above, \(\mathcal {Z}\) cannot distinguish the two c’s in the two worlds.

  • \((\textsc {abort}, sid , ssid )\) (from \(\mathsf {U}\)): In both worlds, \(\mathcal {Z}\) may receive this message after \(\mathcal {A}\) sends \((\textsc {RcvComplete}, sid , ssid ,\mathsf {S}^*)\) aimed at \(\mathcal {F}_{\mathsf {OPRF}}\) and \(c'\) aimed at \(\mathsf {U}\), provided that (i) there is a record \(\langle ssid ,\mathsf {U},\mathsf {S},\mathsf {pw}'\rangle \) in \(\mathcal {F}_{\mathsf {OPRF}}\) (or a record \(\langle ssid ,\mathsf {U},\mathsf {S},\cdot \rangle \) in \({\mathsf {SIM}}\)), (ii) if \(\mathsf {S}\) is honest and not marked \(\textsc {compromised}\), then \(\mathsf {S}^*=\mathsf {S}\), and (iii) \(\mathsf {tx}( sid ,\mathsf {S}^*)>0\).

    Note that \(\mathcal {Z}\) may see a \(\textsc {halt}\) message from \({\mathsf {SIM}}\) at this time. \(\textsc {halt}\) occurs when there exists \(x_1 \,{\ne }\, x_2\) such that \(\mathsf {AuthDec}_{y_1}(c') \,{\ne }\!\perp \) and \(\mathsf {AuthDec}_{y_2}(c')\,{\ne }\!\perp \), where \(y_1=F_{ sid ,\mathsf {S}^*}(x_1)\) and \(y_2=F_{ sid ,\mathsf {S}^*}(x_2)\). Since \(F_{ sid ,\mathsf {S}^*}(\cdot )\) is a random function onto \(\{0,1\}^{2\tau }\), \(y_1\) and \(y_2\) are independent random strings in \(\{0,1\}^{2\tau }\); thus, for fixed \(y_1\) and \(y_2\), the probability that \(\mathcal {A}\) finds \(c'\) such that \(\mathsf {AuthDec}_{y_1}(c') \,{\ne }\!\perp \) and \(\mathsf {AuthDec}_{y_2}(c') \,{\ne }\!\perp \) is at most \(\mathbf {Adv}^{\mathsf {RK-RBST}}_{\mathsf {AE},T}(\tau )\) due to the random-key robustness of \(\mathsf {AE}\). Since \(\mathcal {A}\) queries F at most \(q_F\) times, there are at most \(q_F\) independent y’s and hence fewer than \(q_F^2\) pairs \((y_1,y_2)\); using a polynomial reduction that guesses the pair, we have \(\Pr [\textsc {halt}]\le q_F^2\cdot \mathbf {Adv}^{\mathsf {RK-RBST}}_{\mathsf {AE},T}(\tau )\).

    Next we assume that \(\textsc {halt}\) does not occur. In the real world, \(\mathcal {Z}\) receives \((\textsc {abort}, sid , ssid )\) from \(\mathsf {U}\) if and only if \(\mathsf {AuthDec}_{\mathsf {rw}'}(c')=\,\perp \); that is, \(\mathcal {Z}\) does not receive this message if and only if \(\mathsf {AuthDec}_{\mathsf {rw}'}(c') \,{\ne }\!\perp \). There are only three possibilities:

    (1) \((\mathsf {pw}',\mathsf {S}^*,c')=(\mathsf {pw},\mathsf {S},c)\): Then \(\mathsf {rw}'=\mathsf {rw}=F_{ sid ,\mathsf {S}}(\mathsf {pw})\), thus \(\mathsf {AuthDec}_{\mathsf {rw}'}(c')=\mathsf {AuthDec}_\mathsf {rw}(c)=(p_u,P_u,P_s)\).

    (2) \(\mathcal {A}\) queries \(\mathsf {rw}'=F_{ sid ,\mathsf {S}^*}(\mathsf {pw}')\) previously, and \(\mathsf {AuthDec}_{\mathsf {rw}'}(c') \,{\ne }\!\perp \): If \(\mathcal {A}\) learns \(\mathsf {rw}'\), then it can compute an \(\mathsf {AuthEnc}\) instance on \(\mathsf {rw}'\) and any message to find a \(c'\) such that \(\mathsf {AuthDec}_{\mathsf {rw}'}(c') \,{\ne }\!\perp \).

    (3) Other cases where \(\mathcal {A}\) finds a \(c'\) such that \(\mathsf {AuthDec}_{\mathsf {rw}'}(c') \,{\ne }\!\perp \), while \(\mathsf {rw}'\) is random and independent of everything else in \(\mathcal {Z}\)’s view (since \(\mathcal {A}\) does not query \(F_{ sid ,\mathsf {S}^*}(\mathsf {pw}')\)), and \(\mathcal {Z}\) does not query \(\mathsf {AuthEnc}_{\mathsf {rw}'}(p_u',P_u',P_s')\) (\(\mathcal {Z}\) queries \(\mathsf {AuthEnc}_{\mathsf {rw}'}(p_u',P_u',P_s')\) by setting \(\mathsf {pw}'=\mathsf {pw}\) [thus making \(\mathsf {rw}'=\mathsf {rw}\)] and receiving \(c=\mathsf {AuthEnc}_\mathsf {rw}(p_u,P_u,P_s)\) from \(\mathsf {S}\)). Since \(\mathsf {AE}\) is an authenticated encryption scheme, the probability of (3) is at most \(\mathbf {Adv}^{\mathsf {AUTH}}_{\mathsf {AE},T}(\tau )\).

    In the simulated world, \(\mathcal {Z}\) does not receive this message if and only if either of the following two conditions holds:

    (1) \(c'=c\), \(\mathsf {S}^*=\mathsf {S}\) and \(\mathcal {F}\) returns \(\textsc {Succ}\) on \((\textsc {TestAbort}, sid , ssid ,\mathsf {U})\) from \({\mathsf {SIM}}\). The last condition holds if and only if there are two records \(\langle ssid ,\mathsf {U},\mathsf {S},\mathsf {pw}'\rangle \) and \(\langle ssid ,\mathsf {S},\mathsf {U},\mathsf {pw}''\rangle \), the former marked \(\textsc {fresh}\) and \(\mathsf {pw}'=\mathsf {pw}''\). Note that no \(\textsc {TestPwd}\), \(\textsc {Impersonate}\) or \(\textsc {NewKey}\) message has been issued yet, so the record must be \(\textsc {fresh}\). According to the syntax of \(\textsc {SvrSession}\), we have \(\mathsf {pw}''=\mathsf {pw}\). Therefore, the last condition is equivalent to \(\mathsf {pw}'=\mathsf {pw}\), thus this case is equivalent to case (1) in the real world.

    (2) There exists x s.t. \(y=F_{ sid ,\mathsf {S}^*}(x)\) is defined in \({\mathsf {SIM}}\), \(\mathsf {AuthDec}_y(c') \,{\ne }\!\perp \) and \(\mathcal {F}\) returns “correct guess” on \((\textsc {TestPwd}, sid , ssid ,x)\) from \({\mathsf {SIM}}\). The last condition is equivalent to \(x=\mathsf {pw}'\); thus, the three conditions combined are equivalent to \(\mathsf {rw}'=F_{ sid ,\mathsf {S}^*}(\mathsf {pw}')\) is defined in \({\mathsf {SIM}}\) and \(\mathsf {AuthDec}_{\mathsf {rw}'}(c') \,{\ne }\!\perp \). \({\mathsf {SIM}}\) defines \(F_{ sid ,\mathsf {S}^*}(\mathsf {pw}')\) only when receiving \((\textsc {OfflineEval}, sid ,\mathsf {S}^*,\mathsf {pw}')\) from \(\mathcal {A}\). Therefore, this case is equivalent to case (2) in the real world.

    Hence, \(\mathcal {Z}\) receives this message in the two worlds under the same conditions, except for case (3) in the real world.

  • Messages sent from \(\mathsf {U}\) and \(\mathsf {S}\) while executing \(\varPi \) (in the real world), or messages sent from \({\mathsf {SIM}}\) (in the simulated world) (from \(\mathcal {A}\)): In case (1), and for messages sent from \(\mathsf {S}\) in case (2), the messages are simulated by \({\mathsf {SIM}}\), which in turn receives them from \(\mathsf {SIM}_{\mathsf {AKE}}\). Since \(\mathsf {SIM}_{\mathsf {AKE}}\) generates a view for \(\mathcal {A}\) that is indistinguishable from \(\mathcal {A}\)’s view in the real world, \({\mathsf {SIM}}\), which merely passes messages between \(\mathsf {SIM}_{\mathsf {AKE}}\) and \(\mathcal {A}\), achieves the same; the distinguishing advantage of \(\mathcal {Z}\) is at most \(\mathbf {Adv}^{\mathsf {DIST}}_{\mathsf {SIM}_{\mathsf {AKE}},\mathcal {Z}}(\tau )\). Messages sent from \(\mathsf {U}\) in case (2) are the results of \(\varPi _u\) on \((p_u',P_u',P_s')\), and are simulated perfectly.

  • \(( sid , ssid ,SK')\) (from \(\mathsf {U}\)): In both worlds, \(\mathcal {Z}\) receives this message when \(\varPi \) is completed and sends output to \(\mathsf {U}\). In the real world, there are two cases:

    • \((p_u',P_u',P_s')=(p_u,P_u,P_s)\), i.e., the input of \(\mathsf {U}\) to \(\varPi \) is correct. This corresponds to case (1) above. There are two subcases regarding \(\varPi \):

      \(*\) \(\mathsf {S}\) is not compromised. Then according to the syntax of \(\mathcal {F}_{\mathsf {AKE-KCI}}\), \(SK'\) is a random string in \(\{0,1\}^\tau \) (independent of everything else, or the same as \(\mathsf {S}\)’s output if \(\mathsf {S}\) already output previously). In the simulated world, the record \(\langle ssid ,\mathsf {U},\mathsf {S},\mathsf {pw}'\rangle \) in \(\mathcal {F}\) is marked \(\textsc {fresh}\), so \(SK'\) is also a random string in \(\{0,1\}^\tau \).

      \(*\) \(\mathsf {S}\) is compromised (then \(\mathcal {A}\) may impersonate \(\mathsf {S}\) while interacting with \(\mathsf {U}\) in the execution of \(\varPi \) and set \(\mathsf {U}\)’s output). In the simulated world, \(\mathsf {SIM}_{\mathsf {AKE}}\) sends \((\textsc {Impersonate}, sid , ssid )\) to \({\mathsf {SIM}}\), who transfers this message to \(\mathcal {F}\), which makes the record \(\langle ssid ,\mathsf {U},\mathsf {S},\mathsf {pw}'\rangle \) marked \(\textsc {compromised}\) (note that we have \(\mathsf {pw}'=\mathsf {pw}\) here since this is a condition of case (1)). Therefore, \(SK'\) chosen by \(\mathsf {SIM}_{\mathsf {AKE}}\) (which is the same as the \(SK'\) output by \(\varPi \) in the real world except for probability at most \(\mathbf {Adv}^{\mathsf {DIST}}_{\mathsf {SIM}_{\mathsf {AKE}},\mathcal {Z}}(\tau )\)) is the value output to \(\mathsf {U}\).

    • \((p_u',P_u',P_s') \,{\ne }\, (p_u,P_u,P_s)\), i.e., the input of \(\mathsf {U}\) to \(\varPi \) is incorrect. This may occur only in cases (2) and (3) above. As argued above, the probability of (3) is at most \(\mathbf {Adv}^{\mathsf {AUTH}}_{\mathsf {AE},T}(\tau )\).

      Case (2) is equivalent to case (2) in the simulated world, where \({\mathsf {SIM}}\) sends \((\textsc {TestPwd}, sid , ssid ,\mathsf {U},x)\) to \(\mathcal {F}\) and \(\mathcal {F}\) returns “correct guess” (meaning that \(x=\mathsf {pw}'\)). After this, the record \(\langle ssid ,\mathsf {U},\mathsf {S},\mathsf {pw}'\rangle \) is marked \(\textsc {compromised}\). Therefore, \(SK'\), which is computed by \({\mathsf {SIM}}\) as \(\varPi _u\)’s output on \((p_u',P_u',P_s')\), is the value output to \(\mathsf {U}\). In the real world, \(\mathsf {U}\) also outputs \(SK'\).

  • \(( sid , ssid ,SK)\) (from \(\mathsf {S}\)): In both worlds, \(\mathcal {Z}\) receives this message when \(\varPi \) is completed and sends output to \(\mathsf {S}\).

    In the real world, SK is always a random string in \(\{0,1\}^\tau \) (independent of everything else, or the same as \(\mathsf {U}\)’s output if \(\mathsf {U}\) already output previously). Note that in the simulated world, the record \(\langle ssid ,\mathsf {S},\mathsf {U},\mathsf {pw}'\rangle \) is always marked \(\textsc {fresh}\). Therefore, SK is also a random string in \(\{0,1\}^\tau \).

It remains to show that \(\mathsf {SIM}_{\mathsf {AKE}}\)’s view while interacting with \({\mathsf {SIM}}\) is the same as interacting with \(\mathcal {F}_{\mathsf {AKE-KCI}}\) and \(\mathcal {A}\). When \({\mathsf {SIM}}\) acts as \(\mathcal {A}\), the interaction is trivial since \({\mathsf {SIM}}\) merely passes messages between \(\mathsf {SIM}_{\mathsf {AKE}}\) and the real \(\mathcal {A}\). Consider when \({\mathsf {SIM}}\) acts as \(\mathcal {F}_{\mathsf {AKE-KCI}}\), and note that \({\mathsf {SIM}}\) engages with \(\mathsf {SIM}_{\mathsf {AKE}}\) only in cases (1) and (2):

  (1) \(\mathsf {U}\)’s input is correct: Same effect as honest \(\mathsf {U}\) and \(\mathsf {S}\) executing \(\varPi \);

  (2) \(\mathsf {U}\)’s input is incorrect: Same effect as corrupted \(\mathsf {U}\) and honest \(\mathsf {S}\) executing \(\varPi \). Note that \({\mathsf {SIM}}\) engages with \(\mathsf {SIM}_{\mathsf {AKE}}\) on the side of \(\mathsf {S}\) only, so \(\mathsf {SIM}_{\mathsf {AKE}}\)’s view is again the same.

We conclude that \(\mathcal {Z}\)’s view in the real world and the simulated world is the same, except for (1) \((\textsc {abort}, sid , ssid )\) or \(\textsc {halt}\) after \(\mathcal {A}\) sends \((\textsc {RcvComplete}, sid , ssid ,\mathsf {S}^*)\) and \(c'\), (2) messages sent during the execution of \(\varPi \), and (3) \(( sid , ssid ,SK')\) output from \(\mathsf {U}\). The probabilities that (1), (2) and (3) are different in the two worlds are no more than \(\mathbf {Adv}^{\mathsf {AUTH}}_{\mathsf {AE},T}(\tau )+q_F^2\cdot \mathbf {Adv}^{\mathsf {RK-RBST}}_{\mathsf {AE},T}(\tau )\), \(\mathbf {Adv}^{\mathsf {DIST}}_{\mathsf {SIM}_{\mathsf {AKE}},\mathcal {Z}}(\tau )\) and \(\mathbf {Adv}^{\mathsf {DIST}}_{\mathsf {SIM}_{\mathsf {AKE}},\mathcal {Z}}(\tau )\), respectively. Using a hybrid argument, we can see that \(\mathcal {Z}\)’s advantage is no more than \(\mathbf {Adv}^{\mathsf {AUTH}}_{\mathsf {AE},T}(\tau )+q_F^2\cdot \mathbf {Adv}^{\mathsf {RK-RBST}}_{\mathsf {AE},T}(\tau )+ 2\mathbf {Adv}^{\mathsf {DIST}}_{\mathsf {SIM}_{\mathsf {AKE}},\mathcal {Z}}(\tau )\).
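One way to read the accounting above, assuming the \(\textsc {halt}\) bound comes from a reduction that guesses which pair among the at most \(q_F\) defined outputs of F is hit by \(c'\), is the following two-step calculation. The binomial coefficient is an interpretation added here for illustration; the proof itself only states the \(q_F^2\) factor.

\[
\Pr [\textsc {halt}] \;\le \; \binom{q_F}{2}\cdot \mathbf {Adv}^{\mathsf {RK-RBST}}_{\mathsf {AE},T}(\tau ) \;\le \; q_F^2\cdot \mathbf {Adv}^{\mathsf {RK-RBST}}_{\mathsf {AE},T}(\tau )
\]

\[
\underbrace{\mathbf {Adv}^{\mathsf {AUTH}}_{\mathsf {AE},T}(\tau ) + q_F^2\cdot \mathbf {Adv}^{\mathsf {RK-RBST}}_{\mathsf {AE},T}(\tau )}_{(1)} \;+\; \underbrace{\mathbf {Adv}^{\mathsf {DIST}}_{\mathsf {SIM}_{\mathsf {AKE}},\mathcal {Z}}(\tau )}_{(2)} \;+\; \underbrace{\mathbf {Adv}^{\mathsf {DIST}}_{\mathsf {SIM}_{\mathsf {AKE}},\mathcal {Z}}(\tau )}_{(3)}
\]

The second expression is the sum of the three per-event differences and matches the bound stated in Theorem 2.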

6 OPAQUE: A Strong Asymmetric PAKE Instantiation

Figure 7 shows OPAQUE, a concrete instantiation of the generic OPRF+AKE protocol from Fig. 5. An illustration is presented in Fig. 8.

The OPRF is instantiated with the DH-OPRF scheme from [22] recalled in Appendix A, while the AKE protocol can be instantiated with any UC-secure 2-message implicitly-authenticated AKE-KCI; in Fig. 7 this is illustrated with HMQV [27]. Fortunately, the two messages of DH-OPRF and the two messages from HMQV (or a similar protocol) can be run “in parallel” hence obtaining a 2-message SaPAKE.

By Theorem 2 on the security of the generic OPRF+AKE construction, by Lemma 1 in Appendix A on the security of DH-OPRF, and by the security of HMQV (see below), we get that protocol OPAQUE realizes functionality \(\mathcal {F}_{\mathsf {SaPAKE}}\), hence it is a provably-secure Strong aPAKE, under the One-More Diffie-Hellman assumption [3, 22] in the ROM.

Fig. 7. Protocol OPAQUE

Fig. 8. Schematic representation of OPAQUE (see Fig. 7 for the details)

6.1 Protocol Details and Properties

We expand on the specification of OPAQUE and the protocol’s properties.

\(\bullet \) Password registration. Password registration is the only part of the protocol assumed to run over secure channels where parties can authenticate each other. We note that while OPAQUE is presented with \(\mathsf {S}\) doing all the registration operations, in practice one may want to avoid that. Instead, we can let \(\mathsf {S}\) choose an OPRF key \(k_s\) and \(\mathsf {U}\) choose \(\mathsf {pw}\), and then run the OPRF protocol between \(\mathsf {U}\) and \(\mathsf {S}\) so only \(\mathsf {U}\) learns its secrets (\(\mathsf {pw},\mathsf {rw},p_u\)) and only \(\mathsf {S}\) learns \(p_s\). A problem arises with this approach if \(\mathsf {S}\)’s policy is to check the user’s password for compliance with some rules. A possible workaround is to adapt techniques from [26] that present zero-knowledge proofs for proving compliance without disclosing the password.

\(\bullet \) Authenticated encryption. As specified in Sect. 5.2, the scheme \(\mathsf {AuthEnc}\) used in the protocol needs to satisfy the random-key robustness (key-committing) property defined there. In practice, using an encrypt-then-MAC scheme with HMAC-256 (or larger) as the MAC provides this property (if a scheme does not have this property, then adding on top of it such an HMAC computed on the scheme’s ciphertext will ensure this property).

\(\bullet \) Key exchange. The generic AKE representation via the \(\mathsf {KE}\) formula applies to any protocol whose session key is computed as a function of the long-term private-public key pair of each party and ephemeral session-specific private-public values. These values are represented as \((p_s,P_s, x_s, X_s)\) for the server and \((p_u,P_u, x_u, X_u)\) for the user. We note that while more general key-exchange protocols can be used with OPAQUE, this representation applies to many such protocols and, in particular, to HMQV [27] which we use here as our main instantiation.

\(\bullet \) Explicit mutual authentication. The protocol as illustrated takes just two messages but does not provide explicit user authentication. With a third message the protocol achieves mutual authentication by simply adding the value \(f_{K}(1)\) to the server’s message and adding a third message where \(\mathsf {U}\) sends \(f_{K}(2)\) to \(\mathsf {S}\). Each party verifies that the value received from the other is computed correctly and if not it aborts.

\(\bullet \) Use of HMQV. Recall that the security of OPAQUE depends on the KE protocol being AKE-secure in the UC model with the additional KCI property; namely, it should realize the AKE-KCI UC functionality from Fig. 4. As argued in Sect. 5.1, HMQV indeed realizes this functionality (under the CDH assumption in the RO model), hence it is appropriate for use in OPAQUE. Moreover, HMQV enjoys forward secrecy. Specifically, the 2-message protocol provides weak forward secrecy (i.e., forward secrecy is guaranteed for sessions where the user’s message delivered to the server came from the real \(\mathsf {U}\)) while the 3-message variant with explicit client authentication provides full forward secrecy, namely, against arbitrary active attacks [27].

\(\bullet \) Forward secrecy. This property (or lack of it) is inherited by OPAQUE from the key exchange component \(\mathsf {KE}\). In the case of HMQV, forward secrecy is achieved as stated above. One cannot overstate the importance of forward secrecy in password protocols: it guarantees that past session keys remain secure upon the compromise of a user’s password (or server’s information).

\(\bullet \) User iterated hashing. OPAQUE can be strengthened by increasing the cost of a dictionary attack in case of server compromise. This is done by changing the computation of \(\mathsf {rw}\) to \(\mathsf {rw}=H^n(F_k(\mathsf {pw}))\), that is, the client applies n iterations of the function H on top of the OPRF value \(F_k(\mathsf {pw})\). In practice, the iterations \(H^n\) would be replaced with one of the standard password-based KDFs, such as PBKDF2 [25] or bcrypt [31]. This forces an attacker that compromises the password file at the server to compute, for each candidate password \(\mathsf {pw}'\), the function \(F_k(\mathsf {pw}')\) as well as the additional n hash iterations. Note that n need not be remembered by the user; it can be sent from \(\mathsf {S}\) to \(\mathsf {U}\) in the server’s message. Furthermore, one can follow Boyen’s design and apply the probabilistic Halting KDF function [8] as used in [9] so that the iteration count is hidden from the attacker and even from the server.
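The hardening step can be sketched as below, taking the OPRF output \(F_k(\mathsf {pw})\) as an opaque byte string. The choice of SHA-256 for H, the fixed salt label, and the function names are assumptions for illustration, not part of the protocol specification.

```python
import hashlib


def harden_iterated(oprf_output: bytes, n: int) -> bytes:
    """rw = H^n(F_k(pw)), with H taken to be SHA-256 (an assumption)."""
    rw = oprf_output
    for _ in range(n):
        rw = hashlib.sha256(rw).digest()
    return rw


def harden_pbkdf2(oprf_output: bytes, n: int,
                  salt: bytes = b"hypothetical-rw-salt") -> bytes:
    """Practical stand-in for H^n: PBKDF2-HMAC-SHA256 with n >= 1 iterations."""
    return hashlib.pbkdf2_hmac("sha256", oprf_output, salt, n, dklen=32)
```

Since the server can send n in its message, the client only needs the password to recompute \(\mathsf {rw}\).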

\(\bullet \) Performance. OPAQUE takes two messages (three with explicit mutual authentication). The computational cost is one exponentiation for \(\mathsf {S}\), and two exponentiations plus one hashing-into-G operation for \(\mathsf {U}\), in addition to the cost of \(\mathsf {KE}\). With HMQV, the latter cost is one offline fixed-base exponentiation and one multi-exponentiation (at the cost of 1.16 regular exponentiations) per party (about three exponentiations in total for the server and four for the user). All exponentiations are in regular DH groups, hence accommodating the fastest elliptic curves (e.g., no pairings). It is common in PAKE protocols to count the number of group elements transmitted between the parties. In OPAQUE, \(\mathsf {U}\) sends two while \(\mathsf {S}\) sends three (one, \(P_u\), can be omitted at the cost of one fixed-base exponentiation at the client).
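As a rough tally of the figures above, assuming the offline fixed-base exponentiation is counted as one full exponentiation and the hashing-into-G at the user is left out:

\[
\mathsf {S}:\; 1 + (1 + 1.16) = 3.16, \qquad \mathsf {U}:\; 2 + (1 + 1.16) = 4.16,
\]

where the parenthesized term is the HMQV cost. This is consistent with the estimate of about three and four exponentiations, respectively.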

\(\bullet \) Performance comparison. The introduction presents background on OPAQUE and other password protocols. Here we provide a comparison with the more efficient among these protocols, particularly those that are being, or have been, considered for standardization. Clearly, OPAQUE is superior security-wise as the only one not subject to pre-computation attacks, but it also fares well in terms of performance.

AugPAKE [33, 34] is computationally very efficient with only 2.17 exponentiations per party; however, it uses 4 messages and does not provide forward secrecy. In addition, the protocol has only been analyzed as a PAKE protocol, not as an aPAKE [34]. Another proposed aPAKE protocol, SPAKE2+ [2, 15], uses only two messages and 3 multi-exponentiations (or about 3.5 exponentiations) per party, which is similar to the cost of OPAQUE. The security of the protocol has only been informally argued in [15] and, to the best of our knowledge, no formal analysis has appeared. We also mention SRP, which has been included in TLS ciphersuites in the past but is considered outdated as it does not have an instantiation that works over elliptic curves (the protocol is defined over rings and uses both addition and multiplication). Its implementation over RSA moduli is therefore less efficient than implementations over elliptic curves; it also takes 4 messages.

We also mention two very recent schemes that have been formally analyzed as aPAKE protocols but, like the rest, are vulnerable to pre-computation. The protocol VTBPEKE in [30] uses 3 messages and 4 exponentiations per party and was proven secure in the non-UC aPAKE model of [7], while [24] shows a simultaneous one-round scheme that they prove secure in the UC aPAKE model of [18] augmented with adaptive security. The latter protocol works over bilinear groups and its computational cost includes 4 exponentiations and 3 pairings per party. We note that all of the above protocols require an initial message from server to user in order to transmit the salt, which results in one or two added messages to the above message counts (except for VTBPEKE, which already includes the salt transmission in its 3 messages). Also, all these protocols, like OPAQUE, work in the RO model.

\(\bullet \) Threshold implementation. We comment on a simple extension of OPAQUE that can be very valuable in large deployments, namely, the ability to implement the OPRF phase as a Threshold OPRF [23]. In this case, an attacker needs to break into a threshold of servers to be able to impersonate the servers to the user or to run an offline dictionary attack. Such an implementation requires no user-side changes, i.e., the user does not need to know if the system is implemented with one or multiple servers.
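One common way to threshold an exponentiation-based OPRF is to Shamir-share the key and recombine the servers' partial evaluations by Lagrange interpolation in the exponent. The toy sketch below illustrates that idea only and is not claimed to reproduce the construction of [23]. The group parameters are far too small for real use, all function names are hypothetical, and it assumes the recombination is done by a front-end component so that the user side stays unchanged. It requires Python 3.8+ for modular inverses via `pow`.

```python
import secrets

# Toy parameters (insecure): P = 2Q + 1 with P and Q prime.
P, Q = 1019, 509


def share_key(k: int, t: int, n: int) -> dict:
    """Shamir-share k over Z_Q: any t of the n shares determine k (n < Q)."""
    coeffs = [k % Q] + [secrets.randbelow(Q) for _ in range(t - 1)]
    return {i: sum(c * pow(i, e, Q) for e, c in enumerate(coeffs)) % Q
            for i in range(1, n + 1)}


def partial_eval(alpha: int, k_i: int) -> int:
    """Server i raises the blinded element to its key share."""
    return pow(alpha, k_i, P)


def lagrange_at_zero(i: int, ids: list) -> int:
    """Lagrange coefficient of share i for interpolating at x = 0, mod Q."""
    num, den = 1, 1
    for j in ids:
        if j != i:
            num = num * j % Q
            den = den * (j - i) % Q
    return num * pow(den, -1, Q) % Q


def combine(partials: dict) -> int:
    """Recombine t partial evaluations into alpha^k by interpolating in the exponent."""
    ids = list(partials)
    beta = 1
    for i, b in partials.items():
        beta = beta * pow(b, lagrange_at_zero(i, ids), P) % P
    return beta
```

The combined value equals \(\alpha ^{k}\) whenever \(\alpha \) lies in the order-Q subgroup, so the recipient cannot tell it apart from a single-server response.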

\(\bullet \) Secret retrieval and hedging TLS. Additional features of OPAQUE include the ability to store and retrieve a user’s secrets (such as a bitcoin wallet, authentication credentials, encrypted backup keys, etc.) as part of the information encrypted and authenticated at the server under ciphertext c. In one particular use case, such a secret can be a client signature key for TLS. In this case, the key exchange part of OPAQUE can reuse that of TLS and a server’s certificate can be replaced with the server’s public key stored under the client-authenticated ciphertext c.

6.2 An OPAQUE Variant: Multiplicative Blinding

A variant of OPAQUE is obtained by replacing the user’s exponential blinding operation \(\alpha :=(H'(\mathsf {pw}))^r\) with \(\alpha :=(H'(\mathsf {pw}))\cdot g^r\). The server responds as before with \(\beta =\alpha ^{k_s}\). Assuming that \(\mathsf {U}\) knows the value \(y=g^{k_s}\) (previously stored or received from \(\mathsf {S}\)), it can compute the same “hashed Diffie-Hellman” value \((H'(\mathsf {pw}))^{k_s}\) as \(\beta / y^{r}\). The advantage of this variant is that while the number of client exponentiations remains the same, one is fixed-base (\(g^r\)) and the other \((y^{r})\) can also be fixed-base if \(\mathsf {U}\) caches y, a realistic possibility for accounts where the user logs in frequently (e.g., a personal email or social network). Computing \(y^r\) can also be done while waiting for the server’s response to reduce latency. Moreover, both exponentiations can be done offline although only short-term storage is recommended as the leakage of r exposes \(H'(\mathsf {pw})\). If \(\mathsf {U}\) does not store y, it needs to be transmitted to \(\mathsf {U}\) by \(\mathsf {S}\) together with the response \(\beta \). This still allows for fixed-base optimization for computing \(g^r\) but not for \(y^{r}\).
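The two blinding variants can be compared in a toy group, as sketched below. The parameters are far too small for real use, the squaring-based map standing in for \(H'\) is an assumption chosen for simplicity, and all function names are hypothetical. Both client and server steps are collapsed into one function for brevity. Python 3.8+ is needed for modular inverses via `pow`.

```python
import hashlib
import secrets

# Toy parameters (insecure): P = 2Q + 1 with P and Q prime.
P, Q = 1019, 509
G = 4  # a square mod P other than 1, so it generates the order-Q subgroup


def hash_to_group(pw: bytes) -> int:
    """Stand-in for H'(pw): map into the order-Q subgroup by squaring."""
    x = int.from_bytes(hashlib.sha256(pw).digest(), "big") % (P - 3) + 2
    return pow(x, 2, P)  # x is in [2, P-2], so the result is never 0 or 1


def exponential_blinding(pw: bytes, k_s: int) -> int:
    r = secrets.randbelow(Q - 1) + 1        # r in [1, Q-1]
    alpha = pow(hash_to_group(pw), r, P)    # U -> S: H'(pw)^r
    beta = pow(alpha, k_s, P)               # S -> U: alpha^{k_s}
    return pow(beta, pow(r, -1, Q), P)      # U unblinds: beta^{1/r}


def multiplicative_blinding(pw: bytes, k_s: int) -> int:
    y = pow(G, k_s, P)                      # y = g^{k_s}, assumed known to U
    r = secrets.randbelow(Q - 1) + 1
    alpha = hash_to_group(pw) * pow(G, r, P) % P   # U -> S: H'(pw) * g^r
    beta = pow(alpha, k_s, P)                      # S -> U: alpha^{k_s}
    return beta * pow(pow(y, r, P), -1, P) % P     # U unblinds: beta / y^r
```

Both functions return \((H'(\mathsf {pw}))^{k_s}\). In the multiplicative variant the client's two exponentiations, `pow(G, r, P)` and `pow(y, r, P)`, do not depend on the password, which is what makes precomputation and fixed-base techniques applicable.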

However, it turns out that this multiplicative mechanism results in an OPRF protocol that does not realize our OPRF functionality \(\mathcal {F}_{\mathsf {OPRF}}\). Thus, our analysis here does not imply the security of the multiplicative OPAQUE variant in general. If \(\mathsf {rw}\) is redefined as \(\mathsf {rw}:=H(\mathsf {pw},y,H'(\mathsf {pw})^{k_s})\), i.e. if y is included under the hash, then the resulting OPRF does realize our functionality, and OPAQUE remains secure as SaPAKE under both blinding variants. This change, however, introduces a (slight) overhead of having to transmit y even if it is not strictly needed, e.g. if the client implements the exponential blinding operation. An alternative approach would be to replace the OPRF functionality \(\mathcal {F}_{\mathsf {OPRF}}\) with a weaker form \(\mathcal {F}_{\mathsf {OPRF}}'\) and to show that (i) \(\mathcal {F}_{\mathsf {OPRF}}'\) is realized by the multiplicative variant (even without hashing y) and (ii) \(\mathcal {F}_{\mathsf {OPRF}}'\) is sufficient for proving Theorem 2 hence implying the security of OPAQUE as SaPAKE. We intend to investigate this weakening of \(\mathcal {F}_{\mathsf {OPRF}}\).
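The redefined \(\mathsf {rw}:=H(\mathsf {pw},y,H'(\mathsf {pw})^{k_s})\) can be sketched as a hash over an unambiguous encoding of the three inputs. The use of SHA-256, the length-prefixed encoding, the fixed-width encoding of group elements, and the function names are all assumptions made for illustration.

```python
import hashlib


def _length_prefixed(data: bytes) -> bytes:
    """Prefix data with its 4-byte big-endian length to keep the encoding unambiguous."""
    return len(data).to_bytes(4, "big") + data


def derive_rw(pw: bytes, y: int, hashed_dh: int, elem_len: int) -> bytes:
    """rw = H(pw, y, H'(pw)^{k_s}), with group elements given as integers."""
    h = hashlib.sha256()
    for part in (pw,
                 y.to_bytes(elem_len, "big"),
                 hashed_dh.to_bytes(elem_len, "big")):
        h.update(_length_prefixed(part))
    return h.digest()
```

Because y enters the hash, the client must obtain it in either blinding variant, which is the slight overhead noted above.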