1 Introduction

Keybase is a suite of encryption tools. It encompasses a public-key directory, an instant messenger, and a cloud storage service. Keybase was launched in 2014. In February 2020, it reported having accumulated more than 1.1M user accounts [27]. In May 2020, Keybase was acquired by Zoom. At the time, Zoom issued a public statement [36] saying that the Keybase team was meant to play a critical part in building scalable end-to-end encryption for Zoom. The acquisition appears to have put an end to active development of new Keybase features, but as of February 2024 Keybase continues to receive regular maintenance updates.

Instant Messaging in Keybase. Keybase implements its own end-to-end encrypted instant messaging protocol. This protocol is designed to support large groups. One-on-one chats are treated as group chats and hence use the same protocol. The protocol also allows sending large files as encrypted attachments in chat. It is impossible to opt out of end-to-end encryption in Keybase. In this work we analyze the security of this protocol.

The Keybase client is open source [24], but the server is not. Our security analysis primarily relies on the source code. Keybase also provides the “Keybase Book” website [22] with excellent documentation that explains its cryptographic design. The only prior security analysis of Keybase was done by NCC Group in 2019 [31], which broadly looked at the security of the entire Keybase ecosystem. In comparison, we provide an in-depth analysis of a single component in Keybase.

Encrypted Group Chats. In this work we consider a setting in which an arbitrary number of users can form a group. All group members share a key for a symmetric encryption scheme. Each instant message within the group is encrypted with this key. Let us use g to denote the identity of a group and \(K_g\) to denote the key shared between the members of this group. In Keybase, every member of group g uses the same long-term key \(K_g\) to encrypt their outgoing chat messages. Each message is encrypted only once, simultaneously for all recipients. The resulting ciphertext is then broadcast to all members of the group.

The Sender Keys protocol [7, 28] can be seen as building on this basic design idea. In Sender Keys, every member of the group owns a distinct symmetric encryption key; they share it with other group members. Each outgoing message is encrypted with the sender’s own key, and the resulting ciphertext is broadcast to the group. Furthermore, each key is used to encrypt only a single message, and immediately afterwards a new key is derived to be used for the next encryption. So every group member tracks every other member’s current encryption key, decrypting each incoming ciphertext with the corresponding sender’s key and subsequently replacing it with an appropriately derived new key. Variants of the Sender Keys protocol are used in the Signal [28], WhatsApp [34], and Matrix [1, 2] messengers. In addition, the Messaging Layer Security (MLS) [8] protocol contains a component called FS-GAEAD [3] or TreeDEM [33] that similarly uses a sender’s key to encrypt and broadcast a message (but its overall design significantly differs from the design of the Sender Keys protocol).
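The per-message key replacement described above can be illustrated with a hash ratchet. The following is a minimal sketch, assuming an HMAC-SHA256 chain with the hypothetical labels `b"message"` and `b"chain"`; the derivation functions and constants actually used by Signal, WhatsApp, and Matrix differ in their details and are not reproduced here.

```python
import hashlib
import hmac


def ratchet(chain_key: bytes) -> tuple[bytes, bytes]:
    """Derive (message_key, next_chain_key) from the current chain key.

    The labels b"message" and b"chain" are illustrative assumptions.
    """
    message_key = hmac.new(chain_key, b"message", hashlib.sha256).digest()
    next_chain_key = hmac.new(chain_key, b"chain", hashlib.sha256).digest()
    return message_key, next_chain_key


class SenderChain:
    """Tracks one sender's current chain key, as every group member would."""

    def __init__(self, initial_chain_key: bytes) -> None:
        self.chain_key = initial_chain_key

    def next_message_key(self) -> bytes:
        # Use the key once, then replace the state with the derived successor.
        message_key, self.chain_key = ratchet(self.chain_key)
        return message_key


# The sender and a receiver start from the same shared chain key
# and stay in sync as long as both ratchet once per message.
sender = SenderChain(b"\x00" * 32)
receiver = SenderChain(b"\x00" * 32)
k1_sender, k1_receiver = sender.next_message_key(), receiver.next_message_key()
k2_sender, k2_receiver = sender.next_message_key(), receiver.next_message_key()
```

Because each message key is derived from a chain key that is immediately overwritten, no key is used for more than one encryption, which is the property the text attributes to Sender Keys.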

An encrypted group chat protocol should provide at least confidentiality and integrity of communication, with respect to an attacker that is not a member of the group. In part, this could be achieved by building the protocol from a symmetric encryption scheme that satisfies some notion of authenticated-encryption security. But care is needed to also prevent undesired message replays, reordering, or drops. These requirements are specific to a stateful protocol and do not necessarily follow from properties provided by the underlying stateless scheme.

Sender Authentication in Group Chats. Consider a group chat protocol that is built from a single symmetric encryption scheme and where every symmetric key is known to all group members. In such a protocol, group members are able to impersonate each other. This is true regardless of whether each group uses a single shared encryption key or has each member own a distinct encryption key. To prevent group members from impersonating each other, it is natural to use a digital signature scheme. Let us use u to denote the identity of a user and \(sk_u\) to denote this user’s signing key for a digital signature scheme.

Fig. 1. Warmup schemes obtained by composing digital signatures with symmetric encryption. Left pane: \(\mathsf {Sign\text {-}{}then\text {-}{}Encrypt}\). Right pane: \(\mathsf {Encrypt\text {-}{}then\text {-}{}Sign}\).

What is a sound way to compose a symmetric encryption scheme with a digital signature scheme? Let us consider two sequential compositions of a signing algorithm \(\textsf{Sign}\) with an encryption algorithm \(\textsf{Encrypt}\). We call the resulting schemes \(\mathsf {Sign\text {-}{}then\text {-}{}Encrypt}\) and \(\mathsf {Encrypt\text {-}{}then\text {-}{}Sign}\), and we show them in Fig. 1. The \(\mathsf {Sign\text {-}{}then\text {-}{}Encrypt}\) scheme first signs a message m to obtain its digital signature s and then encrypts (s, m) to obtain and return a symmetric ciphertext c. The \(\mathsf {Encrypt\text {-}{}then\text {-}{}Sign}\) scheme first encrypts m as c and then computes a signature s over c; it returns the pair (c, s).
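Written out as pseudocode, the two compositions read as follows. This is only a sketch of the prose description: nonces, encodings, and the matching decryption and verification steps are omitted, and the names K (symmetric key) and sk (signing key) are chosen here for illustration.

```latex
\begin{align*}
&\mathsf{Sign\text{-}then\text{-}Encrypt}(K, sk, m): & &\mathsf{Encrypt\text{-}then\text{-}Sign}(K, sk, m): \\
&\quad s \leftarrow \mathsf{Sign}(sk, m) & &\quad c \leftarrow \mathsf{Encrypt}(K, m) \\
&\quad c \leftarrow \mathsf{Encrypt}(K, (s, m)) & &\quad s \leftarrow \mathsf{Sign}(sk, c) \\
&\quad \text{return } c & &\quad \text{return } (c, s)
\end{align*}
```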

These compositions closely mirror those that are commonly used to build signcryption [5], which is a standard cryptographic primitive that combines digital signatures with public-key encryption [35], except we replace public-key encryption with symmetric encryption. It is well known that the corresponding compositions for signcryption are not secure in the multi-user setting, unless some effort is taken to bind together the message with the sender and recipient identities [5]. The standard advice is to always sign the recipient’s identity and always encrypt the sender’s identity. Our basic schemes in Fig. 1 would intuitively suffer from similar issues and benefit from similar countermeasures. However, the exact details would depend on what kind of security one expects from these schemes, so we defer this discussion.

The Sender Keys protocol [7] prescribes signing a symmetric ciphertext, and this is indeed done by the Signal, WhatsApp, and Matrix messengers. The MLS protocol [8] prescribes encrypting a digital signature with a symmetric encryption scheme. So either protocol can be seen as using some variant of \(\mathsf {Encrypt\text {-}{}then\text {-}{}Sign}\) or \(\mathsf {Sign\text {-}{}then\text {-}{}Encrypt}\) as a subroutine. We will now discuss how Keybase can be seen as extending both of these basic schemes.

Fig. 2. A high-level representation of the \(\textsf{SealPacket}\) scheme in Keybase.

SealPacket: Sign-then-Encrypt in Keybase. Keybase uses a variant of the basic \(\mathsf {Sign\text {-}{}then\text {-}{}Encrypt}\) scheme. It signs the symmetric encryption key along with the plaintext, meaning it signs \((K_g, m)\) instead of just m. The resulting scheme is called \(\textsf{SealPacket}\) and is shown in Fig. 2. In the source code, the decision to sign \(K_g\) is explained as follows [26]:

simply using encryption and signing together isn’t good enough ...the inner layer needs to assert something about the outer layer ...a better approach is to mix the outer key into the inner crypto, so that it’s impossible to forget to check it ...That means the inner signing layer needs to assert the encryption key ...We don’t need to worry about whether the signature might leak the encryption key either, because the signature gets encrypted.

Keybase uses \(\textsf{SealPacket}\) to encrypt the following three types of plaintexts: (1) a metadata header that is automatically created and sent along with every chat message, (2) a file that is sent as an attachment in chat, and (3) an arbitrary string chosen by a chat bot (for secure server-side storage of bot data).

Fig. 3. A high-level representation of the \(\textsf{BoxMessage}\) scheme in Keybase.

BoxMessage: Encrypt-then-Sign in Keybase. The \(\textsf{BoxMessage}\) scheme in Keybase is a variant of the basic \(\mathsf {Encrypt\text {-}{}then\text {-}{}Sign}\) scheme. This scheme, unlike \(\textsf{SealPacket}\), is used only for one purpose: to encrypt the body of a chat message. So we denote by \(c_{\textsf{body}}\) the symmetric ciphertext that is created in the inner (encryption) layer of \(\textsf{BoxMessage}\). The \(\textsf{BoxMessage}\) scheme extends the basic \(\mathsf {Encrypt\text {-}{}then\text {-}{}Sign}\) scheme in two ways and is shown in Fig. 3. First, it takes an associated-data field \(ad\) and signs \((c_{\textsf{body}}, ad)\) instead of just \(c_{\textsf{body}}\). Second, rather than use a signature scheme, \(\textsf{BoxMessage}\) uses \(\textsf{SealPacket}\) to sign \((c_{\textsf{body}}, ad)\).

Keybase uses the associated-data field \(ad\) to authenticate a metadata header for the chat message. This header contains the group’s identity and sender’s identity among multiple other values. The data in \(ad\) is sent in the clear over the network (along with \(c_{\textsf{body}}\)), meaning that \(\textsf{SealPacket}\) is not meant to provide confidentiality of \(ad\). Indeed, the Keybase documentation explains that \(\textsf{SealPacket}\) is used to provide the confidentiality of the signature over \((c_{\textsf{body}}, ad)\) [23]:

fields in the header aren’t secret from the server, and it actually needs to know several of them ...The reason for sign-then-encrypting/signencrypting the header is instead to keep the signature itself private. Even though the server knows who’s talking to whom, because it’s delivering all the messages, it’s better that it can’t prove what it knows.

Interestingly, \(\textsf{BoxMessage}\) reuses the group’s symmetric encryption key \(K_g\) between its calls to \(\textsf{Encrypt} \) and \(\textsf{SealPacket}\). As mentioned above, \(\textsf{SealPacket}\) will itself first sign \(K_g\) and then run another instance of \(\textsf{Encrypt} \) with \(K_g\) as the key. In total, the same value of \(K_g\) is therefore used in 3 distinct contexts.
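The three uses of \(K_g\) can be lined up as follows. This is a high-level sketch assembled from the description above, not a transcription of the Keybase code: nonces, hashing, and encodings are omitted, and \(c_{\textsf{header}}\) is used as a name for the ciphertext produced inside \(\textsf{SealPacket}\).

```latex
\begin{align*}
c_{\mathsf{body}} &\leftarrow \mathsf{Encrypt}(K_g, m)
  && \text{(context 1: key of the body encryption)} \\
s &\leftarrow \mathsf{Sign}(sk_u, (K_g, (c_{\mathsf{body}}, ad)))
  && \text{(context 2: part of the signed message)} \\
c_{\mathsf{header}} &\leftarrow \mathsf{Encrypt}(K_g, (s, c_{\mathsf{body}}, ad))
  && \text{(context 3: key of the header encryption)}
\end{align*}
```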

Symmetric Signcryption. We define symmetric signcryption as a new cryptographic primitive that combines symmetric encryption with digital signatures. We capture the setting where every user owns a signing key pair and in each group all users share a single symmetric encryption key. The encryption key is long-term, meaning it can be used an arbitrary number of times, simultaneously by all members of the group. This will allow us to formalize and analyze the \(\textsf{SealPacket}\) and \(\textsf{BoxMessage}\) schemes.

We note that the use of sender-specific encryption keys in the Sender Keys [7] and MLS [8] protocols can also be captured by symmetric signcryption. Indeed, in either protocol all symmetric encryption keys can be seen as being independently sampled, and each individual key is used only once. This can be thought of as a collection of “one-time-use” symmetric signcryption schemes.

We adapt the standard syntax of (asymmetric) signcryption to suit our setting, defining algorithms \(\textsf{SigEnc}\) and \(\textsf{VerDec}\). They both take explicit sender and group identities, nonces, and associated data.

Security of Symmetric Signcryption. We define two security notions for symmetric signcryption. The out-group authenticated encryption (OAE) security requires confidentiality and integrity of communication against an adversary that does not know the symmetric key of the group it attacks. This is required to hold even against an adversary that can assign to every user an arbitrary (possibly malformed) digital signature key pair. The in-group unforgeability (IUF) security requires unforgeability of messages sent by users whose signing keys are not known to the adversary. This is required to hold even against an adversary that can assign to every group an arbitrary (possibly malformed) symmetric key.

We show how to extend the basic \(\mathsf {Sign\text {-}{}then\text {-}{}Encrypt}\) scheme, by carefully incorporating user and group identifiers, to achieve both of our security notions. We assume strong unforgeability of the underlying digital signature scheme and authenticated-encryption security of the underlying encryption scheme.

Implementation of SealPacket and BoxMessage. In the source code of the Keybase client, \(\textsf{SealPacket}\) is implemented in [26] and \(\textsf{BoxMessage}\) is implemented in [25]. These schemes are instantiated with the nonce-based authenticated encryption scheme \(\mathsf {XSalsa20\text {-}{}Poly1305}\)  [15, 16] and the digital signature scheme \(\textsf{Ed25519}\)  [17, 18]. They also use \(\mathsf {SHA\text {-}{}512}\) and \(\mathsf {SHA\text {-}{}256}\). These hash functions do not significantly affect the design of either scheme, so we omit discussing them here; in the main body of the paper, we attempt to formalize both schemes precisely.

Keybase implements four versions of the \(\textsf{BoxMessage}\) scheme: V1, V2, V3, and V4. V1 is deprecated; the Keybase client can receive but not send messages that use V1. V2 is the default version that we formalize and analyze in this work. V3 is the same as V2, except it supports exploding messages; the body of an exploding message is encrypted using an ephemeral key instead of \(K_g\). V4 is the same as V3, except it makes all group members use a dummy (zero) signing key and instead authenticate messages using pairwise MACs.

Provable Security Analysis. We model \(\textsf{BoxMessage}\) and \(\textsf{SealPacket}\) as symmetric signcryption schemes and provide formal reductions for their IUF and OAE security. Our analysis is done in a concrete security framework [11], and in a multi-key setting; we state precise bounds on the advantage of an attacker. The analysis of \(\textsf{BoxMessage}\) largely encompasses that of \(\textsf{SealPacket}\), because \(\textsf{BoxMessage}\) uses \(\textsf{SealPacket}\) in a modular way. So we focus on the analysis of \(\textsf{BoxMessage}\) here. The main challenges arise from using \(K_g\) in 3 distinct contexts.

First, we aim to show it is hard to switch the context of the \(\mathsf {XSalsa20\text {-}{}Poly1305}\) ciphertexts \(c_{\textsf{body}}\) and \(c_{\textsf{header}}\). Both are encrypted using the same key \(K_g\), so there is a risk that an attacker could forge a valid encryption of some body-plaintext m from a known encryption of some header-plaintext \((s, c_{\textsf{body}}, ad)\), or vice versa. To rule out such attacks, we rely on an observation that every application-layer message m that is queried to be encrypted by \(\textsf{BoxMessage}\) is encoded in a specific way, whereas every header-plaintext \((s, c_{\textsf{body}}, ad)\) is expected to start with a valid \(\textsf{Ed25519}\) signature. Based on the specification of \(\textsf{Ed25519}\) we show (in the ROM and GGM) that it is hard to cast the encoding used in m as a valid \(\textsf{Ed25519}\) signature (with respect to any verification key of the adversary’s choice).

Second, we need to show that \(\mathsf {XSalsa20\text {-}{}Poly1305}\) provides authenticated encryption even for certain messages derived from its secret key. This arises because in \(\textsf{SealPacket}\) the \(\mathsf {XSalsa20\text {-}{}Poly1305}\) key \(K_g\) is first signed with \(\textsf{Ed25519}\) and then the resulting signature is encrypted using \(\mathsf {XSalsa20\text {-}{}Poly1305}\) under the same key. Here again we rely on the specification of \(\textsf{Ed25519}\). An \(\textsf{Ed25519}\) signature depends on two \(\mathsf {SHA\text {-}{}512}\) hash values of the message that is being signed, but it does not depend on the signed message beyond that. We use this (in the ROM) to eliminate the need to consider key-dependent messages and hence only require \(\mathsf {XSalsa20\text {-}{}Poly1305}\) to provide the standard notion of authenticated encryption.

We do not know any way to avoid the above analysis. The necessity to use non-standard security notions appears to be inherently implied by the design decisions made in Keybase. This could have been avoided (e.g. with our \(\mathsf {Sign\text {-}{}then\text {-}{}Encrypt}\) scheme). Overall, our reductions (in the ROM and GGM) rely on the AEAD security of \(\mathsf {XSalsa20\text {-}{}Poly1305}\), collision resistance of \(\mathsf {SHA\text {-}{}256}\) and \(\mathsf {SHA\text {-}{}512}\), and strong unforgeability of \(\textsf{Ed25519}\). We note that Keybase uses the version of \(\textsf{Ed25519}\) that was recently shown to be SUF-CMA secure [10, 20].

Limitations of Our Work. Our analysis of Keybase is intentionally narrow in scope. We perform an in-depth, algorithmic analysis of specific chat components that can be modeled as symmetric signcryption. Other analysis is outside the scope of our work, such as whether these algorithms are secure against timing attacks and whether they provide protection against message replays, reordering, or drops when used within the broader stateful chat protocol. More broadly, our analysis does not explicitly cover many other applications of cryptography in Keybase, including other versions of \(\textsf{BoxMessage} \), encryption of attachments or bot data, the initial key exchange used to agree on group keys, the public-key directory used to share user keys, and the cloud storage service. These applications are important for the overall security of Keybase, and have the potential to interplay with each other in subtle ways. For example, user signing keys are used for multiple tasks in Keybase. We believe appropriate context separation is used for these purposes (e.g. all messages signed in \(\textsf{SealPacket} \) start with \(\text { ``Keybase-Chat-2''}\)). If not, subtle cross-application attacks may be possible.

Related Work. The Hybrid Public-Key Encryption (HPKE) standard is specified in RFC 9180 [9]. Alwen, Janneck, Kiltz, and Lipp [4] analyze the “pre-shared key” modes from RFC 9180. They cast the \(\textsf{HPKE}_{\textsf{AuthPSK}}\) mode as an asymmetric signcryption scheme that is augmented with a pre-shared symmetric key, and they define the corresponding security notions. They analyze the security that is achieved by \(\textsf{HPKE}_{\textsf{AuthPSK}}\) depending on which combinations of keys are secure. Our definitions are similar in the sense that both works define a signcryption-type primitive that in addition uses a symmetric key. However, the algorithms in [4] use one more set of keys, and the definitions in [4] are stated in the two-user setting. In essence, our primitives are similar in form, but are tailored to be used as tools in different settings.

2 Preliminaries

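We use standard pseudocode notation and assume familiarity with hash functions, random oracles, nonce-based encryption, and digital signatures. Collision resistance of a hash function \(\textsf{H}\) is defined by the advantage \(\textsf{Adv}^{\textsf{CR}}_{\textsf{H}}(\mathcal {A})\) of an adversary \(\mathcal {A}\) in finding two distinct inputs with the same hash value.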

2.1 Standard Security Notions in a Multi-key Setting

Key Management Oracles. Throughout this work we consider multi-key security notions. Adversaries in security games will be provided with three types of key management oracles. These oracles will allow (1) sampling new honest (i.e. challenge) keys, (2) exposing existing honest keys, and (3) adding corrupt keys of the adversary’s choice. When an honest key is exposed it becomes corrupt, but it was initially sampled from a correct key distribution. In contrast, when an adversary adds its own corrupt key, such a key could be maliciously crafted in an arbitrary way. In basic security notions an adversary cannot benefit from crafting corrupt keys, because no challenge queries are permitted with respect to such keys. This changes for more complex systems built from more than one keyed primitive when some security is required to hold even if some underlying secrets are exposed. The ability to use malicious keys was modeled in prior work on (asymmetric) signcryption [14], and will be needed in this work.

Our security model for symmetric signcryption in Sect. 3 will define two sets of key management oracles. The set of user oracles \(\textsc {U}= \{{\textsc {NewHonUser},} {\textsc {ExposeUser}, \textsc {NewCorrUser}}\}\) will manage the keys for a digital signature scheme, whereas the set of group oracles \(\textsc {G}= \{{\textsc {NewHonGroup}, \textsc {ExposeGroup},} {\textsc {NewCorrGroup}}\}\) will manage the keys for a nonce-based encryption scheme. We adopt the same terminology and notation across all of the multi-key security notions; each notion for an asymmetric primitive will define a set of user oracles \(\textsc {U}\), and each notion for a symmetric primitive will define a set of group oracles \(\textsc {G}\). For consistency, we include oracles for adding corrupt keys even when an adversary cannot benefit from using them. When simulating user oracles in a security reduction, we write \(\textsc {Sim}\textsc {U}\) to denote the set \(\{{\textsc {Sim}\textsc {NewHonUser},} {\textsc {Sim}\textsc {ExposeUser}, \textsc {Sim}\textsc {NewCorrUser}}\}\) and do similarly for group oracles.
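The bookkeeping behind the group oracles can be sketched in a few lines. The following is an illustration under stated assumptions rather than the games' actual pseudocode: the 32-byte key length, the method names, and the use of exceptions for disallowed queries are choices made here, and group membership is ignored.

```python
import secrets


class GroupOracles:
    """Bookkeeping for G = {NewHonGroup, ExposeGroup, NewCorrGroup}."""

    def __init__(self, key_len: int = 32) -> None:
        self.key_len = key_len  # assumed key length in bytes
        self.K: dict[str, bytes] = {}  # group identifier -> symmetric key
        self.group_is_corrupt: dict[str, bool] = {}

    def new_hon_group(self, g: str) -> None:
        """Sample a fresh honest (challenge) key for a new group g."""
        if g in self.K:
            raise ValueError("group already exists")
        self.K[g] = secrets.token_bytes(self.key_len)
        self.group_is_corrupt[g] = False

    def expose_group(self, g: str) -> bytes:
        """Reveal an existing key; the group counts as corrupt from now on."""
        if g not in self.K:
            raise ValueError("unknown group")
        self.group_is_corrupt[g] = True
        return self.K[g]

    def new_corr_group(self, g: str, key: bytes) -> None:
        """Register a key chosen by the adversary, possibly malformed."""
        if g in self.K:
            raise ValueError("group already exists")
        self.K[g] = key
        self.group_is_corrupt[g] = True


oracles = GroupOracles()
oracles.new_hon_group("g1")
oracles.new_corr_group("g2", b"not a well-formed key")
leaked = oracles.expose_group("g1")
```

The difference between the two kinds of corruption is visible in the stored keys: the exposed key of `g1` was sampled from the correct distribution, whereas the key of `g2` is whatever the adversary supplied.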

Nonce-Based Authenticated Encryption. Consider game \(\mathscr {G}^{\textsf{AEAD}}\) of Fig. 4 for nonce-based encryption scheme \(\textsf{NE}\) and adversary \(\mathcal {A}_{\textsf{AEAD}}\). The advantage of \(\mathcal {A}_{\textsf{AEAD}}\) in breaking the \(\textsf{AEAD}\) security of \(\textsf{NE}\) is defined as \(\textsf{Adv}^{\textsf{AEAD}}_{\textsf{NE}}(\mathcal {A}_{\textsf{AEAD}}) = 2 \cdot \Pr [\mathscr {G}^{\textsf{AEAD}}_{\textsf{NE}}(\mathcal {A}_{\textsf{AEAD}})] - 1\). The game samples a challenge bit b, and \(\mathcal {A}_{\textsf{AEAD}}\) is required to guess it. Adversary \(\mathcal {A}_{\textsf{AEAD}}\) is given the group oracles \(\textsc {G}\), encryption oracle \(\textsc {Enc}\), and decryption oracle \(\textsc {Dec}\). Among the group oracles, \(\textsc {NewHonGroup}\) creates new groups with honestly generated \(\textsf{NE}\) keys, \(\textsc {ExposeGroup}\) reveals the keys of existing groups, and \(\textsc {NewCorrGroup}\) instantiates new corrupt groups with \(\textsf{NE}\) keys of \(\mathcal {A}_{\textsf{AEAD}}\)’s choice. We require \(\textsf{NE}\) to be nonce-misuse resistant [30], meaning that no challenge message m is allowed to be queried across two distinct calls to \(\textsc {Enc}\) with respect to the same set of \({g, n, ad}\). A corrupt group key can only be used to call \(\textsc {Enc}\) with \(m_0 = m_1\), and a group key cannot be exposed after it has been used in \(\textsc {Enc}\) with \(m_0 \ne m_1\). The decryption oracle \(\textsc {Dec}\) takes \(g, n, c, ad\) as input and decrypts this to the corresponding plaintext m. Following the all-in-one style of [30, 32], it returns \(\bot \) if \(b=0\), and it returns m otherwise. This oracle never decrypts a ciphertext with an exposed group’s key, and it never decrypts ciphertexts previously produced by \(\textsc {Enc}\) (with the same \(g, c, ad\)).

Fig. 4. Left pane: Game defining authenticated-encryption security of a nonce-based encryption scheme \(\textsf{NE}\). Right pane: Game defining key-recovery security of \(\textsf{NE}\). Bottom pane: Group oracles \(\textsc {G}= \{{\textsc {NewHonGroup}, \textsc {ExposeGroup}, \textsc {NewCorrGroup}}\}\) that are provided to an adversary in either game, except that the code only appears in the \(\textsf{AEAD}\) security game.

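Key-Recovery Security of NE. Consider game \(\mathscr {G}^{\textsf{KR}}\) of Fig. 4 for nonce-based encryption scheme \(\textsf{NE}\) and adversary \(\mathcal {A}_{\textsf{KR}}\). The advantage of \(\mathcal {A}_{\textsf{KR}}\) in breaking the \(\textsf{KR}\) security of \(\textsf{NE}\) is defined as \(\textsf{Adv}^{\textsf{KR}}_{\textsf{NE}}(\mathcal {A}_{\textsf{KR}}) = \Pr [\mathscr {G}^{\textsf{KR}}_{\textsf{NE}}(\mathcal {A}_{\textsf{KR}})]\).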

Fig. 5. Game defining strong unforgeability of a digital signature scheme \(\textsf{DS}\), where \(\textsc {U}= \{{\textsc {NewHonUser}, \textsc {ExposeUser}, \textsc {NewCorrUser}}\}\).

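Strong Unforgeability of Digital Signatures. Consider game \(\mathscr {G}^{\textsf{SUFCMA}}\) of Fig. 5 for signature scheme \(\textsf{DS}\) and adversary \(\mathcal {A}_{\textsf{SUFCMA}}\). The advantage of \(\mathcal {A}_{\textsf{SUFCMA}}\) in breaking the \(\textsf{SUFCMA}\) security of \(\textsf{DS}\) is defined as \(\textsf{Adv}^{\textsf{SUFCMA}}_{\textsf{DS}}(\mathcal {A}_{\textsf{SUFCMA}}) = \Pr [\mathscr {G}^{\textsf{SUFCMA}}_{\textsf{DS}}(\mathcal {A}_{\textsf{SUFCMA}})]\).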

3 Symmetric Signcryption

In this section we define syntax and security for multi-user symmetric signcryption. In symmetric signcryption, a user encrypts messages using their signing key and a symmetric key shared by a group of users. We want that nobody outside a group can learn what messages are being encrypted, and nobody at all can forge a message as having come from someone other than themselves.

Syntax. A symmetric signcryption scheme \(\textsf{SS}\) specifies algorithms \(\textsf{SS}.\textsf{UserKg}\), \(\textsf{SS}.\textsf{SigEnc}\), \(\textsf{SS}.\textsf{VerDec}\), where \(\textsf{SS}.\textsf{VerDec}\) is deterministic. These algorithms use syntax \((sk, vk) {\leftarrow }{\$}\ \textsf{SS}.\textsf{UserKg}\), \(c {\leftarrow }{\$}\ \textsf{SS}.\textsf{SigEnc}(g, K_{g}, u, sk_{u}, n, m, ad)\), and \(m \leftarrow \textsf{SS}.\textsf{VerDec}(g, K_{g}, u, vk_{u}, n, c, ad)\). Associated to \(\textsf{SS}\) is a group-key length \(\textsf{SS}.\textsf{gkl}\in \mathbb {N}\), a nonce space \(\textsf{SS}.\textsf{NS}\), a plaintext space \(\textsf{SS}.\textsf{MS}\subseteq \{0,1\}^*\), and an associated-data space \(\textsf{SS}.\textsf{AD}\). The user’s key generation algorithm \(\textsf{SS}.\textsf{UserKg}\) returns a key pair (sk, vk) where sk is a signing key and vk is the corresponding verification key. The signcryption algorithm \(\textsf{SS}.\textsf{SigEnc}\) takes a group’s identifier \(g \in \{0,1\}^*\) and its symmetric key \(K_{g}\in \{0,1\}^{\textsf{SS}.\textsf{gkl}}\), a sender’s identifier \(u \in \{0,1\}^*\) and its signing key \(sk_{u}\), a nonce \(n \in \textsf{SS}.\textsf{NS}\), a plaintext \(m \in \textsf{SS}.\textsf{MS}\), and associated data \(ad\in \textsf{SS}.\textsf{AD}\); it returns a signcryption ciphertext c. The deterministic unsigncryption algorithm \(\textsf{SS}.\textsf{VerDec}\) takes \(g, K_{g}, u, vk_{u}, n, c, ad\), where \(vk_{u}\) is the verification key of the sender u; it returns a plaintext \(m \in \{0,1\}^* \cup \{\bot \}\), where \(\bot \) indicates a failure to recover a plaintext. We say that \(\textsf{SS}\) is deterministic if \(\textsf{SS}.\textsf{SigEnc}\) is deterministic. Correctness is defined in the natural way.
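The syntax can be mirrored as a typed interface, together with a round-trip check that reflects one natural reading of correctness (unsigncrypting an honestly produced ciphertext under matching inputs returns the original plaintext). This is a sketch: the method names, byte-string types, the 32-byte group key, the 24-byte nonce, and the placeholder `IdentityScheme` are all assumptions made for illustration, and the placeholder provides no security at all.

```python
import secrets
from typing import Optional, Protocol


class SymmetricSigncryption(Protocol):
    """The three algorithms of a symmetric signcryption scheme SS."""

    def user_kg(self) -> tuple[bytes, bytes]:
        """Return a key pair (sk, vk)."""
        ...

    def sig_enc(self, g: bytes, K_g: bytes, u: bytes, sk_u: bytes,
                n: bytes, m: bytes, ad: bytes) -> bytes:
        """Return a signcryption ciphertext c."""
        ...

    def ver_dec(self, g: bytes, K_g: bytes, u: bytes, vk_u: bytes,
                n: bytes, c: bytes, ad: bytes) -> Optional[bytes]:
        """Return the plaintext m, or None (standing in for the failure symbol)."""
        ...


class IdentityScheme:
    """Hypothetical placeholder: the 'ciphertext' is the plaintext. Insecure."""

    def user_kg(self) -> tuple[bytes, bytes]:
        return b"", b""

    def sig_enc(self, g, K_g, u, sk_u, n, m, ad) -> bytes:
        return m

    def ver_dec(self, g, K_g, u, vk_u, n, c, ad) -> Optional[bytes]:
        return c


def roundtrip_ok(ss: SymmetricSigncryption, m: bytes) -> bool:
    """Check the round-trip condition on one sample input."""
    g, u, ad = b"group", b"user", b"header"
    n = secrets.token_bytes(24)    # assumed nonce length
    K_g = secrets.token_bytes(32)  # assumed group-key length
    sk_u, vk_u = ss.user_kg()
    c = ss.sig_enc(g, K_g, u, sk_u, n, m, ad)
    return ss.ver_dec(g, K_g, u, vk_u, n, c, ad) == m


result = roundtrip_ok(IdentityScheme(), b"hello")
```

Note that `ver_dec` receives the verification key while `sig_enc` receives the signing key, so only the holder of \(sk_{u}\) is supposed to be able to produce ciphertexts that are attributed to u.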

Fig. 6. Left pane: Game defining in-group unforgeability \(\textsf{IUF}\) of a symmetric signcryption scheme \(\textsf{SS}\) with respect to a ciphertext-triviality predicate \(\textsf{pred}_{\textsf{trivial}}^{\textsf{auth}}\). Right pane: Game defining out-group authenticated-encryption security \(\textsf{OAE}\) of \(\textsf{SS}\) with respect to a ciphertext-triviality predicate \(\textsf{pred}_{\textsf{trivial}}^{\textsf{sec}}\) and an output-guarding function \(\textsf{func}_{\textsf{out}}^{\textsf{sec}}\). Bottom pane: User oracles \(\textsc {U}= \{{\textsc {NewHonUser}, \textsc {ExposeUser}, \textsc {NewCorrUser}}\}\) and group oracles \(\textsc {G}= \{{\textsc {NewHonGroup}, \textsc {ExposeGroup}, \textsc {NewCorrGroup}}\}\) that are provided to an adversary in either game, except that the code only appears in the \(\textsf{OAE}\) security game.

3.1 In-Group Unforgeability

The strongest variant of in-group unforgeability requires that an attacker cannot modify anything about ciphertexts. We also capture weaker variants. For example, the \(\textsf{SealPacket}\) encryption algorithm in Keybase (as defined in Sect. 4) uses a signing key to bind its ciphertexts to a group’s symmetric key but not to a group’s identifier. So we parameterize our security definition in order to capture the type of authenticity that is as restrictive as possible except for allowing (what can be described as) cross-group forgeries.

IUF Game. Consider game \(\mathscr {G}^{\textsf{IUF}}\) of Fig. 6, defined for symmetric signcryption scheme \(\textsf{SS}\), ciphertext-triviality predicate \(\textsf{pred}_{\textsf{trivial}}^{\textsf{auth}}\), and adversary \(\mathcal {A}_{\textsf{IUF}}\). The advantage of \(\mathcal {A}_{\textsf{IUF}}\) in breaking the \(\textsf{IUF}\) security of \(\textsf{SS}\) is defined as \(\textsf{Adv}^{\textsf{IUF}}_{\textsf{SS}, \textsf{pred}_{\textsf{trivial}}^{\textsf{auth}}}(\mathcal {A}_{\textsf{IUF}}) = \Pr [\mathscr {G}^{\textsf{IUF}}_{\textsf{SS}, \textsf{pred}_{\textsf{trivial}}^{\textsf{auth}}}(\mathcal {A}_{\textsf{IUF}})]\). Adversary \(\mathcal {A}_\textsf{IUF}\) is given access to user oracles \(\textsc {U}\), group oracles \(\textsc {G}\), encryption oracle \(\textsc {SigEnc}\), and decryption oracle \(\textsc {VerDec}\). Its goal is to set the \(\textsf{win}\) flag by forging a ciphertext for an honest user.

Among user oracles, \(\textsc {NewHonUser}\) creates honest users with honestly generated signing keys, \(\textsc {NewCorrUser}\) creates corrupt users with malicious signing keys, and \(\textsc {ExposeUser}\) exposes signing keys of existing users. Among group oracles, \(\textsc {NewHonGroup}\) creates honest groups with honestly sampled symmetric keys, \(\textsc {NewCorrGroup}\) creates corrupt groups with malicious symmetric keys, and \(\textsc {ExposeGroup}\) exposes symmetric keys of existing groups. Oracles \(\textsc {NewHonGroup}\) and \(\textsc {NewCorrGroup}\) take as input a set \(\textsf{users}\) identifying the new group’s users; the encryption and decryption oracles then disallow queries that match a group to a non-member user. The user and group oracles use tables \(\mathsf {user\_is\_corrupt}\) and \(\mathsf {group\_is\_corrupt}\) in order to keep track of the users and groups whose keys are not secure, respectively. The \(\textsf{IUF}\) game never checks \(\mathsf {group\_is\_corrupt}\), deliberately giving the adversary full control over group keys.

The encryption oracle \(\textsc {SigEnc}\) takes \(({g, u, n, m, ad})\) and returns a ciphertext c that is produced by running \(\textsf{SS}.\textsf{SigEnc}({g, \textsf{K}[g], u , \textsf{sk}[u], n, m, ad})\). Here note that the group and user keys \(\textsf{K}[g]\) and \(\textsf{sk}[u]\) are the only two inputs to \(\textsf{SS}.\textsf{SigEnc}\) that are not directly chosen by the adversary at the moment of querying the \(\textsc {SigEnc}\) oracle. At the end of each \(\textsc {SigEnc}\) query, the set \(C\) is updated to add the tuple \(((g, u, n, m, ad),c)\) that can be interpreted as containing the input-output transcript of this query.

The decryption oracle \(\textsc {VerDec}\) takes \(({g, u, n, c, ad})\) and returns the message m that is recovered by running \(\textsf{SS}.\textsf{VerDec}({g, \textsf{K}[g], u, \textsf{vk}[u], n, c, ad})\). Keys \(\textsf{K}[g]\) and \(\textsf{vk}[u]\) are the only inputs to \(\textsf{SS}.\textsf{VerDec}\) not directly chosen by the adversary. If \(m \ne \bot \), then the oracle determines if the current oracle query is a valid forgery and sets the \(\textsf{win}\) flag if so. In particular, \(\textsc {VerDec}\) builds the tuple \(z = ((g, u, n, m, ad), c)\) with all input and output values of the current decryption query. It checks z against the set \(C\) that contains the input-output behavior of all the prior encryption queries. If z is determined to be trivially obtainable from the information in \(C\), then \(\textsc {VerDec}\) exits early (with m as its output value); otherwise it sets the \(\textsf{win}\) flag. This check is performed by the ciphertext-triviality predicate \(\textsf{pred}_{\textsf{trivial}}^{\textsf{auth}}\). We will describe the syntax and the sample variants of \(\textsf{pred}_{\textsf{trivial}}^{\textsf{auth}}\) below.
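The interplay between the two oracles, the transcript set \(C\), and the \(\textsf{win}\) flag can be sketched as follows. This is a simplified illustration, not the game of Fig. 6: group-membership checks and the remaining user and group oracles are left out, the honest-user check before setting the flag reflects our reading of the stated adversarial goal, and `NoAuthScheme` is a hypothetical scheme that never rejects, included only so that a forgery can be observed.

```python
from typing import Callable, Optional

# One transcript entry: ((g, u, n, m, ad), c)
Record = tuple[tuple[bytes, bytes, bytes, bytes, bytes], bytes]


def pred_suf(z: Record, C: set[Record]) -> bool:
    """Only exact replays of earlier SigEnc outputs count as trivial."""
    return z in C


class IUFGame:
    """Skeleton of the IUF game; membership checks are left out."""

    def __init__(self, scheme, pred: Callable[[Record, set[Record]], bool]) -> None:
        self.scheme, self.pred = scheme, pred
        self.K: dict[bytes, bytes] = {}
        self.sk: dict[bytes, bytes] = {}
        self.vk: dict[bytes, bytes] = {}
        self.user_is_corrupt: dict[bytes, bool] = {}
        self.C: set[Record] = set()
        self.win = False

    def new_hon_user(self, u: bytes) -> bytes:
        self.sk[u], self.vk[u] = self.scheme.user_kg()
        self.user_is_corrupt[u] = False
        return self.vk[u]

    def new_corr_group(self, g: bytes, key: bytes) -> None:
        self.K[g] = key  # the adversary fully controls group keys in IUF

    def sig_enc(self, g, u, n, m, ad) -> bytes:
        c = self.scheme.sig_enc(g, self.K[g], u, self.sk[u], n, m, ad)
        self.C.add(((g, u, n, m, ad), c))  # record the transcript
        return c

    def ver_dec(self, g, u, n, c, ad) -> Optional[bytes]:
        m = self.scheme.ver_dec(g, self.K[g], u, self.vk[u], n, c, ad)
        if m is None:
            return None
        z = ((g, u, n, m, ad), c)
        if self.pred(z, self.C):
            return m  # trivially obtainable: exit early
        if not self.user_is_corrupt[u]:
            self.win = True  # non-trivial forgery for an honest user
        return m


class NoAuthScheme:
    """Hypothetical scheme that never rejects; it has no IUF security."""

    def user_kg(self):
        return b"sk", b"vk"

    def sig_enc(self, g, K_g, u, sk_u, n, m, ad):
        return m

    def ver_dec(self, g, K_g, u, vk_u, n, c, ad):
        return c


game = IUFGame(NoAuthScheme(), pred_suf)
game.new_hon_user(b"alice")
game.new_corr_group(b"g", b"\x00" * 32)
c = game.sig_enc(b"g", b"alice", b"n1", b"hi", b"")
replayed = game.ver_dec(b"g", b"alice", b"n1", c, b"")
win_after_replay = game.win
game.ver_dec(b"g", b"alice", b"n2", b"forged", b"")
win_after_forgery = game.win
```

Replaying the honest ciphertext leaves the flag unset because the resulting tuple is in \(C\), while the second query decrypts successfully without a matching transcript entry and therefore counts as a forgery.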

Ciphertext-Triviality Predicates. The \(\textsf{IUF}\) security game is parameterized by ciphertext-triviality predicate \(\textsf{pred}_{\textsf{trivial}}^{\textsf{auth}}\) (we will also parameterize the \(\textsf{OAE}\) game with a ciphertext-triviality predicate, denoted \(\textsf{pred}_{\textsf{trivial}}^{\textsf{sec}}\)). Predicate \(\textsf{pred}_{\textsf{trivial}}^{\textsf{auth}}\) takes a tuple \(z = ((g, u, n, m, ad),c)\) and a set \(C\) as input, where \(C\) contains tuples of the same format. Here z describes the input-output values of the current query to the \(\textsc {VerDec}\) oracle and each element of \(C\) contains an input-output transcript of a prior \(\textsc {SigEnc}\) oracle query. Predicate \(\textsf{pred}_{\textsf{trivial}}^{\textsf{auth}}\) returns \(\textsf{true}\) if z is considered to be trivially forgeable based on the information in \(C\) and \(\textsf{false}\) otherwise.

Fig. 7. Sample ciphertext-triviality predicates which capture rules for deciding if a successfully decrypted \(\textsc {VerDec}\) query was trivially obtainable or forgeable.

In Fig. 7 we define several ciphertext-triviality predicates. Predicate \(\textsf{pred}_{\textsf{trivial}}^{\textsf{suf}}\) checks if \(z \in C\), capturing the strongest possible level of authenticity. This requires that only prior outputs of \(\textsc {SigEnc}\) can be successfully queried to the \(\textsc {VerDec}\) oracle; any other successful decryption query causes the adversary to win the \(\textsf{IUF}\) game. This predicate can be thought of as making the \(\textsf{IUF}\) game capture the “strong” unforgeability of ciphertexts in our group setting. One could capture existential unforgeability by considering the predicate \(\textsf{pred}_{\textsf{trivial}}^{\textsf{euf}}\) that does not allow the adversary to win by merely producing new ciphertexts that decrypt to some tuple \((g, u, n, m, ad)\) previously queried to \(\textsc {SigEnc}\). Predicates \(\textsf{pred}_{\textsf{trivial}}^{\mathsf {suf\text {-}{}except\text {-}{}group}}\) and \(\textsf{pred}_{\textsf{trivial}}^{\mathsf {suf\text {-}{}except\text {-}{}user}}\) capture the authenticity of schemes where a ciphertext encrypting \((g, u, n, m, ad)\) is not bound to the group’s identifier or to the user’s identifier, respectively. We use \(\textsf{pred}_{\textsf{trivial}}^{\textsf{suf}}\), \(\textsf{pred}_{\textsf{trivial}}^{\mathsf {suf\text {-}{}except\text {-}{}group}}\) and \(\textsf{pred}_{\textsf{trivial}}^{\mathsf {suf\text {-}{}except\text {-}{}user}}\) in our security analysis of Keybase. In this work, we do not use \(\textsf{pred}_{\textsf{trivial}}^{\mathsf {suf\text {-}{}except\text {-}{}user}}\) with the \(\textsf{IUF}\) game – we need it for \(\textsf{OAE}\).
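The following Python sketch illustrates predicates of this kind. It follows the prose description above rather than transcribing Fig. 7. The flattened `Entry` tuple and the function names are illustrative, and the treatment of the two "except" variants (ignoring g or u when comparing against \(C\)) is an assumption about how the lack of binding is modeled.

```python
from typing import NamedTuple, Set


class Entry(NamedTuple):
    """One transcript tuple ((g, u, n, m, ad), c), flattened for convenience."""
    g: str
    u: str
    n: bytes
    m: bytes
    ad: bytes
    c: bytes


def pred_suf(z: Entry, C: Set[Entry]) -> bool:
    # Strong unforgeability: z is trivial only if it is an exact prior SigEnc transcript.
    return z in C


def pred_euf(z: Entry, C: Set[Entry]) -> bool:
    # Existential unforgeability: a fresh ciphertext for a previously
    # queried (g, u, n, m, ad) also counts as trivial.
    return any(e[:5] == z[:5] for e in C)


def pred_suf_except_group(z: Entry, C: Set[Entry]) -> bool:
    # Assumed reading: like pred_suf, but the group identifier is ignored.
    return any(e._replace(g=z.g) == z for e in C)


def pred_suf_except_user(z: Entry, C: Set[Entry]) -> bool:
    # Assumed reading: like pred_suf, but the user identifier is ignored.
    return any(e._replace(u=z.u) == z for e in C)
```

Under this reading, replaying an honest ciphertext under a different group identifier is a win against `pred_suf` but is treated as trivial by `pred_suf_except_group`.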

3.2 Out-Group Authenticated Encryption

The strongest version of the out-group AE security requires that an attacker outside a chat group can neither learn any information about the exchanged messages, nor modify the exchanged ciphertexts in any way. We also capture weaker variants of this security notion. For example, the \(\textsf{SealPacket}\) encryption algorithm (as defined in Sect. 4) does not use a group’s symmetric key to explicitly bind its ciphertexts to a user’s signing key or a user’s identifier when used in isolation. So we capture a variant of out-group AE security that is as restrictive as possible except for allowing an attacker to violate the sender’s authenticity within any particular group.

OAE Game. Consider the game of Fig. 6 for symmetric signcryption scheme \(\textsf{SS}\), ciphertext-triviality predicate \(\textsf{pred}_{\textsf{trivial}}^{\textsf{auth}}\), output-guarding function \(\textsf{func}_{\textsf{out}}^{\textsf{sec}}\), and adversary \(\mathcal {A}_{\textsf{OAE}}\). The advantage of \(\mathcal {A}_{\textsf{OAE}}\) in breaking the \(\textsf{OAE}\) security of \(\textsf{SS}\) is denoted \(\textsf{Adv}^{\textsf{OAE}}_{\textsf{SS},\textsf{pred}_{\textsf{trivial}}^{\textsf{auth}},\textsf{func}_{\textsf{out}}^{\textsf{sec}}}(\mathcal {A}_{\textsf{OAE}})\). Adversary \(\mathcal {A}_\textsf{OAE}\) is given access to user and group oracles \(\textsc {U}\) and \(\textsc {G}\) and to the encryption and decryption oracles \(\textsc {SigEnc}\) and \(\textsc {VerDec}\). The goal of the adversary is to guess the challenge bit b. Our security game is defined in the all-in-one style of [29, 32], where an adversary can learn the challenge bit by forging a ciphertext to its decryption oracle.

The user and group oracles in the \(\textsf{OAE}\) game are defined as in the \(\textsf{IUF}\) game, except that the \(\textsf{OAE}\) game does not allow calling the \(\textsc {ExposeGroup}\) oracle to expose the key of a group that was previously used for a left-or-right challenge-encryption query (as explained below). The \(\textsf{OAE}\) game never checks the contents of \(\mathsf {user\_is\_corrupt}\), deliberately giving the adversary full control over user keys.

The encryption oracle \(\textsc {SigEnc}\) takes \(({g, u, n, m_0, m_1, ad})\) and returns a ciphertext c by running \(\textsf{SS}.\textsf{SigEnc}({g, \textsf{K}[g], u , \textsf{sk}[u], n, m_b, ad})\). The group and user keys \(\textsf{K}[g]\) and \(\textsf{sk}[u]\) are the only inputs to \(\textsf{SS}.\textsf{SigEnc}\) not directly chosen by the adversary querying the \(\textsc {SigEnc}\) oracle (and the encrypted message \(m_b\) depends on the challenge bit). The \(\textsc {SigEnc}\) query requires that \(\left| m_0\right| = \left| m_1\right| \) and will only use insecure group keys for non-challenge encryptions (i.e. for \(m_0 = m_1\)). This \(\textsc {SigEnc}\) oracle captures nonce-misuse resistance [30], using the sets \(N_d\) to prevent trivial wins. At the end of \(\textsc {SigEnc}\) queries, the set \(C\) is updated to add the tuple \(((g, u, n, m_b, ad),c)\), and the set \(Q\) is updated to add the tuple \(((g, u, n, m_0, m_1, ad),c)\). Here the set \(Q\) contains the input-output “transcript” of \(\textsc {SigEnc}\) queries from the adversary’s point of view, whereas the set \(C\) is more informative because it contains the message that was actually encrypted. We will explain the purpose of these sets below.

The decryption oracle \(\textsc {VerDec}\) takes \(({g, u, n, c, ad})\) and returns the message m output by \(\textsf{SS}.\textsf{VerDec}({g, \textsf{K}[g], u, \textsf{vk}[u], n, c, ad})\). Keys \(\textsf{K}[g], \textsf{vk}[u]\) are the only inputs to \(\textsf{SS}.\textsf{VerDec}\) not directly chosen by the adversary querying the \(\textsc {VerDec}\) oracle. The \(\textsc {VerDec}\) oracle disallows queries with corrupt group keys; if an adversary knows a group’s key then it can decrypt ciphertexts for the group on its own. If \(\textsf{SS}.\textsf{VerDec}\) recovers a non-\(\bot \) message m and the end of the \(\textsc {VerDec}\) oracle is reached, then the challenge bit is meant to be revealed through returning m if \(b=1\) and \(\bot \) otherwise. However, this intuition is not precise; it depends on how \(\textsc {VerDec}\) responds to queries that are identified as being trivially forgeable. Similarly to how trivial forgeries were handled in the \(\textsf{IUF}\) game, here \(\textsc {VerDec}\) builds \(z = ((g, u, n, m, ad), c)\) and uses a ciphertext-triviality predicate \(\textsf{pred}_{\textsf{trivial}}^{\textsf{sec}}\) to check z against the set \(C\) from \(\textsc {SigEnc}\). If z is considered not trivially obtainable from the information in \(C\), then \(\textsc {VerDec}\) proceeds to its last instruction that returns \(\bot \) or m depending on the challenge bit. Otherwise, \(\textsc {VerDec}\) should return an output that does not depend on the challenge bit to prevent trivial wins. Such an output is produced by the output-guarding function \(\textsf{func}_{\textsf{out}}^{\textsf{sec}}\), i.e. \(\textsc {VerDec}\) returns the output of \(\textsf{func}_{\textsf{out}}^{\textsf{sec}}(z, Q)\). We now describe the syntax and variants of \(\textsf{func}_{\textsf{out}}^{\textsf{sec}}\).

Output-Guarding Functions. The \(\textsf{OAE}\) game can be parameterized by different choices of an output-guarding function \(\textsf{func}_{\textsf{out}}^{\textsf{sec}}\). We define \(\textsf{func}_{\textsf{out}}^{\textsf{sec}}\) to take a tuple \(z = ((g, u, n, m, ad),c)\) and a set \(Q\) as input, where \(Q\) contains tuples with the format \(((g, u, n, m_0, m_1, ad),c)\). Here z describes the input-output values of a single query to the \(\textsc {VerDec}\) oracle, and each element of \(Q\) specifies the input-output of a prior \(\textsc {SigEnc}\) oracle query. At a high level, z contains the message m that was recovered during an ongoing \(\textsc {VerDec}\) call, and m is the only value in \(z, Q\) not necessarily known by the adversary. One might want to define \(\textsc {VerDec}\) to return m whenever the input is identified as a trivial forgery, but m could potentially trivially reveal the challenge bit. So one could roughly think of \(\textsf{func}_{\textsf{out}}^{\textsf{sec}}\) as the function that should enable \(\textsc {VerDec}\) to return m when possible. However, it should determine – from z and \(Q\) – if m would trivially help the adversary win and then “guard” \(\textsc {VerDec}\) against returning this m.

Fig. 8. Sample output-guarding functions \(\textsf{func}_{\textsf{out}}^{\bot }\) and \(\textsf{func}_{\textsf{out}}^{\mathsf {silence\text {-}{}with\text {-}{}m_1}}\). Function \(\textsf{func}_{\textsf{out}}^{\mathsf {silence\text {-}{}with\text {-}{}m_1}}\) is parameterized by a ciphertext-triviality predicate \(\textsf{pred}_{\textsf{trivial}}^{}\).

In Fig. 8 we define two output-guarding functions. The function \(\textsf{func}_{\textsf{out}}^{\bot }\) always returns \(\bot \). This provides no useful information to the adversary and so captures a comparatively weaker security notion. The function \(\textsf{func}_{\textsf{out}}^{\mathsf {silence\text {-}{}with\text {-}{}m_1}}[\textsf{pred}_{\textsf{trivial}}^{}]\) is parameterized by an arbitrary ciphertext-triviality predicate \(\textsf{pred}_{\textsf{trivial}}^{}\) and captures the following logic. For every element in \(Q\) that describes a challenge encryption (i.e. \(m_0 \ne m_1\)) performed by \(\textsc {SigEnc}\), this function checks whether z is trivially forgeable based on the information that the adversary could have learned from the corresponding response. The check covers both choices of \(b \in \{0,1\}\), so it passes if z would be trivially forgeable for both choices of b or for only one of them. The output-guarding function returns \(m_1\) when this condition passes. If no element of \(Q\) triggered the above, then the output-guarding function returns the m contained in z, i.e. the actual message recovered in \(\textsc {VerDec}\).
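The logic just described can be sketched in Python as follows. This is an illustration written from the prose, not a transcription of Fig. 8. The tuple types `Z` and `QEntry`, the use of `None` for \(\bot \), and the bundled `pred_suf` predicate are choices made for the example.

```python
from typing import Callable, NamedTuple, Optional, Set


class Z(NamedTuple):
    """A VerDec transcript ((g, u, n, m, ad), c), flattened."""
    g: str
    u: str
    n: bytes
    m: bytes
    ad: bytes
    c: bytes


class QEntry(NamedTuple):
    """A SigEnc transcript ((g, u, n, m0, m1, ad), c), flattened."""
    g: str
    u: str
    n: bytes
    m0: bytes
    m1: bytes
    ad: bytes
    c: bytes


Pred = Callable[[Z, Set[Z]], bool]


def pred_suf(z: Z, C: Set[Z]) -> bool:
    # Example predicate: z is trivial only if it is an exact prior transcript.
    return z in C


def func_out_bot(z: Z, Q: Set[QEntry]) -> Optional[bytes]:
    # Always returns "bottom" (modeled as None), regardless of z and Q.
    return None


def func_out_silence_with_m1(pred: Pred) -> Callable[[Z, Set[QEntry]], bytes]:
    def func_out(z: Z, Q: Set[QEntry]) -> bytes:
        for q in Q:
            if q.m0 == q.m1:
                continue  # not a challenge encryption
            # Test triviality against what C would contain for b = 0 and b = 1.
            for m_b in (q.m0, q.m1):
                z_b = Z(q.g, q.u, q.n, m_b, q.ad, q.c)
                if pred(z, {z_b}):
                    return q.m1  # silence: the output is independent of b
        return z.m  # no challenge query matched, so m is safe to return
    return func_out
```

Returning \(m_1\) for either choice of b is what keeps the output independent of the challenge bit when a challenge ciphertext is replayed to the decryption oracle.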

The Use of \(\textsf{func}_{\textsf{out}}^{\mathsf {silence\text {-}{}with\text {-}{}m_1}}\) in Our Work. We target \(\textsf{func}_{\textsf{out}}^{\mathsf {silence\text {-}{}with\text {-}{}m_1}}[\textsf{pred}_{\textsf{trivial}}^{}]\) as the output-guarding function that provides the strongest possible security guarantees for the schemes that we analyze in this work. For every \(\textsf{pred}_{\textsf{trivial}}^{}\) we use, \(\textsf{pred}_{\textsf{trivial}}^{}(z, \{((g, u, n, m^*, ad),c)\})\) can only be \(\textsf{true}\) when z contains \(m^*\). So for elements of \(Q\) with \(m_0 \ne m_1\) only one of the two if conditions can pass, meaning it is necessary to silence the output. Otherwise the adversary can trivially win the game by building \(z, Q\) and evaluating \(\textsf{pred}_{\textsf{trivial}}^{}\) to distinguish between \(b=0\) or \(b=1\). (This attack assumes the adversary can always compute \(\textsf{pred}_{\textsf{trivial}}^{}(z, C)\) for \(\textsf{SS}\), in spite of not knowing the challenge bit b that is needed to explicitly build \(C\). This is true in all of our proofs.)

3.3 Symmetric Signcryption from Encryption and Signatures

In the full version, we introduce a provably secure version of Sign-then-Encrypt (\(\textsf{StE}\)). Its signcryption algorithm first computes a signature s and then outputs the ciphertext \(c \leftarrow \textsf{NE}.\textsf{Enc}(K_{g}, n, s \,\Vert \,m, \langle {u, ad}\rangle )\). We prove bounds of the form \(\textsf{Adv}^{\textsf{IUF}}_{\textsf{StE},\textsf{pred}_{\textsf{trivial}}^{\textsf{suf}}}(\mathcal {A}_{\textsf{IUF}}) \le \textsf{Adv}^{\textsf{SUFCMA}}_{\textsf{DS}}(\mathcal {A}_{\textsf{SUFCMA}})\) and \(\textsf{Adv}^{\textsf{OAE}}_{\textsf{StE},\textsf{pred}_{\textsf{trivial}}^{\textsf{suf}},\textsf{func}_{\textsf{out}}^{\bot }}(\mathcal {A}_{\textsf{OAE}}) \le \textsf{Adv}^{\textsf{AEAD}}_{\textsf{NE}}(\mathcal {A}_{\textsf{AEAD}})\).

Fig. 9. The \(\textsf{BoxMessage} \) and \(\textsf{SealPacket} \) algorithms used in Keybase for encrypting chat messages from a user to a group. Here g is the group’s identifier, \(K_{g}\) is the symmetric key shared by all group members, u is the sender’s identifier, and \(sk_{u}\) is the sender’s signing key.

4 Keybase Chat Encryption as Symmetric Signcryption

We analyze the security of the cryptographic algorithm \(\textsf{BoxMessage} \) that Keybase uses to encrypt and authenticate chat messages from a sender to a group. \(\textsf{BoxMessage} \) combines multiple cryptographic primitives to offer end-to-end encrypted messaging. In particular it uses \(\mathsf {XSalsa20\text {-}{}Poly1305} \), \(\mathsf {SHA\text {-}{}256} \), \(\mathsf {SHA\text {-}{}512} \), and \(\textsf{Ed25519} \) as building blocks. Within \(\textsf{BoxMessage} \), the \(\textsf{SealPacket} \) subroutine encrypts and authenticates message headers. We show the pseudocode for these algorithms in Fig. 9. We omit the decryption algorithms \(\textsf{BoxMessage}.\textsf{VerDec}\) and \(\textsf{SealPacket}.\textsf{VerDec}\) from Fig. 9 as Keybase’s implementation of these algorithms follows naturally from the corresponding \(\textsf{SigEnc}\) algorithms. We define the \(\textsf{VerDec}\) algorithms explicitly in our formalization of \(\textsf{BoxMessage} \) and \(\textsf{SealPacket} \).

To formalize the security of \(\textsf{BoxMessage} \), it is crucial to first identify the formal primitive underlying this algorithm and the security goals it aims to achieve. None of the existing primitives in the literature seem to aptly model this object, but it is naturally captured by the symmetric signcryption primitive that we defined in Sect. 3. Similarly, \(\textsf{SealPacket} \) can also be modeled as a symmetric signcryption scheme from which \(\textsf{BoxMessage} \) is built. In this section, we present modular constructions that cast \(\textsf{BoxMessage} \) and \(\textsf{SealPacket} \) as symmetric signcryption schemes. We first provide a general overview of the two algorithms.

The BoxMessage Chat-Encryption Algorithm. The \(\textsf{BoxMessage}.\textsf{SigEnc}\) algorithm accepts the following inputs: the group’s identifier g, the symmetric group key \(K_{g}\), the sender’s identifier u, the sender’s signing key \(sk_{u}\), a nonce \(n=(n_{\textsf{body}},n_{\textsf{header}})\), a message \(m_{\textsf{body}}\), and associated data \(ad\). It performs the following steps. First it calls \(\mathsf {XSalsa20\text {-}{}Poly1305}.\textsf{Enc}\) to encrypt \(m_{\textsf{body}}\) using key \(K_{g}\) and nonce \(n_{\textsf{body}}\), and obtains the ciphertext \(c_{\textsf{body}}\). It builds header plaintext \(m_{\textsf{header}}\) as \(\langle {ad,u,g,h_{\textsf{body}}}\rangle \) (a unique encoding of \(ad\), u, g, and hash \(h_{\textsf{body}}=\mathsf {SHA\text {-}{}256} (n_{\textsf{body}}\,\Vert \,c_{\textsf{body}})\)). It then invokes \(\textsf{SealPacket}.\textsf{SigEnc}\) to encrypt \(m_{\textsf{header}}\) using \(sk_{u}\), \(K_{g}\), and \(n_{\textsf{header}}\), and obtains the ciphertext \(c_{\textsf{header}}\). Finally, it returns \((c_{\textsf{body}},c_{\textsf{header}})\). To decrypt ciphertext \((c_{\textsf{body}},c_{\textsf{header}})\), the algorithm \(\textsf{BoxMessage}.\textsf{VerDec}\) (not shown) ensures that \(c_{\textsf{header}}\) decrypts into the header plaintext \(m_{\textsf{header}}\) that is equal to the unique string \(\langle {ad,u,g,h_{\textsf{body}}}\rangle \) composed from the inputs of \(\textsf{BoxMessage}.\textsf{VerDec}\). In Keybase, the sender identifier u is the sender’s username and the group identifier g is constructed canonically from the usernames of the group members.
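The data flow of \(\textsf{BoxMessage}.\textsf{SigEnc}\) can be sketched in Python as below. This is a structural illustration only, not Keybase's implementation: the body cipher and \(\textsf{SealPacket}.\textsf{SigEnc}\) are passed in as the hypothetical callables `ne_enc` and `sp_sigenc`, and the length-prefixed `encode` helper is merely one possible unique encoding, assumed here for concreteness.

```python
import hashlib
import struct
from typing import Callable, Tuple


def encode(*fields: bytes) -> bytes:
    """One possible unique encoding <x1, ..., xk>: 8-byte length prefix per field."""
    return b"".join(struct.pack(">Q", len(f)) + f for f in fields)


# Hypothetical interfaces for the two building blocks:
#   ne_enc(key, nonce, plaintext) -> ciphertext
#   sp_sigenc(g, K_g, u, sk_u, nonce, message) -> ciphertext
NEEnc = Callable[[bytes, bytes, bytes], bytes]
SPSigEnc = Callable[[bytes, bytes, bytes, bytes, bytes, bytes], bytes]


def box_message_sigenc(ne_enc: NEEnc, sp_sigenc: SPSigEnc,
                       g: bytes, K_g: bytes, u: bytes, sk_u: bytes,
                       n: Tuple[bytes, bytes], m_body: bytes,
                       ad: bytes) -> Tuple[bytes, bytes]:
    n_body, n_header = n
    # Step 1: encrypt the message body under the group key.
    c_body = ne_enc(K_g, n_body, m_body)
    # Step 2: hash the body nonce and ciphertext with SHA-256.
    h_body = hashlib.sha256(n_body + c_body).digest()
    # Step 3: the header plaintext binds ad, u, g, and the body hash.
    m_header = encode(ad, u, g, h_body)
    # Step 4: signcrypt the header under the same group key K_g.
    c_header = sp_sigenc(g, K_g, u, sk_u, n_header, m_header)
    return c_body, c_header
```

Because the header commits to \(h_{\textsf{body}}\), changing either \(n_{\textsf{body}}\) or \(c_{\textsf{body}}\) changes the header plaintext that a verifier expects.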

The SealPacket Header-Encryption Algorithm. The \(\textsf{SealPacket} \) algorithm accepts the same inputs as \(\textsf{BoxMessage} \), except it does not take associated data \(ad\) as input. We capture this by setting \(\textsf{SealPacket}.\textsf{AD}= \{\varepsilon \}\), meaning \(ad=\varepsilon \) is always true. When \(\textsf{SealPacket}.\textsf{SigEnc}\) is called from \(\textsf{BoxMessage}.\textsf{SigEnc}\), it encrypts chat headers. To encrypt m with nonce n, it starts by hashing m to obtain \(h = \mathsf {SHA\text {-}{}512} (m)\). Then it builds an input \(m_{s}\) to the \(\textsf{Ed25519} \) signature scheme by concatenating the prefix string \(\text { ``Keybase-Chat-2''}\) with the unique encoding \(\langle {K_{g},n,h}\rangle \) of \(K_{g}\), n, and h. It invokes \(\textsf{Ed25519}.\textsf{Sig}\) to produce a signature s over \(m_{s}\) using the signing key \(sk_{u}\). Finally it calls \(\mathsf {XSalsa20\text {-}{}Poly1305}.\textsf{Enc}\) to encrypt \(m_{e} = s\,\Vert \,m\) using the key \(K_{g}\) and nonce n, and obtains the ciphertext c, which is returned. To decrypt ciphertext c, the \(\textsf{SealPacket}.\textsf{VerDec}\) algorithm (not shown) first recovers \(m_{e}\) from c and then parses \(m_{e}\) to obtain \(s\,\Vert \,m\). Note that \(m_{e}\) can be unambiguously parsed into \(s\,\Vert \,m\) because \(\textsf{Ed25519} \) produces fixed-length signatures. Then \(\textsf{SealPacket}.\textsf{VerDec}\) reconstructs \(m_{s}\) and ensures that s verifies as a valid signature for \(m_{s}\) under the sender’s public key \(vk_{u}\). We study the security of \(\textsf{SealPacket} \) in the context of the \(\textsf{BoxMessage} \) algorithm, but this is not the only context in which Keybase uses \(\textsf{SealPacket} \). It is also used independently for the encryption of long strings and attachments. In the full version we detail other uses of \(\textsf{SealPacket} \) in Keybase.
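A minimal Python sketch of the \(\textsf{SealPacket} \) steps follows. It is not Keybase's code: the signer and the cipher are injected as the hypothetical callables `sign` and `ne_enc`, and the length-prefixed `encode` helper is an assumed stand-in for the unique encoding \(\langle {K_{g},n,h}\rangle \). The sketch omits g and u from the signature, since the algorithm never uses them.

```python
import hashlib
import struct
from typing import Callable, Tuple

PREFIX = b"Keybase-Chat-2"
SIG_LEN = 64  # Ed25519 signatures are always 64 bytes long


def encode(*fields: bytes) -> bytes:
    """One possible unique encoding: 8-byte length prefix per field (assumed)."""
    return b"".join(struct.pack(">Q", len(f)) + f for f in fields)


def seal_packet_sign_input(K_g: bytes, n: bytes, m: bytes) -> bytes:
    # m_s = prefix || <K_g, n, SHA-512(m)>; note that it depends on the group key.
    h = hashlib.sha512(m).digest()
    return PREFIX + encode(K_g, n, h)


def seal_packet_sigenc(sign: Callable[[bytes, bytes], bytes],
                       ne_enc: Callable[[bytes, bytes, bytes], bytes],
                       K_g: bytes, sk_u: bytes, n: bytes, m: bytes) -> bytes:
    s = sign(sk_u, seal_packet_sign_input(K_g, n, m))
    if len(s) != SIG_LEN:
        raise ValueError("signature must have fixed length")
    # Encrypt m_e = s || m under the same key that was signed over.
    return ne_enc(K_g, n, s + m)


def split_signed_plaintext(m_e: bytes) -> Tuple[bytes, bytes]:
    # Fixed-length signatures make the split of m_e into (s, m) unambiguous.
    if len(m_e) < SIG_LEN:
        raise ValueError("plaintext too short to contain a signature")
    return m_e[:SIG_LEN], m_e[SIG_LEN:]
```

A verifier would decrypt c, call `split_signed_plaintext`, recompute `seal_packet_sign_input` from the recovered m, and check s against it under \(vk_{u}\).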

Analysis Challenges. The descriptions of \(\textsf{BoxMessage} \) and \(\textsf{SealPacket} \) that we have given so far already present the following challenges in their analysis.

Key Reuse in \(\textsf{BoxMessage}.\) The same symmetric key \(K_{g}\) is used in \(\textsf{BoxMessage} \) and \(\textsf{SealPacket} \). This violates the principle of key separation, which says that one should always use distinct keys for distinct algorithms and modes of operation. Without context separation, this potentially allows an attacker to forward ciphertexts produced by one algorithm to another. There is no explicit context separation, so our analysis will “extract” separation by making assumptions about \(\textsf{Ed25519} \) and using low-level details of how messages are encoded.

Cyclic Key Dependency in \(\textsf{SealPacket}.\) The message \(m_{s}\) signed in \(\textsf{SealPacket} \) is derived from the symmetric group key \(K_{g}\) which is also used to encrypt the signature. This produces what is known as an “encryption cycle”, a generalization of encrypting one’s own key [19]. Standard \(\textsf{AEAD}\) security does not guarantee security when messages being encrypted depend on the key used for encryption. We use an extension of \(\textsf{AEAD}\) security allowing key-dependent messages and prove (in the random oracle model) that \(\mathsf {XSalsa20\text {-}{}Poly1305} \) achieves it for the particular key-dependent messages required.

Lack of Group/User Binding in \(\textsf{SealPacket}.\) By looking at the \(\textsf{SealPacket} \) algorithm in Fig. 9 we can see that the inputs u and g are never used by the algorithm. This means that a \(\textsf{SealPacket} \) ciphertext does not, in general, bind to the group’s or user’s identifiers. This could potentially allow a malicious user to impersonate another group member. When \(\textsf{SealPacket} \) is used within \(\textsf{BoxMessage} \), it is always invoked on a message that contains the group’s and the user’s identifier, so the lack of group/user binding in \(\textsf{SealPacket} \) is not consequential there.

Nonce Repetition in Keybase. \(\mathsf {XSalsa20\text {-}{}Poly1305} \) is not secure when nonces repeat, so our security analysis disallows nonce repetition between \(\textsf{BoxMessage} \) and/or \(\textsf{SealPacket} \). The Keybase implementation uses uniformly random nonces, making collisions highly unlikely. Moreover, our results show that \(\textsf{BoxMessage} \) is robust to accidental non-uniformity in randomness as long as nonces do not repeat. The \(\mathsf {XSalsa20\text {-}{}Poly1305} \) authenticated encryption scheme combines the \(\textsf{XSalsa20} \) stream cipher and the \(\textsf{Poly1305} \) one-time message authentication code. The stream is derived from the key and nonce and is used for keying \(\textsf{Poly1305} \), so if nonces repeat then privacy and integrity may both be broken.
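The privacy half of this failure can be illustrated with a toy stream cipher in Python. The keystream below is derived with SHA-256 purely for illustration and is not \(\textsf{XSalsa20} \). The function names are hypothetical. The point carries over to any cipher that XORs a keystream determined by the key and nonce: reusing the pair exposes the XOR of the two plaintexts.

```python
import hashlib


def toy_keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream (NOT XSalsa20): SHA-256 over key, nonce, and a block counter."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def toy_stream_encrypt(key: bytes, nonce: bytes, m: bytes) -> bytes:
    # Encryption and decryption are the same operation for an XOR stream cipher.
    return xor(m, toy_keystream(key, nonce, len(m)))


key, nonce = b"k" * 32, b"n" * 24
m1, m2 = b"attack at dawn!!", b"retreat at noon!"
c1 = toy_stream_encrypt(key, nonce, m1)
c2 = toy_stream_encrypt(key, nonce, m2)  # same (key, nonce): misuse
# The keystream cancels out, so an eavesdropper learns m1 XOR m2.
leak = xor(c1, c2)
```

Here `leak` equals `xor(m1, m2)` without any knowledge of the key. The integrity failure described above (reuse of the one-time \(\textsf{Poly1305} \) key) is not modeled by this toy.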

Message Encryption Scheme BM. Our modular symmetric signcryption construction \(\textsf{BM}\) models the \(\textsf{BoxMessage} \) chat-encryption algorithm as follows.

Fig. 10. Symmetric signcryption scheme \(\textsf{BM}= \mathsf {BOX\text {-}{}MESSAGE\text {-}{}\textsf{SS}}[{\mathcal {M}, \textsf{NE}, \textsf{H}, \textsf{SP}}]\). The right-aligned comments provide a guideline for modeling Keybase.

Construction 1

Let \(\mathcal {M}\subseteq \{0,1\}^*\). Let \(\textsf{NE}\) be a nonce-based encryption scheme. Let \(\textsf{H}\) be a hash function. Let \(\textsf{SP}\) be a deterministic symmetric signcryption scheme. Then \(\textsf{BM}= \mathsf {BOX\text {-}{}MESSAGE\text {-}{}\textsf{SS}}[{\mathcal {M}, \textsf{NE}, \textsf{H}, \textsf{SP}}]\) is the deterministic symmetric signcryption scheme as defined in Fig. 10, with message space \(\textsf{BM}.\textsf{MS}= \mathcal {M}\) and associated-data space \(\textsf{BM}.\textsf{AD}= \{0,1\}^*\). We require the following. The group key taken by \(\textsf{BM}\) is used as the key for both \(\textsf{NE}\) and \(\textsf{SP}\), so \(\textsf{BM}.\textsf{gkl}= \textsf{NE}.\textsf{kl}= \textsf{SP}.\textsf{gkl}\). The nonce taken by \(\textsf{BM}\) is a pair containing a separate nonce for each of \(\textsf{NE}\) and \(\textsf{SP}\), so \(\textsf{BM}.\textsf{NS}= \{0,1\}^{\textsf{NE}.\textsf{nl}} \times \textsf{SP}.\textsf{NS}\).

Header Encryption Scheme SP. Our modular symmetric signcryption construction \(\textsf{SP}\) models the header-encryption algorithm \(\textsf{SealPacket} \) as follows.

Fig. 11. Symmetric signcryption scheme \(\textsf{SP}= \mathsf {SEAL\text {-}{}PACKET\text {-}{}\textsf{SS}}[\textsf{H}, \textsf{DS}, \textsf{NE}]\). The right-aligned comments provide a guideline for modeling Keybase.

Construction 2

Let \(\textsf{H}\) be a hash function. Let \(\textsf{DS}\) be a deterministic digital signature scheme. Let \(\textsf{NE}\) be a nonce-based encryption scheme. Then \(\textsf{SP}= \mathsf {SEAL\text {-}{}PACKET\text {-}{}\textsf{SS}}[{\textsf{H}, \textsf{DS}, \textsf{NE}}]\) is the symmetric signcryption scheme as defined in Fig. 11, with group-key length \(\textsf{SP}.\textsf{gkl}= \textsf{NE}.\textsf{kl}\), nonce space \(\textsf{SP}.\textsf{NS}= \{0,1\}^{\textsf{NE}.\textsf{nl}}\), message space \(\textsf{SP}.\textsf{MS}= \{0,1\}^*\), and associated-data space \(\textsf{SP}.\textsf{AD}= \{\varepsilon \}\).

5 Security Analysis of Keybase Chat Encryption

In this section we analyze the security of the symmetric signcryption schemes \(\textsf{BM}\) and \(\textsf{SP}\) defined in Sect. 4. In Sect. 5.1, we show the in-group unforgeability of \(\textsf{BM}\) and \(\textsf{SP}\). In Sects. 5.2 and 5.3, we show the out-group AE security of \(\textsf{BM}\) and \(\textsf{SP}\). This requires us to introduce two weaker variants of the \(\textsf{OAE}\) security notion, one each for \(\textsf{BM}\) and \(\textsf{SP}\), by relaxing the level of nonce-misuse requirements of the \(\textsf{OAE}\) game defined in Fig. 6. The \(\textsf{SP}\) analysis requires two new security notions, \(\mathcal {M}\)-sparsity for digital signature schemes and authenticated encryption for key-dependent messages for nonce-based encryption schemes.

5.1 In-Group Unforgeability of \(\textsf{BoxMessage}\) and \(\textsf{SealPacket}\)

In-Group Unforgeability of BoxMessage. In-group unforgeability of \(\textsf{BM}= \mathsf {BOX\text {-}{}MESSAGE\text {-}{}\textsf{SS}}[{\mathcal {M}, \textsf{NE}, \textsf{H}, \textsf{SP}}]\) reduces to the security of \(\textsf{SP}\) and \(\textsf{H}\). A \(\textsf{BM}\) ciphertext is a pair \((c_{\textsf{body}},c_{\textsf{header}})\) comprising an \(\textsf{NE}\) ciphertext \(c_{\textsf{body}}\) and an \(\textsf{SP}\) ciphertext \(c_{\textsf{header}}\), which encrypts \(\langle {ad,u,g,h_{\textsf{body}}}\rangle \). The adversary’s objective is to forge a \(\textsf{BM}\) ciphertext by either forging \(c_{\textsf{body}}\) or \(c_{\textsf{header}}\). The adversary can use a corrupt group key \(K_g\), so \(c_{\textsf{body}}\) ciphertexts are easily forged. However, this does not suffice to produce a \(\textsf{BM}\) forgery because \(c_{\textsf{header}}\) encrypts the hash of \(c_{\textsf{body}}\). Therefore, it would need to forge a corresponding \(c_{\textsf{header}}\) ciphertext. The \(\textsf{IUF}\) security of \(\textsf{SP}\) prevents the adversary from forging \(c_{\textsf{header}}\) ciphertexts. As a result, the adversary can only reuse honestly generated \(c_{\textsf{header}}\) from its prior queries to \(\textsc {SigEnc}\) in its forgery attempts. Since an honest \(c_{\textsf{header}}\) effectively commits to \(ad\), u, g, \(h_{\textsf{body}}\), and \(n_{\textsf{header}}\), using an old \(c_{\textsf{header}}\) to construct a new \(\textsf{BM}\) ciphertext requires finding a new \(\textsf{NE}\) nonce-ciphertext pair that hashes to the same \(h_{\textsf{body}}\) under \(\textsf{H}\). Collision resistance of \(\textsf{H}\) prevents this. The formal proof of Theorem 1 is in the full version.

Theorem 1

Let \(\textsf{BM}= \mathsf {BOX\text {-}{}MESSAGE\text {-}{}\textsf{SS}}[{\mathcal {M}, \textsf{NE}, \textsf{H}, \textsf{SP}}]\) be the symmetric signcryption scheme built from some \({\mathcal {M}, \textsf{NE}, \textsf{H}, \textsf{SP}}\) as specified in Construction 1. Let \(\textsf{pred}_{\textsf{trivial}}^{\textsf{suf}}\) and \(\textsf{pred}_{\textsf{trivial}}^{\mathsf {suf\text {-}{}except\text {-}{}group}}\) be the ciphertext-triviality predicates as defined in Fig. 7. Let \(\mathcal {A}_{{\textsf{IUF}\text {-}{}\textsf{of}\text {-}{}\textsf{BM}}}\) be any adversary against the \(\textsf{IUF}\) security of \(\textsf{BM}\) with respect to \(\textsf{pred}_{\textsf{trivial}}^{\textsf{suf}}\). Then we can build adversaries \(\mathcal {A}_{{\textsf{IUF}\text {-}{}\textsf{of}\text {-}{}\textsf{SP}}}\) and \(\mathcal {A}_{\textsf{CR}}\) such that

$$\begin{aligned} \textsf{Adv}_{\textsf{BM},\textsf{pred}_{\textsf{trivial}}^{\textsf{suf}}}^{\textsf{IUF}}(\mathcal {A}_{{\textsf{IUF}\text {-}{}\textsf{of}\text {-}{}\textsf{BM}}}) \le \textsf{Adv}_{\textsf{SP},\textsf{pred}_{\textsf{trivial}}^{\mathsf {suf\text {-}{}except\text {-}{}group}}}^{\textsf{IUF}}(\mathcal {A}_{{\textsf{IUF}\text {-}{}\textsf{of}\text {-}{}\textsf{SP}}}) + \textsf{Adv}^{\textsf{CR}}_{\textsf{H}}(\mathcal {A}_{\textsf{CR}}). \end{aligned}$$

In-Group Unforgeability of SealPacket. In-group unforgeability of the symmetric signcryption scheme \(\textsf{SP}= \mathsf {SEAL\text {-}{}PACKET\text {-}{}\textsf{SS}}[{\textsf{H}, \textsf{DS}, \textsf{NE}}]\) reduces to the security of \(\textsf{DS}\) and \(\textsf{H}\). We parameterize the \(\textsf{IUF}\) security of \(\textsf{SP}\) to aim for a relaxed version of strong unforgeability because \(\textsf{SP}\) ciphertexts do not directly depend on the group’s identifier g (even though they depend on the group key \(K_g\)).

An \(\textsf{SP}\) ciphertext encrypts \(s\,\Vert \,m\) under \(K_{g}\). The adversary can use a corrupt \(K_{g}\), but forging an \(\textsf{SP}\) ciphertext still requires the signature s. So the adversary must either forge a new signature or reuse an honest signature from a prior \(\textsc {SigEnc}\) query. The \(\textsf{SUFCMA}\) security of \(\textsf{DS}\) prevents the former. An honest signature s is computed over \(\text { ``Keybase-Chat-2''}\,\Vert \,\langle {K_g, n, h}\rangle \) where h is the hash of the message m. Hence one way to reuse an honest signature would be to produce a new \(\textsf{SP}\) ciphertext that encrypts the same \(s \,\Vert \,m\) under the same \(K_g, n\), but the tidiness of \(\textsf{NE}\) prevents this. So any other reuse of an honest signature requires finding a new message that hashes to the same h under \(\textsf{H}\). Collision resistance of \(\textsf{H}\) prevents this. The formal proof of Theorem 2 is in the full version.

Theorem 2

Let \(\textsf{SP}= \mathsf {SEAL\text {-}{}PACKET\text {-}{}\textsf{SS}}[{\textsf{H}, \textsf{DS}, \textsf{NE}}]\) be the symmetric signcryption scheme built from some \(\textsf{H}\), \(\textsf{DS}\), and \(\textsf{NE}\) as specified in Construction 2. Let \(\textsf{pred}_{\textsf{trivial}}^{\mathsf {suf\text {-}{}except\text {-}{}group}}\) be the ciphertext-triviality predicate as defined in Fig. 7. Let \(\mathcal {A}_{{\textsf{IUF}\text {-}{}\textsf{of}\text {-}{}\textsf{SP}}}\) be any adversary against the \(\textsf{IUF}\) security of \(\textsf{SP}\) with respect to \(\textsf{pred}_{\textsf{trivial}}^{\mathsf {suf\text {-}{}except\text {-}{}group}}\). Then we can build adversaries \(\mathcal {A}_{\textsf{SUFCMA}}\) and \(\mathcal {A}_{\textsf{CR}}\) such that

$$\begin{aligned} \textsf{Adv}_{\textsf{SP},\textsf{pred}_{\textsf{trivial}}^{\mathsf {suf\text {-}{}except\text {-}{}group}}}^{\textsf{IUF}}(\mathcal {A}_{{\textsf{IUF}\text {-}{}\textsf{of}\text {-}{}\textsf{SP}}}) \le \textsf{Adv}^{\textsf{SUFCMA}}_{\textsf{DS}}(\mathcal {A}_{\textsf{SUFCMA}}) + \textsf{Adv}^{\textsf{CR}}_{\textsf{H}}(\mathcal {A}_{\textsf{CR}}). \end{aligned}$$
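Theorems 1 and 2 can be chained, since the adversary \(\mathcal {A}_{{\textsf{IUF}\text {-}{}\textsf{of}\text {-}{}\textsf{SP}}}\) built in Theorem 1 is itself an adversary of the kind covered by Theorem 2. As a sketch of the combined bound, assume \(\textsf{SP}\) in Theorem 1 is instantiated as in Construction 2, and write \(\textsf{H}_{\textsf{BM}}\), \(\textsf{H}_{\textsf{SP}}\) and \(\mathcal {A}_{\textsf{CR}}\), \(\mathcal {A}'_{\textsf{CR}}\) as ad hoc names (not notation from the theorems) to keep the two hash functions and their collision-finding adversaries apart:

$$\begin{aligned} \textsf{Adv}_{\textsf{BM},\textsf{pred}_{\textsf{trivial}}^{\textsf{suf}}}^{\textsf{IUF}}(\mathcal {A}_{{\textsf{IUF}\text {-}{}\textsf{of}\text {-}{}\textsf{BM}}})&\le \textsf{Adv}_{\textsf{SP},\textsf{pred}_{\textsf{trivial}}^{\mathsf {suf\text {-}{}except\text {-}{}group}}}^{\textsf{IUF}}(\mathcal {A}_{{\textsf{IUF}\text {-}{}\textsf{of}\text {-}{}\textsf{SP}}}) + \textsf{Adv}^{\textsf{CR}}_{\textsf{H}_{\textsf{BM}}}(\mathcal {A}_{\textsf{CR}}) \\&\le \textsf{Adv}^{\textsf{SUFCMA}}_{\textsf{DS}}(\mathcal {A}_{\textsf{SUFCMA}}) + \textsf{Adv}^{\textsf{CR}}_{\textsf{H}_{\textsf{SP}}}(\mathcal {A}'_{\textsf{CR}}) + \textsf{Adv}^{\textsf{CR}}_{\textsf{H}_{\textsf{BM}}}(\mathcal {A}_{\textsf{CR}}). \end{aligned}$$

In Keybase, \(\textsf{H}_{\textsf{BM}}\) is \(\mathsf {SHA\text {-}{}256} \) and \(\textsf{H}_{\textsf{SP}}\) is \(\mathsf {SHA\text {-}{}512} \), so the two collision-resistance terms refer to different hash functions.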

5.2 Out-Group AE Security of \(\textsf{BoxMessage} \)

Out-group AE security of \(\textsf{BM}= \mathsf {BOX\text {-}{}MESSAGE\text {-}{}\textsf{SS}}[{\mathcal {M}, \textsf{NE}, \textsf{H}, \textsf{SP}}]\) reduces to the security of its underlying primitives as summarized by the rightmost arrows of Fig. 12. At a high level, we show that \(\textsf{BM}\) achieves a variant of \(\textsf{OAE}\) security (\(\textsf{bwOAE}\)) if \(\textsf{SP}\) achieves another variant of \(\textsf{OAE}\) security (\(\textsf{wOAE}\)) and \(\textsf{H}\) is collision-resistant. Because \(\textsf{NE}=\mathsf {XSalsa20\text {-}{}Poly1305} \) in Keybase (which is not nonce-misuse resistant), both variants disallow nonce repetition.

Fig. 12. A summary of the reductions that we provide for the \(\textsf{wOAE}\) security of \(\textsf{SP}\) and the \(\textsf{bwOAE}\) security of \(\textsf{BM}\).

Intuition. A \(\textsf{BM}\) ciphertext is a pair \((c_{\textsf{body}},c_{\textsf{header}})\) consisting of an \(\textsf{NE}\) ciphertext \(c_{\textsf{body}}\) and an \(\textsf{SP}\) ciphertext \(c_{\textsf{header}}\). One way the adversary could learn the challenge bit is by querying its \(\textsc {VerDec}\) oracle on a forged \(\textsf{BM}\) ciphertext that decrypts successfully. In order to accomplish that, the adversary must either forge the underlying \(\textsf{SP}\) ciphertext \(c_{\textsf{header}}\) or reuse an honestly generated \(c_{\textsf{header}}\). The former is prevented by the out-group AE security of \(\textsf{SP}\). The latter is prevented by the collision resistance of \(\textsf{H}\) because of the following. An honestly generated \(c_{\textsf{header}}\) effectively commits to \(ad\), u, g, \(h_{\textsf{body}}\), and \(n_{\textsf{header}}\). In order to reuse \(c_{\textsf{header}}\), an adversary must find a new \(\textsf{NE}\) nonce-ciphertext pair that hashes to \(h_{\textsf{body}}\), hence producing a collision. It follows that the \(\textsc {VerDec}\) oracle is essentially useless to the adversary; it can only serve to decrypt non-challenge ciphertexts that were previously returned by \(\textsc {SigEnc}\). So it remains to show that the adversary cannot learn the challenge bit solely based on the \(\textsf{BM}\) ciphertexts that it receives from \(\textsc {SigEnc}\). For any ciphertext \((c_{\textsf{body}},c_{\textsf{header}})\) returned by \(\textsc {SigEnc}\), the \(\textsf{SP}\) ciphertext \(c_{\textsf{header}}\) encrypts a hash of \(c_{\textsf{body}}\) but otherwise does not depend on the challenge bit. So the adversary gains no advantage from observing \(c_{\textsf{header}}\). Finally, the \(\textsf{AEAD}\) security of \(\textsf{NE}\) guarantees that \(c_{\textsf{body}}\) does not reveal the challenge bit.

Because the header encryption scheme \(\textsf{SP}\) and the body encryption scheme \(\textsf{NE}\) use the same symmetric key \(K_{g}\), we require that the integrity of \(\textsf{SP}\) ciphertexts produced using \(K_{g}\) holds even when the adversary can obtain other \(\textsf{NE}\) encryptions under the same key. Similarly, the \(\textsf{NE}\) ciphertexts generated using the symmetric key \(K_{g}\) should be indistinguishable even when the adversary can obtain \(\textsf{SP}\) encryptions and decryptions under the same key. We introduce a variant of the \(\textsf{OAE}\) game in Definition 3 to capture these joint requirements.

Restrictions on Nonce Misuse in BM and SP. We now define new variants of out-group AE security for our analysis of Keybase. The \(\textsf{BM}\) and \(\textsf{SP}\) schemes in Keybase are not nonce-misuse resistant so we modify the \(\textsf{OAE}\) game to disallow nonce repetition. We start with \(\textsf{wOAE}\) security for \(\textsf{SP}\).

Definition 1

Let \(\textsf{SS}\) be a symmetric signcryption scheme. Consider the \(\textsf{OAE}\) security game for \(\textsf{SS}\) of Fig. 6 (w.r.t. any \(\textsf{pred}_{\textsf{trivial}}^{\textsf{sec}}\), \(\textsf{func}_{\textsf{out}}^{\textsf{sec}}\)). We define a new variant of this game as follows. The instruction preventing nonce misuse

figure p

In addition, the instructions updating the nonce set

figure q

We denote the resulting game (and security notion) by \(\textsf{wOAE}\). It is a weak variant of \(\textsf{OAE}\) that does not require nonce-misuse resistance. We define an adversary’s advantage in breaking the \(\textsf{wOAE}\) security of \(\textsf{SS}\) in the natural way.

Now we define \(\textsf{bwOAE}\) security for \(\textsf{BM}\). The nonce of \(\textsf{BM}\) is a pair of two separate nonces \(n = (n_{\textsf{body}}, n_{\textsf{header}})\). The \(\textsf{bwOAE}\) security game independently applies the group-nonce uniqueness condition introduced in Definition 1 to each of \((g, n_{\textsf{body}})\) and \((g, n_{\textsf{header}})\), and it also requires that \(n_{\textsf{body}}\ne n_{\textsf{header}}\). This is necessary because \(\textsf{BM}\) calls \(\textsf{NE}.\textsf{Enc}\) on \((g, n_{\textsf{body}})\), and \(\textsf{SP}\) calls \(\textsf{NE}.\textsf{Enc}\) on \((g, n_{\textsf{header}})\). In Keybase both \(\textsf{NE}\) schemes are \(\mathsf {XSalsa20\text {-}{}Poly1305} \) using the same key.

Definition 2

Let \(\mathcal {X}, \mathcal {Y}\) be any sets. Let \(\textsf{SS}\) be a symmetric signcryption scheme with the nonce space \(\textsf{SS}.\textsf{NS}= \mathcal {X} \times \mathcal {Y}\). Consider the \(\textsf{OAE}\) security game for \(\textsf{SS}\) of Fig. 6 (w.r.t. any \(\textsf{pred}_{\textsf{trivial}}^{\textsf{sec}}\), \(\textsf{func}_{\textsf{out}}^{\textsf{sec}}\)). We define a new variant of this game as follows. The instruction preventing nonce misuse

figure r

In addition, the instructions updating the nonce set

figure s

We denote the resulting game (and security notion) by \(\textsf{bwOAE}\). Beyond being defined for \(\textsf{SS}\) with a bipartite nonce space, this variant of \(\textsf{OAE}\) is weak in that it does not require nonce-misuse resistance. We define an adversary’s advantage in breaking the \(\textsf{bwOAE}\) security of \(\textsf{SS}\) in the natural way.
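The nonce restriction of \(\textsf{bwOAE}\) can be sketched in Python as follows. This is only one reading of the prose description: it assumes that "independently" means two separate nonce sets, one per nonce component (a single shared set would be a stricter alternative). The class and method names are hypothetical.

```python
class BipartiteNonceTracker:
    """Bookkeeping for the bwOAE nonce restriction (illustrative sketch)."""

    def __init__(self) -> None:
        # Assumption: one set per nonce component, each holding (g, n) pairs.
        self.body_seen: set = set()
        self.header_seen: set = set()

    def admit(self, g: str, n_body: bytes, n_header: bytes) -> bool:
        """Return True and record the nonces if the query is allowed."""
        if n_body == n_header:
            return False  # the two nonce components must differ
        if (g, n_body) in self.body_seen or (g, n_header) in self.header_seen:
            return False  # a group-nonce pair may not repeat
        self.body_seen.add((g, n_body))
        self.header_seen.add((g, n_header))
        return True


tracker = BipartiteNonceTracker()
print(tracker.admit("group-1", b"a", b"b"))  # first use of both pairs
print(tracker.admit("group-1", b"a", b"c"))  # body nonce repeats within the group
print(tracker.admit("group-2", b"a", b"b"))  # same nonces, different group
```

Dropping the second component and the inequality check gives the corresponding single-nonce restriction for \(\textsf{wOAE}\).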

The Joint Security Required of SP and NE. Here we define the security notion required from \(\textsf{SP}\) when it is used in the presence of arbitrary \(\textsf{NE}\) encryptions under the same symmetric group keys that are used by \(\textsf{SP}\). We call this notion \(\textsf{wOAE}[\textsc {Enc}[\mathcal {M}, \textsf{NE}]]\). It is a parameterized version of the \(\textsf{wOAE}\) game defined in Definition 1. We use it for our analysis of the \(\textsf{bwOAE}\) security of \(\textsf{BM}\).

At the start of this section we discussed that the security reduction for \(\textsf{BM}\) intuitively requires that it is hard to forge an \(\textsf{SP}\) ciphertext (without knowing the corresponding group key \(K_g\)) in the presence of \(\textsf{NE}\) encryptions. Our definition of \(\textsf{wOAE}[\textsc {Enc}[\mathcal {M}, \textsf{NE}]]\) captures this by providing the adversary access to an \(\textsf{NE}\) encryption oracle \(\textsc {Enc}\) in addition to the \(\textsc {SigEnc}\) and \(\textsc {VerDec}\) oracles in the out-group AE security game of \(\textsf{SP}\). We stress that proving the security of \(\textsf{BM}\) does not, in principle, require us to provide the \(\textsc {SigEnc}\) oracle to the adversary. We choose to require this stronger level of security from \(\textsf{SP}\) for the following reasons. On the one hand, in Sect. 4 we explained why it is beneficial to prove that \(\textsf{SP}\) satisfies a strong security notion, going beyond what is required by \(\textsf{BM}\). On the other hand, this stronger security notion that we require from \(\textsf{SP}\) will not come at the cost of introducing additional assumptions or achieving looser concrete-security bounds in our analysis of \(\textsf{BM}\).

Definition 3

Let \(\textsf{SS}\) be a symmetric signcryption scheme. Let \(\mathcal {M}\subseteq \{0,1\}^*\). Let \(\textsf{NE}\) be a nonce-based encryption scheme. Consider the \(\textsf{wOAE}\) security game for \(\textsf{SS}\) as defined in Definition 1 (w.r.t. any \(\textsf{pred}_{\textsf{trivial}}^{\textsf{sec}}\), \(\textsf{func}_{\textsf{out}}^{\textsf{sec}}\)). We define a variant of this game by adding an oracle that is defined as follows.

figure t

It shares set \(N\), bit b, and the tables \(\textsf{K}\), \(\mathsf {group\_is\_corrupt}\), \(\textsf{chal}\) with the rest of the security game. We denote the resulting game (and security notion) by \(\textsf{wOAE}[\textsc {Enc}[\mathcal {M}, \textsf{NE}]]\). It simultaneously requires out-group AE security of \(\textsf{SS}\) (without nonce repetition) and an IND-style security of \(\textsf{NE}\). We define an adversary’s advantage in breaking this security notion in the natural way.

Note that we require the messages that the adversary queries to the \(\textsc {Enc}\) oracle to be in \(\mathcal {M}\). Intuitively, in our security analysis of \(\textsf{BM}\), an adversary will only be able to obtain \(\textsf{NE}\) encryptions of messages in the message space of \(\textsf{BM}\). So in the security reduction for \(\textsf{BM}\) we will use \(\mathcal {M}= \textsf{BM}.\textsf{MS}\).

Out-Group AE Security of BoxMessage. We prove \(\textsf{bwOAE}\) security of \(\textsf{BM}\). The formal proof of Theorem 3 is in the full version.

Theorem 3

Let \(\textsf{BM}= \mathsf {BOX\text {-}{}MESSAGE\text {-}{}\textsf{SS}}[{\mathcal {M}, \textsf{NE}, \textsf{H}, \textsf{SP}}]\) be the symmetric signcryption scheme built from some \({\mathcal {M}, \textsf{NE}, \textsf{H}, \textsf{SP}}\) as specified in Construction 1. Let \(\textsf{pred}_{\textsf{trivial}}^{\textsf{suf}}\) and \(\textsf{pred}_{\textsf{trivial}}^{\mathsf {suf\text {-}{}except\text {-}{}user}}\) be the ciphertext-triviality predicates as defined in Fig. 7. Let \(\textsf{func}_{\textsf{out}}^{\bot }\) be the output-guarding function as defined in Fig. 8. Let \(\textsf{wOAE}[\textsc {Enc}[\mathcal {M}, \textsf{NE}]]\) be the security notion as defined in Definition 3. Let \(\mathcal {A}_{{\textsf{bwOAE}\text {-}{}\textsf{of}\text {-}{}\textsf{BM}}}\) be any adversary against the \(\textsf{bwOAE}\) security of \(\textsf{BM}\) with respect to \(\textsf{pred}_{\textsf{trivial}}^{\textsf{suf}}\) and \(\textsf{func}_{\textsf{out}}^{\bot }\). Then we can build adversaries \(\mathcal {A}_{{\textsf{wOAE}\text {-}{}\textsf{of}\text {-}{}\textsf{SP}}}\) and \(\mathcal {A}_{\textsf{CR}}\) such that

$$\begin{aligned} \textsf{Adv}_{\textsf{BM},\textsf{pred}_{\textsf{trivial}}^{\textsf{suf}},\textsf{func}_{\textsf{out}}^{\bot }}^{\textsf{bwOAE}}(\mathcal {A}_{{\textsf{bwOAE}\text {-}{}\textsf{of}\text {-}{}\textsf{BM}}}) &\le \textsf{Adv}_{\textsf{SP},\textsf{pred}_{\textsf{trivial}}^{\mathsf {suf\text {-}{}except\text {-}{}user}},\textsf{func}_{\textsf{out}}^{\bot }}^{\textsf{wOAE}[\textsc {Enc}[\mathcal {M}, \textsf{NE}]]}(\mathcal {A}_{{\textsf{wOAE}\text {-}{}\textsf{of}\text {-}{}\textsf{SP}}}) \\ &+ \textsf{Adv}^{\textsf{CR}}_{\textsf{H}}(\mathcal {A}_{\textsf{CR}}). \end{aligned}$$

Note that we prove the security of \(\textsf{BM}\) with respect to \(\textsf{pred}_{\textsf{trivial}}^{\textsf{suf}}\) and \(\textsf{func}_{\textsf{out}}^{\bot }\). As discussed in Sect. 3, \(\textsf{pred}_{\textsf{trivial}}^{\textsf{suf}}\) essentially requires \(\textsf{BM}\) to have ciphertext integrity. Our result relies on the security of \(\textsf{SP}\) with respect to \(\textsf{pred}_{\textsf{trivial}}^{\mathsf {suf\text {-}{}except\text {-}{}user}}\) and \(\textsf{func}_{\textsf{out}}^{\bot }\). Recall that \(\textsf{pred}_{\textsf{trivial}}^{\mathsf {suf\text {-}{}except\text {-}{}user}}\) basically requires \(\textsf{SP}\) to have ciphertext integrity, except it allows for an honest ciphertext to be successfully decrypted even with respect to a wrong user identifier; the latter is not considered a “valid” forgery. This does not translate to an attack against \(\textsf{BM}\) because it only uses \(\textsf{SP}\) to encrypt header messages \(m_{\textsf{header}}= \langle {ad, u, g, h_{\textsf{body}}}\rangle \) that contain u, and the \(\textsf{BM}.\textsf{VerDec}\) algorithm verifies that the user identifier it received as input matches the one that was parsed from \(m_{\textsf{header}}\).

5.3 Out-Group AE Security of \(\textsf{SealPacket} \)

Out-group AE security of \(\textsf{SP}= \mathsf {SEAL\text {-}{}PACKET\text {-}{}\textsf{SS}}[{\textsf{H}, \textsf{DS}, \textsf{NE}}]\) reduces to the security of \(\textsf{NE}\) and \(\textsf{DS}\) (see Fig. 12). In particular, \(\textsf{wOAE}[\textsc {Enc}[\mathcal {M}, \textsf{NE}]]\) security holds if \(\textsf{NE}\) provides authenticated encryption for key-dependent messages and \(\textsf{DS}\) produces \(\mathcal {M}\)-sparse signatures. We introduce these security notions below.

Intuition. Recall that in the \(\textsf{wOAE}[\textsc {Enc}[\mathcal {M}, \textsf{NE}]]\) game, the adversary is provided with (un)signcryption oracles \(\textsc {SigEnc}\) and \(\textsc {VerDec}\) for \(\textsf{SP}\), and an encryption oracle \(\textsc {Enc}\) for \(\textsf{NE}\). Each of these returns output based on a challenge bit that is shared between them. The adversary can use three approaches to learn the challenge bit. It can (a) attempt \(\textsf{SP}\) forgeries by calling its \(\textsf{SP}\) decryption oracle \(\textsc {VerDec}\); (b) make left-or-right queries to its \(\textsf{NE}\) encryption oracle \(\textsc {Enc}\); (c) make left-or-right queries to its \(\textsf{SP}\) encryption oracle \(\textsc {SigEnc}\).

The adversary is allowed to expose users’ signing keys, so it could attempt to forge an \(\textsf{SP}\) ciphertext using an exposed \(\textsf{DS}\) signing key and its \(\textsc {Enc}\) oracle. The adversary would then query the resulting ciphertext to its \(\textsc {VerDec}\) oracle in an attempt to trivially win the game. We show that the adversary is unable to accomplish this. The \(\textsc {Enc}\) oracle is defined to only produce encryptions of the messages from the set \(\mathcal {M}\). In the implementation of Keybase, the messages from \(\mathcal {M}\) have a specific encoding; we will rely on this property in our proof. In contrast, any ciphertext successfully decrypted by \(\textsc {VerDec}\) must encrypt a message of the form \(m_{e} = s\,\Vert \,m\) where s is a valid \(\textsf{DS}\) signature. So the adversary needs to find a signature s that is consistent with the message encoding that is permitted by \(\textsc {Enc}\). The \(\mathcal {M}\)-sparseness of \(\textsf{DS}\) signatures, which we formalize below, prevents this. It follows that the \(\textsc {VerDec}\) oracle does not help the adversary to win the game by querying ciphertexts that were previously returned by \(\textsc {Enc}\).\(^{1}\)

Now we can reimagine the \(\textsc {Enc}\) and \(\textsc {SigEnc}\) oracles as producing \(\textsf{NE}\) encryptions of key-dependent messages. The \(\textsc {SigEnc}\) oracle requires messages to be derived as a specific function of the symmetric group key \(K_{g}\). The \(\textsc {Enc}\) oracle can be thought of as allowing messages that are derived from “constant” functions, meaning the chosen messages do not depend on \(K_g\). We can also view the \(\textsc {VerDec}\) oracle as an \(\textsf{NE}\) decryption oracle that prevents the adversary from trivially winning the game by merely querying the ciphertexts it previously obtained from either \(\textsc {Enc}\) or \(\textsc {SigEnc}\). We define the AE security of \(\textsf{NE}\) for key-dependent messages and show that the adversary can only win the \(\textsf{wOAE}[\textsc {Enc}[\mathcal {M}, \textsf{NE}]]\) game against \(\textsf{SP}\) if it can win the \(\textsf{KDMAE}\) game against \(\textsf{NE}\).

Reliance on the Message Encoding in Keybase. We mentioned in the intuition that we rely on the encoding of messages in \(\mathcal {M}\) in our proof. We emphasize that avoiding this dependency is non-trivial. The cyclic key dependency within \(\textsf{SP}\) and the key reuse between \(\textsf{BM}\) and \(\textsf{SP}\) pose significant challenges when considering the possibility of an alternate proof.

\(\boldsymbol{\mathcal {M}}\)-sparse Signatures. Consider the game of Fig. 13, defined for a digital signature scheme \(\textsf{DS}\), a set \(\mathcal {M}\subseteq \{0,1\}^*\), and an adversary \(\mathcal {A}_{\textsf{SPARSE}}\). The advantage of \(\mathcal {A}_{\textsf{SPARSE}}\) in breaking the \(\mathcal {M}\)-\(\textsf{SPARSE}\) security of \(\textsf{DS}\) is denoted by \(\textsf{Adv}^{\textsf{SPARSE}}_{\textsf{DS}, \mathcal {M}}(\mathcal {A}_{\textsf{SPARSE}})\). Intuitively, this game captures the inability of an adversary to produce a signature that conforms to the message space \(\mathcal {M}\) even though the adversary chooses the public key used to verify the signature. More formally, the adversary wins if it is able to return \((vk,m,s,\gamma )\) such that s verifies as a signature over the message m under the verification key vk and \(s\,\Vert \,\gamma \in \mathcal {M}\). We stress that the adversary is allowed to choose an arbitrary – possibly malformed – verification key. The adversary is not required to know the corresponding signing key, and such a key may in fact not exist.

We verify our intuition about the \(\mathcal {M}\)-sparsity of the \(\textsf{Ed25519} \) signature scheme underlying \(\textsf{SP}\) in the full version. \(\textsf{Ed25519} \) is a deterministic signature scheme introduced by Bernstein, Duif, Lange, Schwabe, and Yang in [17]. It is obtained by applying the commitment-variant of the Fiat-Shamir transform to an identification scheme. Therefore a signature produced by \(\textsf{Ed25519} \) consists of the commitment and response of the identification scheme. The adversary can only win the \(\textsf{SPARSE}\) game for \(\textsf{Ed25519} \) if it is able to produce an accepting conversation transcript for the identification scheme such that the corresponding commitment conforms to \(\mathcal {M}\). Commitments in the identification scheme underlying \(\textsf{Ed25519} \) are elements of a prime-order group. We prove that finding such a commitment is only possible if the adversary is able to find a group element and its discrete logarithm such that the group element is in \(\mathcal {M}\), which we show is hard in the generic group model.
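The winning condition described in the prose can be sketched in Python. The sketch follows only the textual description (the adversary outputs \((vk,m,s,\gamma )\) and wins if s verifies and \(s\,\Vert \,\gamma \in \mathcal {M}\)); the verification algorithm and the membership test are passed in as callables, and `toy_verify` is a hypothetical stand-in that is not a real signature scheme.

```python
import hashlib
from typing import Callable


def sparse_win(
    vk: bytes,
    m: bytes,
    s: bytes,
    gamma: bytes,
    verify: Callable[[bytes, bytes, bytes], bool],
    in_message_space: Callable[[bytes], bool],
) -> bool:
    """Adversary wins if s verifies for m under vk and s || gamma lies in M."""
    return verify(vk, m, s) and in_message_space(s + gamma)


def toy_verify(vk: bytes, m: bytes, s: bytes) -> bool:
    # Toy stand-in for DS.Ver (NOT a signature scheme): accepts s = SHA-256(vk || m).
    return s == hashlib.sha256(vk + m).digest()


vk, m = b"adversarially-chosen-key", b"message"
s = hashlib.sha256(vk + m).digest()

# A message space whose prefix happens to match s, and one whose prefix does not.
matching_prefix = s[:4]
other_prefix = bytes([s[0] ^ 1]) + s[1:4]
print(sparse_win(vk, m, s, b"", toy_verify, lambda x: x.startswith(matching_prefix)))
print(sparse_win(vk, m, s, b"", toy_verify, lambda x: x.startswith(other_prefix)))
```

The sketch makes explicit that the adversary controls vk, so sparsity has to hold for every verification key, including malformed ones.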

Fig. 13. Game defining \(\mathcal {M}\)-sparsity of a digital signature scheme \(\textsf{DS}\) for a set \(\mathcal {M}\).

The Message Space \(\boldsymbol{\mathcal {M}}\). Keybase uses the MessagePack serialization format [21] to encode plaintext messages. Plaintext messages are represented using a custom data structure in Keybase. So the serialized MessagePack encoding of a plaintext is a byte sequence that not only stores the plaintext itself but also some metadata about the data structure that represents it. For messages encrypted by \(\textsf{BM}\), the metadata about the data structure happens to be located in the first 17 bytes of the encoding. This means that the encoding of every plaintext encrypted by \(\textsf{BM}\) contains a fixed 17-byte prefix. Let this 17-byte prefix be \(\textsf{pre}\). Then we define the message space of \(\textsf{BM}\) by \(\textsf{BM}.\textsf{MS}= \{ \textsf{pre}\,\Vert \,\nu \;\bigm |\; \nu \in \{0,1\}^* \}\).
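Membership in \(\textsf{BM}.\textsf{MS}\) is a prefix check, as the following minimal Python sketch shows. The actual bytes of \(\textsf{pre}\) are not given here, so `PRE` is a placeholder of the right length. The second function is a back-of-the-envelope heuristic of ours, not the generic-group argument of the paper: it gives the base-2 logarithm of the chance that a uniformly random 17-byte string equals \(\textsf{pre}\).

```python
PRE_LEN = 17
# Placeholder only: the real 17-byte MessagePack prefix is not reproduced here.
PRE = bytes(PRE_LEN)


def in_bm_message_space(encoded: bytes, pre: bytes = PRE) -> bool:
    """Return True if the byte string has the form pre || nu."""
    return len(pre) == PRE_LEN and encoded.startswith(pre)


def log2_prefix_match_probability(pre_len: int = PRE_LEN) -> int:
    """log2 of the chance that pre_len uniformly random bytes equal a fixed prefix."""
    return -8 * pre_len


print(in_bm_message_space(PRE + b"serialized plaintext"))
print(in_bm_message_space(b"\x01" + PRE))
print(log2_prefix_match_probability())  # -136
```

Under this heuristic, a string \(s\,\Vert \,\gamma \) whose first 17 bytes were uniformly distributed would fall in \(\textsf{BM}.\textsf{MS}\) with probability \(2^{-136}\); real signatures are not uniform strings, which is why a dedicated sparsity notion is needed.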

Message-Deriving Functions. Let \(\phi \) be any function that takes a symmetric key K as input and uses it to derive and return some message m. We call \(\phi \) a message-deriving function and will consider some classes (i.e. sets) \(\varPhi \) of message-deriving functions. We require that the length of an output returned by \(\phi \) must not depend on its input; we denote the output length of \(\phi \) by \(\Vert \phi \Vert \).

Fig. 14. Game defining authenticated-encryption security of \(\textsf{NE}\) for \(\varPhi \)-key-dependent messages, where \(\varPhi \) is a class of message-deriving functions and \(\textsc {G}= \{{\textsc {NewHonGroup}, \textsc {ExposeGroup}, \textsc {NewCorrGroup}}\}\).

AE Security of NE for Key-Dependent Messages. Consider the game of Fig. 14, defined for a nonce-based encryption scheme \(\textsf{NE}\), a class of message-deriving functions \(\varPhi \), and an adversary \(\mathcal {A}_{\textsf{KDMAE}}\). The advantage of \(\mathcal {A}_{\textsf{KDMAE}}\) in breaking the \(\varPhi \)-\(\textsf{KDMAE}\) security of \(\textsf{NE}\) is denoted by \(\textsf{Adv}^{\textsf{KDMAE}}_{\textsf{NE},\varPhi }(\mathcal {A}_{\textsf{KDMAE}})\). This game can be thought of as a modification of the \(\textsf{AEAD}\) security game for \(\textsf{NE}\) (Fig. 4) which does not require nonce-misuse resistance. The core difference is that the \(\textsc {Enc}\) oracle now takes message-deriving functions \(\phi _{0}, \phi _{1} \in \varPhi \) as input. The challenge message is derived as \(m_b \leftarrow \phi _{b}(\textsf{K}[g])\) for \(b\in \{0,1\}\), where \(\textsf{K}[g]\) is the symmetric group key associated to g. Trivial attacks are prevented by requiring that \(\phi _{0}, \phi _{1}\) have the same output length and that \(\phi _{0} = \phi _{1}\) whenever \(\textsc {Enc}\) is called for a corrupt group. Our definition is based on prior work [6, 12, 13, 19]. There are strong impossibility results [12] regarding the existence of schemes that are secure with respect to very large classes of message-deriving functions \(\varPhi \). We sidestep these results by considering a very narrow and simple class \(\varPhi _{\textsf{SP}}\) that we define below.
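The \(\textsc {Enc}\) oracle described above can be sketched in Python. This follows the prose only, so several parts are assumptions: the decryption oracle, the \(\textsc {NewCorrGroup}\) oracle, and the bookkeeping for groups exposed after a challenge query are omitted, associated data is not modeled, and `toy_ne_enc` is a hypothetical stand-in for \(\textsf{NE}.\textsf{Enc}\) that is not \(\mathsf {XSalsa20\text {-}{}Poly1305} \). All class and function names are illustrative.

```python
import hashlib
import hmac
import secrets
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Set, Tuple


@dataclass(frozen=True)
class MsgDer:
    """A message-deriving function phi with its fixed output length ||phi||."""
    fn: Callable[[bytes], bytes]
    out_len: int


def toy_ne_enc(key: bytes, nonce: bytes, msg: bytes) -> bytes:
    # Toy stand-in for NE.Enc: SHA-256 counter-mode keystream plus an HMAC tag.
    stream, counter = b"", 0
    while len(stream) < len(msg):
        stream += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    body = bytes(a ^ b for a, b in zip(msg, stream))
    return body + hmac.new(key, nonce + body, hashlib.sha256).digest()


class KdmaeGame:
    def __init__(self) -> None:
        self.b = secrets.randbits(1)          # challenge bit
        self.keys: Dict[str, bytes] = {}      # table K
        self.corrupt: Set[str] = set()        # exposed groups
        self.nonces: Set[Tuple[str, bytes]] = set()

    def new_hon_group(self, g: str) -> None:
        if g not in self.keys:
            self.keys[g] = secrets.token_bytes(32)

    def expose_group(self, g: str) -> Optional[bytes]:
        if g not in self.keys:
            return None
        self.corrupt.add(g)
        return self.keys[g]

    def enc(self, g: str, n: bytes, phi0: MsgDer, phi1: MsgDer) -> Optional[bytes]:
        if g not in self.keys or (g, n) in self.nonces:
            return None  # unknown group or repeated group-nonce pair
        if phi0.out_len != phi1.out_len:
            return None  # output lengths must match
        if g in self.corrupt and phi0 != phi1:
            return None  # no left-or-right queries for corrupt groups
        self.nonces.add((g, n))
        phi = phi1 if self.b else phi0
        m = phi.fn(self.keys[g])              # m_b <- phi_b(K[g])
        assert len(m) == phi.out_len
        return toy_ne_enc(self.keys[g], n, m)


game = KdmaeGame()
game.new_hon_group("g")
constant = MsgDer(lambda key: b"x" * 32, 32)                  # ignores the key
hashed = MsgDer(lambda key: hashlib.sha256(key).digest(), 32)  # depends on the key
ciphertext = game.enc("g", b"nonce-1", constant, hashed)
print(len(ciphertext))
```

A constant function such as `constant` corresponds to an ordinary chosen message, while `hashed` is a genuinely key-dependent one.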

The Class of Message-Deriving Functions \(\boldsymbol{\varPhi }_{\textbf {SP}}\). Earlier we discussed that in the \(\textsf{wOAE}[\textsc {Enc}[\mathcal {M}, \textsf{NE}]]\) security game for \(\textsf{SP}\), the \(\textsc {SigEnc}\) and \(\textsc {Enc}\) oracles can be thought of as returning an \(\textsf{NE}\) ciphertext that encrypts an output of some message-deriving function. We now define the class \(\varPhi _{\textsf{SP}}\) containing all message-deriving functions that are used by either \(\textsc {SigEnc}\) or \(\textsc {Enc}\).

Construction 3

Let \(\textsf{NE}\) be a nonce-based encryption scheme. Let \(\textsf{H}\) be a hash function. Let \(\textsf{DS}\) be a digital signature scheme. Let \(\mathsf {SIGENC\text {-}{}DER}\) and \(\mathsf {ENC\text {-}{}DER}\) be the parameterized message-deriving functions that are defined as follows, each taking an \(\textsf{NE}\) key \(K\in \{0,1\}^{\textsf{NE}.\textsf{kl}}\) as input.

figure y

Then \(\varPhi _{\textsf{SP}}= \mathsf {MSG\text {-}{}DER\text {-}{}FUNC}[\textsf{NE}, \textsf{H}, \textsf{DS}]\) is the class of all message-deriving functions of these forms.

Note that \(\mathsf {SIGENC\text {-}{}DER}\) only uses K as a part of the message \(m_s\) signed by \(\textsf{DS}.\textsf{Sig}\). Keybase instantiates \(\textsf{DS}\) with \(\textsf{Ed25519} \) which computes two \(\mathsf {SHA\text {-}{}512} \) hashes of \(m_s\) (mixed with other inputs). The resulting signature does not depend on \(m_s\) in any other way. Using this observation and an indifferentiability result of Bellare, Davis, and Di [10] (for \(\mathsf {SHA\text {-}{}512} \) with output reduced modulo a prime) we capture \(\mathsf {SIGENC\text {-}{}DER}\) as a special class of message-deriving functions for which we can prove security in the random oracle model.
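The two \(\mathsf {SHA\text {-}{}512} \) computations can be sketched in Python, following the standard description of pure Ed25519. Curve arithmetic is omitted: the encoded commitment `r_enc` and the encoded public key `a_enc` are taken as inputs, and the function names are illustrative rather than taken from any library.

```python
import hashlib

# Prime order of the Ed25519 base-point subgroup.
L = 2**252 + 27742317777372353535851937790883648493


def nonce_scalar(prefix: bytes, msg: bytes) -> int:
    """First hash: r = SHA-512(prefix || msg) mod L, with prefix derived from the secret key."""
    return int.from_bytes(hashlib.sha512(prefix + msg).digest(), "little") % L


def challenge_scalar(r_enc: bytes, a_enc: bytes, msg: bytes) -> int:
    """Second hash: k = SHA-512(R || A || msg) mod L."""
    return int.from_bytes(hashlib.sha512(r_enc + a_enc + msg).digest(), "little") % L


# Dummy 32-byte values, used only to exercise the two hash computations.
prefix, r_enc, a_enc = b"\x01" * 32, b"\x02" * 32, b"\x03" * 32
msg = b"message to be signed"
r = nonce_scalar(prefix, msg)
k = challenge_scalar(r_enc, a_enc, msg)
print(r < L, k < L)
```

The remaining signing steps (computing \(R = r\cdot B\) and the response \(S = r + k\cdot a \bmod L\)) use only r, k, and the secret scalar, so the signed message influences the signature only through these two hashes.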

KDMAE Security for Messages Derived from a Hashed Key. Let \(\textsf{H}\) be a hash function. Let \(\varPhi \) be a class of message-deriving functions such that each \(\phi \in \varPhi \) on input K is only allowed to derive messages from the hash value \(\textsf{H}(K)\), and never directly from K. We will roughly show that every \(\textsf{AEAD}\)-secure nonce-based encryption scheme \(\textsf{NE}\) is also \(\varPhi \)-\(\textsf{KDMAE}\)-secure, provided that \(\textsf{H}\) is modeled as a random oracle. We formalize this class of functions as follows.

Definition 4

We say \(\varPhi \) derives messages from a hashed key if there exists a set \(\varGamma \) and a function \(\textsf{H}\) (modeled as a random oracle) such that \( \varPhi = \{ \phi _\gamma \;\bigm |\; \phi _{\gamma }(\cdot )=\gamma (\textsf{H}(\cdot )), \gamma \in \varGamma \}. \)
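A minimal Python sketch of this definition follows. SHA-256 is used purely as a stand-in for the random oracle \(\textsf{H}\), and the two example functions \(\gamma \) are hypothetical.

```python
import hashlib
from typing import Callable


def H(key: bytes) -> bytes:
    # Stand-in for the random oracle H (an assumption for illustration).
    return hashlib.sha256(key).digest()


def derive_from_hashed_key(gamma: Callable[[bytes], bytes]) -> Callable[[bytes], bytes]:
    """Build phi_gamma with phi_gamma(K) = gamma(H(K)); gamma never sees K itself."""
    def phi(key: bytes) -> bytes:
        return gamma(H(key))
    return phi


# Two members of such a class, both with 16-byte outputs.
phi_truncate = derive_from_hashed_key(lambda h: h[:16])
phi_constant = derive_from_hashed_key(lambda h: b"\x00" * 16)

key = b"k" * 32
print(phi_truncate(key).hex())
print(phi_constant(key).hex())
```

Constant functions, such as those corresponding to adversarially chosen messages, are a special case in which \(\gamma \) ignores its input.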

In the full version we show how to capture \(\varPhi _{\textsf{SP}}\) as satisfying this definition. Thereby, the following result will give us \(\varPhi _{\textsf{SP}}\)-\(\textsf{KDMAE}\) security.

Proposition 1

Let \(\textsf{NE}\) be a nonce-based encryption scheme. Let \(\varPhi \) be a class of message-deriving functions that derives messages from a hashed key. Let \(\mathcal {A}_{\textsf{KDMAE}}\) be an adversary against the \(\varPhi \)-\(\textsf{KDMAE}\) security of \(\textsf{NE}\) making \(q_{\textsc {NewHonGroup}}\) queries to its \(\textsc {NewHonGroup}\) oracle. Then we can build adversaries \(\mathcal {A}_{\textsf{KR}}\) and \(\mathcal {A}_{\textsf{AEAD}}\) such that (in the random oracle model)

$$\begin{aligned} \textsf{Adv}^{\textsf{KDMAE}}_{\textsf{NE},\varPhi }(\mathcal {A}_{\textsf{KDMAE}}) \le 2 \cdot \textsf{Adv}^{\textsf{KR}}_{\textsf{NE}}(\mathcal {A}_{\textsf{KR}}) + \textsf{Adv}^{\textsf{AEAD}}_{\textsf{NE}}(\mathcal {A}_{\textsf{AEAD}}) + \frac{q_{\textsc {NewHonGroup}}^{2}}{2^{\textsf{NE}.\textsf{kl}}}. \end{aligned}$$

The constructed adversaries will not repeat \((g, n)\) across \(\textsc {Enc}\) queries, so an \(\textsf{NE}\) that is not nonce-misuse resistant suffices. To prove this, we first assert that a \(\varPhi \)-\(\textsf{KDMAE}\) adversary \(\mathcal {A}_{\textsf{KDMAE}}\) can never directly query the random oracle on any of the (non-exposed) honest keys; otherwise, we could use \(\mathcal {A}_{\textsf{KDMAE}}\) in order to break the key-recovery security of \(\textsf{NE}\). But then \(\mathcal {A}_{\textsf{KDMAE}}\) cannot distinguish between messages derived from \(\textsf{H}(\textsf{K}[g])\) and messages derived from some \(\textsf{H}^*(g)\). Here \(\textsf{H}\) is the actual random oracle and \(\textsf{H}^*\) is a simulated random oracle whose output depends on a group’s identifier g instead of this group’s key \(\textsf{K}[g]\). We switch from using \(\textsf{H}(\textsf{K}[g])\) to \(\textsf{H}^*(g)\), thus breaking the dependency of each challenge message on the corresponding \(\textsf{NE}\) key. The \(\textsf{AEAD}\) security of \(\textsf{NE}\) then guarantees that \(\mathcal {A}_{\textsf{KDMAE}}\) cannot guess the challenge bit. The formal proof of Proposition 1 is in the full version.
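The oracle switch in this proof sketch can be illustrated with a lazily sampled random oracle in Python; the class name and the 64-byte output length are assumptions made for the example. The helper function evaluates the base-2 logarithm of the last term of the bound in Proposition 1, \(q_{\textsc {NewHonGroup}}^{2}/2^{\textsf{NE}.\textsf{kl}}\), for illustrative parameters.

```python
import secrets
from typing import Dict, Hashable


class LazyRandomOracle:
    """Random oracle simulated by lazy sampling: fresh output per new input."""

    def __init__(self, out_len: int = 64) -> None:
        self.table: Dict[Hashable, bytes] = {}
        self.out_len = out_len

    def query(self, x: Hashable) -> bytes:
        if x not in self.table:
            self.table[x] = secrets.token_bytes(self.out_len)
        return self.table[x]


def log2_key_collision_term(log2_queries: int, key_len_bits: int) -> int:
    """log2 of q^2 / 2^kl for q = 2^log2_queries."""
    return 2 * log2_queries - key_len_bits


# H* is indexed by the group identifier g rather than by the key K[g].
h_star = LazyRandomOracle()
print(h_star.query("group-1") == h_star.query("group-1"))  # consistent answers
print(log2_key_collision_term(32, 256))  # -192
```

With 256-bit keys, as used by \(\mathsf {XSalsa20\text {-}{}Poly1305} \), and \(2^{32}\) honest groups, the collision term is \(2^{-192}\), so it is negligible next to the other terms of the bound.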

Out-Group AE Security of SealPacket. We prove \(\textsf{wOAE}\) security of \(\textsf{SP}\). The formal proof of Theorem 4 is in the full version.

Theorem 4

Let \(\mathcal {M}\subseteq \{0,1\}^*\). Let \(\textsf{SP}= \mathsf {SEAL\text {-}{}PACKET\text {-}{}\textsf{SS}}[{\textsf{H}, \textsf{DS}, \textsf{NE}}]\) be the symmetric signcryption scheme built from some \(\textsf{H}, \textsf{DS}, \textsf{NE}\) as specified in Construction 2. Let \(\textsf{pred}_{\textsf{trivial}}^{\mathsf {suf\text {-}{}except\text {-}{}user}}\) be the ciphertext-triviality predicate as defined in Fig. 7. Let \(\textsf{func}_{\textsf{out}}^{\bot }\) be the output-guarding function as defined in Fig. 8. Let \(\textsf{wOAE}[\textsc {Enc}[\mathcal {M}, \textsf{NE}]]\) be the security notion as defined in Definition 3. Let \(\varPhi _{\textsf{SP}}= \mathsf {MSG\text {-}{}DER\text {-}{}FUNC}[{\textsf{NE}, \textsf{H}, \textsf{DS}}]\) be the class of message-deriving functions defined in Construction 3. Let \(\mathcal {A}_{{\textsf{wOAE}\text {-}{}\textsf{of}\text {-}{}\textsf{SP}}}\) be an adversary against the \(\textsf{wOAE}[\textsc {Enc}[\mathcal {M}, \textsf{NE}]]\) security of \(\textsf{SP}\) with respect to \(\textsf{pred}_{\textsf{trivial}}^{\mathsf {suf\text {-}{}except\text {-}{}user}}\) and \(\textsf{func}_{\textsf{out}}^{\bot }\). Then we can build adversaries \(\mathcal {A}_{\textsf{KDMAE}}\) and \(\mathcal {A}_{\textsf{SPARSE}}\) such that

$$\begin{aligned} \textsf{Adv}^{\textsf{wOAE}[\textsc {Enc}[\mathcal {M}, \textsf{NE}]]}_{\textsf{SP}, \textsf{pred}_{\textsf{trivial}}^{\mathsf {suf\text {-}{}except\text {-}{}user}}, \textsf{func}_{\textsf{out}}^{\bot }}(\mathcal {A}_{{\textsf{wOAE}\text {-}{}\textsf{of}\text {-}{}\textsf{SP}}}) \le \textsf{Adv}^{\textsf{KDMAE}}_{\textsf{NE}, \varPhi _{\textsf{SP}}}(\mathcal {A}_{\textsf{KDMAE}}) + 2 \cdot \textsf{Adv}^{\textsf{SPARSE}}_{\textsf{DS}, \mathcal {M}}(\mathcal {A}_{\textsf{SPARSE}}). \end{aligned}$$

The \(\textsf{OAE}\) security results in Theorems 3 and 4 used the weaker output guarding function \(\textsf{func}_{\textsf{out}}^{\bot }\). In the full version of this paper, we show that for \(\textsf{SS}\in \{\textsf{BM}, \textsf{SP}\}\), the \(\textsf{OAE}\) security of \(\textsf{SS}\) with respect to \(\textsf{func}_{\textsf{out}}^{\bot }\) implies its \(\textsf{OAE}\) security with respect to the stronger output guarding function \(\textsf{func}_{\textsf{out}}^{\textsf{sec}}\).

6 Conclusions

Combining Theorem 1 with Theorem 2 and Theorem 3 with Theorem 4 establishes the in-group unforgeability and out-group authenticated encryption security of Keybase’s \(\textsf{BoxMessage}\) algorithm. These results rely on some standard security assumptions (unforgeability of \(\textsf{Ed25519}\) and collision resistance of \(\mathsf {SHA\text {-}{}256}\)) as well as some non-standard assumptions (key-dependent message security of \(\mathsf {XSalsa20\text {-}{}Poly1305}\) and sparsity of \(\textsf{Ed25519}\)). These non-standard assumptions arose, respectively, from the key cycle in \(\textsf{SealPacket} \) and the key reuse without explicit context separation in \(\textsf{BoxMessage} \). While we were able to justify these assumptions, we consider them brittle as they are not well studied, their justifications required ideal models, and (in the case of sparsity) they required properties of the specific message encoding format used by Keybase.

The comparative simplicity of our Sign-then-Encrypt construction speaks to the value of formalizing the syntax and security of symmetric signcryption. Explicit goals allow designing schemes in parallel with writing proofs to identify precisely what is needed.