
1 Introduction

Hash-based signature schemes have their origins in the paper “Constructing Digital Signatures from a One Way Function”, by Leslie Lamport [12]. The security of these schemes is based solely on the security properties of a standard hash function, as opposed to schemes whose security relies on problems such as the discrete-logarithm problem on finite groups, or the learning with errors problem. After Lamport’s one-time scheme, Ralph Merkle improved upon the construction with the Winternitz one-time scheme and the ability to sign multiple messages with Merkle trees [15, 16]. The Leighton-Micali scheme, or LMS, proposed some modifications of Merkle’s construction to improve speed and security [13].

Recently, there has been a renewed interest in hash-based signatures in general, and LMS in particular. This is partially due to the expiration of the patents LMS was covered by [13, 16], but more importantly because hash-based schemes are believed to remain secure against a quantum adversary. LMS has been proposed for standardization in a recent IETF draft [14]. In a recent paper, Jonathan Katz analyzed the security of LMS [11].

Katz’s analysis used the random-oracle model to establish the security of LMS. However, as the random-oracle model is insufficient for establishing the security of a protocol against an adversary with access to a quantum computer, we must move to the quantum random-oracle model [3].

In this paper, we reformulate and update Katz’s random-oracle model proof of security for LMS to the quantum random-oracle model. This is particularly important because LMS, as a hash-based scheme, is a strong candidate for post-quantum standardization. We also discuss some of the difficulties that need to be overcome in order to establish this proof in the quantum random-oracle model.

1.1 The Quantum Random-Oracle Model

Katz’s classical proof of the security of LMS takes place in the random-oracle model. In his proof, he considers an experiment with an adversary \(\mathcal {A}\), who is attacking the existential unforgeability of the scheme. Whenever this adversary wishes to evaluate the n-bit hash function H on a point x, they must instead query an oracle for the evaluation, and are provided a response which is indistinguishable from random. Katz shows that for any adversary that makes q queries, the probability that \(\mathcal {A}\) can break the existential unforgeability of LMS is at most \(3q/2^n\). He establishes this by showing that for the adversary to win the game, at least one of a series of events must occur. Upper bounding the probability of each of these events then yields the overall bound.

As the random oracle is meant to replace a hash function, an adversary should be able to interact with this oracle in a similar way to how they interact with a hash function. However, it has been noted that an adversary with a quantum computer can interact with a hash function in ways very different from a ‘make a single query, get a response’ model [3]. If a hash function is implemented on a quantum computer, then the adversary is able to evaluate the function in superposition, giving them access to the quantum mapping

$$\begin{aligned} U_H: \sum _{x,y} \alpha _{x,y} |x\rangle |y\rangle \mapsto \sum _{x,y} \alpha _{x,y} |x\rangle |y \oplus H(x)\rangle . \end{aligned}$$
(1)

A model of security in which we provide access to this mapping to an adversary is called the quantum random-oracle model.
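To make the mapping in Eq. 1 concrete, the following minimal sketch simulates \(U_H\) classically on a toy 2-bit function. The table `H_TOY`, the dictionary-based state representation, and the function names are illustrative assumptions and do not come from the paper; the sketch only shows how one superposition query touches every input at once.

```python
# Hypothetical toy "hash" with 2-bit inputs and 2-bit outputs (assumed for illustration).
H_TOY = {0: 3, 1: 0, 2: 3, 3: 1}

def apply_oracle(state):
    """Apply U_H: |x>|y> -> |x>|y XOR H(x)> to a state stored as {(x, y): amplitude}."""
    out = {}
    for (x, y), amp in state.items():
        key = (x, y ^ H_TOY[x])
        out[key] = out.get(key, 0.0) + amp
    return out

def uniform_query():
    """The superposition sum_x (1/2)|x>|0> over all four inputs."""
    return {(x, 0): 0.5 for x in range(4)}

# After one query, every pair (x, H(x)) is present with amplitude 1/2.
after_one_query = apply_oracle(uniform_query())
```

Applying `apply_oracle` a second time returns the state to `uniform_query()`, reflecting that XOR-ing \(H(x)\) into the output register twice cancels.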

New issues arise in this model however, and Katz’s proof no longer works. Katz’s events are defined by considering the queries that the adversary makes and the responses they receive. However in the quantum random-oracle model, the queries the adversary makes no longer need be classical, and so the definition of these events is no longer meaningful. Instead the events must be defined by considering what classical information the adversary is able to find, rather than just what they query. Classically, the information the adversary has about an oracle is entirely specified by the queries being made. But against a quantum adversary, the information an adversary has about an oracle is much more challenging to classify.

1.2 The Multi-user Setting

The security of a protocol is generally defined in terms of a game between a challenger \(\mathcal {C}\) and an adversary \(\mathcal {A}\). If the adversary is unable to win the game with a reasonable number of resources, the protocol is considered secure. For example in our situation, \(\mathcal {C}\) may be a signing oracle with a public key, and \(\mathcal {A}\) may be trying to create a forged signature on that public key.

However in the real world, attackers do not always want to break a specific individual’s security. They may be happy to break the security of any of a large number of entities. To model this, we consider an adversary \(\mathcal {A}\) that plays a game with a large number of independent challengers \(\mathcal {C}_1, \dots , \mathcal {C}_U\). If \(\mathcal {A}\) is able to win the game with any one of these challengers, they are considered to have won. The multi-user setting was first considered in [2].

For many schemes, it is unknown if an adversary’s task in winning a game in the multi-user setting is easier or not. In fact there are schemes for which the adversary’s chances of winning a game increase linearly with the number of challengers [5]. If a scheme is intended for widespread use, even a linear increase can be a cause for concern that can necessitate an increase in the security parameters. Therefore it is very desirable that any adversary gains no advantage in breaking the security of a scheme in the multi-user setting.

1.3 Our Contributions

  • Consider a lemma by Unruh [18] on distinguishing quantum oracles, and make a small modification that generalizes Unruh’s result and addresses oracles that are more commonly considered.

  • Develop a heuristic approach to study the properties of a series of composed random oracles.

  • Consider the property of undetectability in the random-oracle model.

  • Discuss how these can be applied to LMS in order to upper bound any quantum adversary’s ability to break the security of the scheme in the quantum random-oracle model.

  • Consider how these results apply to the multi-user setting, where an adversary attempts to break the security of one of many independent instances of the scheme.

1.4 Related Work

The approach for proving the security of LMS in the quantum random-oracle model was largely inspired by the approach in [11], reworking and incorporating modified results from [18, 19]. The quantum random-oracle model was originally defined in [3]. The quantum security of other hash-based constructions, such as Merkle trees and XMSS (another proposed hash-based standard), has been considered before in works such as [4, 10]. In particular, [10] considered quantum query bounds on multi-target search problems. A comprehensive report comparing XMSS and LMS [17] has also discussed the need for a quantum random-oracle model proof of LMS. Other works exploring post-quantum signature schemes whose security is established in the quantum random-oracle model include [1, 3, 8]. Undetectability has previously been used to analyze the security of the Winternitz one-time signature scheme [6].

2 Scheme Description

2.1 One-Time Scheme

The basic component of the full scheme is the one-time (OT) LMS signature scheme, also known as the Winternitz OT signature scheme. This scheme consists of OT key generation, signing, and verifying algorithms. It uses, as a basic component, a hash function \(H: \{ 0 , 1 \}^* \rightarrow \{0, 1\}^n\), where n is the security parameter. In our analysis, we will model H as a random oracle.

The parameters are:

  • n, the security parameter.

  • w, the Winternitz parameter, which is a small divisor of n less than or equal to eight.

These parameters define the following values:

  • \(E = 2^w - 1\)

  • \(u_1 = n / w\)

  • \(u_2 = \lceil \lfloor \log _2 \left( u_1 \cdot E \right) + 1 \rfloor / w \rceil \)

  • \(p = u_1 + u_2\).
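As a quick check of these definitions, the sketch below computes the derived values with integer arithmetic. The function name `otlms_params` is an assumption made for illustration; the only fact used beyond the formulas above is that \(\lfloor \log_2 x + 1 \rfloor\) equals the bit length of a positive integer x.

```python
def otlms_params(n, w):
    """Return (E, u1, u2, p) for security parameter n (in bits) and Winternitz parameter w."""
    if not 1 <= w <= 8 or n % w != 0:
        raise ValueError("w must be a divisor of n that is at most eight")
    E = (1 << w) - 1
    u1 = n // w
    # floor(log2(x) + 1) is the bit length of x for any positive integer x.
    checksum_bits = (u1 * E).bit_length()
    u2 = -(-checksum_bits // w)  # ceiling division
    return E, u1, u2, u1 + u2
```

For \(n = 256\) and \(w = 8\) this gives \(E = 255\), \(u_1 = 32\), \(u_2 = 2\) and \(p = 34\).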

For our purposes, string concatenation is denoted by ||.

We can parse a string of n bits as the concatenation of \(u_1\) strings, each w bits long and representing an integer from 0 to E. This allows us to define the \(\mathsf {checksum}: ( \{0,1\}^w )^{u_1} \rightarrow (\{0,1\}^w )^{u_2}\) function as

$$\begin{aligned} \mathsf {checksum} ( h_1, \dots , h_{u_1} ) = \sum _{i=1}^{u_1} ( E - h_i ). \end{aligned}$$
(2)

We can then see that \(u_2\) was chosen so that \(w \cdot u_2\) is the maximum bit length of the result of the \(\mathsf {checksum}\) function.

The \(\mathsf {checksum}\) function is constructed so that when we compare two vectors of \(u_1\) integers from 0 to E, \((h_1, \dots , h_{u_1})\) and \((h'_1, \dots , h'_{u_1})\), if \(h_i \le h'_i\) for each i (and there is at least one index where they are not equal), then when the checksum is viewed as a vector of \(u_2\) integers from 0 to E, \((c_1, \dots , c_{u_2})\) and \((c'_1, \dots , c'_{u_2})\), there is an index i such that \(c_i > c'_i\). This follows from the fact that if \(h_i \le h'_i\) for all i (and there is at least one index where they are not equal), then \(\sum (E - h_i) > \sum (E - h'_i)\), and so when the checksums are converted into integer vectors, at least one of the \(c_i\) must be greater than the corresponding \(c'_i\).
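The ordering property can be observed on a small example. The sketch below assumes the checksum is written as \(u_2\) base-\(2^w\) digits with the most significant digit first; that encoding, and the name `checksum_digits`, are assumptions for illustration rather than details fixed by the text.

```python
def checksum_digits(h, w, u2):
    """Checksum of Eq. 2, returned as u2 base-2^w digits (most significant first)."""
    E = (1 << w) - 1
    total = sum(E - h_i for h_i in h)
    return [(total >> (w * (u2 - 1 - k))) & E for k in range(u2)]

# w = 8, u1 = 32, u2 = 2. The vector `low` is componentwise <= `high` and differs in one place.
low = [0] * 32
high = [1] + [0] * 31
c_low = checksum_digits(low, 8, 2)
c_high = checksum_digits(high, 8, 2)
```

Here `c_low` is `[31, 224]` and `c_high` is `[31, 223]`, so raising a digest coefficient forces at least one checksum coefficient to decrease.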

We define a function F as a repeated application of H, with each application also adding some additional information, such as the number of times H has been applied. We also include \(s = I||Q||i\), a string consisting of an identifying string I for the owner of the public key, a string Q indicating which instance of the scheme is being used, and a number i indicating which chain of hashes we are referring to. This information is used in the multi-user and multi-instance analysis of the scheme. For \(0 \le b \le f \le E\), define

$$\begin{aligned} F_s(x ; b,f) = \left\{ \begin{array}{cl} x &{} \text { if } b=f\\ F_s( H(x||s||b||\text {00}); b+1, f) &{} \text { if } b < f. \end{array} \right. \end{aligned}$$
(3)
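A minimal iterative sketch of Eq. 3 follows. It assumes SHA-256 as a stand-in for H, a one-byte encoding of the counter b, and a single zero byte for the tag 00; none of these encodings are specified in the text above, so they should be read as placeholders.

```python
import hashlib

def H(data: bytes) -> bytes:
    """Stand-in for the random oracle H (assumption: SHA-256, so n = 256)."""
    return hashlib.sha256(data).digest()

def F(x: bytes, s: bytes, b: int, f: int) -> bytes:
    """Advance the chain value x from position b to position f, as in Eq. 3."""
    if not 0 <= b <= f <= 255:
        raise ValueError("need 0 <= b <= f <= E")
    for step in range(b, f):
        # One application of H, tagged with s, the current position, and the tag 00.
        x = H(x + s + bytes([step]) + b"\x00")
    return x
```

Because each step depends only on the current value and position, chains compose: moving from position 0 to 3 and then from 3 to 5 gives the same result as moving from 0 to 5 directly. Verification relies on exactly this property.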

The OTLMS algorithms for key generation, signing, and verifying are then described as follows.

Figures a, b, c: the OTLMS key generation, signing, and verification algorithms.

The correctness property can be verified by inspection. While the \(\mathsf {OTLMS}\) scheme can seem complicated from its description, it is conceptually simple. For key generation, the n-bit random values \(x_1^0, \dots , x_p^0\) are each hashed E times to generate the values \(x_1^E, \dots , x_p^E\), which are hashed together to make the public key pk. Any message (along with a random salt r) is hashed to generate a seeded digest \(h'\). This digest, together with its checksum, can then be parsed as a series of p integers from 0 to E. These are interpreted as p positions in a ‘Winternitz chain’: the number of times \(x_i^0\) is hashed for each i. The resulting chain values are revealed as the signature. To verify a signature, the revealed values are hashed the remaining number of times to recover \(x_1^E, \dots , x_p^E\), which are all hashed together and compared with pk.
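The conceptual description above can be sketched in a few lines. This is not the authors' algorithm listing: it fixes \(n = 256\) and \(w = 8\), uses SHA-256 for H, numbers the chains from 0, and picks byte encodings for the counters and for the tags 00, 01 and 02 (the latter two appear in the security games of Sect. 4.1). All function names are hypothetical.

```python
import hashlib
import os

N_BYTES, W = 32, 8                        # n = 256 bits, w = 8 (assumed parameters)
E = (1 << W) - 1
U1 = 8 * N_BYTES // W
U2 = -(-(U1 * E).bit_length() // W)
P = U1 + U2

def H(data):
    return hashlib.sha256(data).digest()

def chain(x, s, b, f):
    """Hash x from chain position b up to position f."""
    for step in range(b, f):
        x = H(x + s + bytes([step]) + b"\x00")
    return x

def positions(digest):
    """Digest coefficients followed by checksum coefficients (w = 8: one byte each)."""
    h = list(digest)
    c = sum(E - h_i for h_i in h)
    return h + [(c >> (W * (U2 - 1 - k))) & E for k in range(U2)]

def ot_keygen(I, Q):
    sk = [os.urandom(N_BYTES) for _ in range(P)]
    ends = [chain(x, I + Q + bytes([i]), 0, E) for i, x in enumerate(sk)]
    return H(b"".join(ends) + I + Q + b"\x01"), sk

def ot_sign(msg, sk, I, Q):
    r = os.urandom(N_BYTES)               # random salt
    v = positions(H(msg + r + I + Q + b"\x02"))
    return r, [chain(x, I + Q + bytes([i]), 0, v[i]) for i, x in enumerate(sk)]

def ot_verify(msg, sig, pk, I, Q):
    r, xs = sig
    v = positions(H(msg + r + I + Q + b"\x02"))
    ends = [chain(x, I + Q + bytes([i]), v[i], E) for i, x in enumerate(xs)]
    return H(b"".join(ends) + I + Q + b"\x01") == pk
```

A signature produced by `ot_sign` verifies for the signed message and fails for a different one, since a different digest moves at least one chain position and the recomputed chain ends no longer hash to pk.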

Readers may be more familiar with the Lamport one-time signature scheme. In that scheme, 2n uniformly random n-bit strings form the private key, \((a_{0,1}, a_{1,1}, a_{0,2}, a_{1,2}, \dots a_{0,n},a_{1,n})\). Each of these strings is hashed once to form the public key, which also consists of 2n bit strings of length n, \((b_{0,1}, b_{1,1}, \dots , b_{0,n}, b_{1,n})\). To sign an n-bit message digest \(h_1 h_2 \dots h_n\) (with \(h_i \in \{0,1\}\)) we reveal \(a_{h_i, i}\) for \(i \in \{1,\dots ,n\}\). In this scheme, public and secret keys are both \(2n^2\) bits long, and the signature is \(n^2\) bits long.
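For comparison, a minimal sketch of the Lamport scheme as just described, again assuming SHA-256 for the hash function and using hypothetical function names:

```python
import hashlib
import os

N = 32  # n = 256 bits (assumed)

def H(data):
    return hashlib.sha256(data).digest()

def lamport_keygen():
    """Secret key: 2n random n-bit strings; public key: their hashes."""
    sk = [(os.urandom(N), os.urandom(N)) for _ in range(8 * N)]
    pk = [(H(a0), H(a1)) for a0, a1 in sk]
    return pk, sk

def digest_bits(msg):
    d = H(msg)
    return [(d[k // 8] >> (7 - k % 8)) & 1 for k in range(8 * N)]

def lamport_sign(msg, sk):
    """Reveal a_{h_i, i} for each digest bit h_i."""
    return [sk[k][bit] for k, bit in enumerate(digest_bits(msg))]

def lamport_verify(msg, sig, pk):
    bits = digest_bits(msg)
    return all(H(a) == pk[k][bits[k]] for k, a in enumerate(sig))
```

The signature consists of n strings of n bits each, which is the \(n^2\) bits stated above.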

The Winternitz one-time scheme and the Lamport one-time scheme are similar in that both interpret the message digest as a specification for which parts of the secret key should be revealed. Different messages have different digests, and so while part of the secret key has been revealed by one signature, not enough information has been revealed to sign a second message.

The Winternitz one-time scheme is one of the earliest hash-based schemes, and offers a considerable advantage in terms of key and signature sizes over the Lamport one-time scheme. Its public key is only n bits, and ignoring the salt, its secret key and signature sizes are just \(p \cdot n\) as opposed to \(n^2\) or \(2n^2\) (for example, for \(n = 256\) and \(w = 8\), we have \(p = 34\), so this is 8704 bits as opposed to 65536 bits). It obtains this advantage (at the expense of some additional hashes) by grouping together sections of the salted digest and interpreting these sections as a numeric index in a series of hashes, rather than considering each bit of the digest separately.

2.2 Full Scheme

In the full scheme, we combine the one-time scheme as a subroutine with a Merkle tree construction in order to have a full (stateful) signature scheme.

In addition to the parameters for the one-time scheme, we have the parameter G. We will create \(2^{G}\) separate instances of the one-time scheme.

Figures d, e, f: the key generation, signing, and verification algorithms of the full scheme.

Again, correctness can be verified by inspection. To understand the full scheme, we consider a binary tree, the leaves of which are the public keys of individual one-time schemes. When a message is signed with a one-time scheme, we include the signature of the one-time scheme (in order to generate the public key of that instance), as well as the values of the adjacent nodes on each level of the binary tree in order to be able to recover the value of the root node, which is the overall public key. These values form what is known as the Merkle tree verification path.
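The role of the verification path can be illustrated with a generic Merkle tree. The sketch below hashes only the two child values at each internal node; any additional fields that the actual LMS node hash includes (identifiers, node indices, tags) are omitted, so this is an assumption-laden simplification with hypothetical function names.

```python
import hashlib

def H(data):
    return hashlib.sha256(data).digest()

def build_tree(leaves):
    """Return all levels, from the leaves (level 0) up to the root. Needs 2^G leaves."""
    levels = [list(leaves)]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([H(prev[k] + prev[k + 1]) for k in range(0, len(prev), 2)])
    return levels

def auth_path(levels, index):
    """Sibling of the current node on each level below the root."""
    path = []
    for level in levels[:-1]:
        path.append(level[index ^ 1])
        index >>= 1
    return path

def root_from_path(leaf, index, path):
    """Recompute the root from a leaf, its index, and its verification path."""
    node = leaf
    for sibling in path:
        node = H(sibling + node) if index & 1 else H(node + sibling)
        index >>= 1
    return node
```

With \(2^G\) leaves the path has G entries, so the verifier performs G hashes on top of the one-time verification.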

3 The (Quantum) Random Oracle

In order to analyze the security of LMS, we need to formulate a few results about the hardness of various problems in the quantum random-oracle model. In Sect. 3.1 we establish upper bounds on the success probability in standard games such as (second-) preimage resistance in a multi-instance and multi-target setting. In Sect. 3.2, we consider the difficulty of a slight variant of second-preimage resistance, and in Sect. 3.3, we consider the properties of functions defined by a composition of random oracles.

3.1 Oracle Distinguishing and Marked Item Searching

To establish the hardness of certain fundamental problems, we need a lemma to upper bound a quantum adversary’s ability to obtain any relevant information from an oracle. In order to do this, we upper bound an adversary’s ability to distinguish two oracles, one which has marked items and one which does not. Furthermore, we would like this upper bound to hold when the adversary has access to multiple independent oracles.

For \({\vec {x}}= (x_1, \dots , x_K) \in (\{0,1\}^n)^K\), and \(z \in \{0,1\}^n\), let

$$\begin{aligned} \delta _{\vec {x}}(z) :={\left\{ \begin{array}{ll} 1 &{} \text {if } z = x_i \text { for some } i\\ 0 &{} \text {otherwise.} \end{array}\right. } \end{aligned}$$
(4)

In other words, \(\delta _{\vec {x}}\) is a function that outputs 1 on any of K marked items specified by \({\vec {x}}\). Next we consider the case where there are M independent oracles. Each of these oracles has K marked items, which are chosen independently. We want to consider an adversary \(\mathcal {A}\) capable of querying such an oracle in superposition who is attempting to tell if any of the oracles have any marked items.

Lemma 1

For \(\mathsf {X}= ({\vec {x}}_1,\dots ,{\vec {x}}_M) \in ((\{0,1\}^n)^K)^M\), \(z \in \{0,1\}^n\), \(j \in \{1, \dots , M\}\), and \(b \in \{0,1\}\), let \(U_\mathsf {X}\) be the mapping

$$\begin{aligned} U_\mathsf {X}: |z\rangle |j\rangle |b\rangle \mapsto |z\rangle |j\rangle |b \oplus \delta _{{\vec {x}}_j} (z)\rangle . \end{aligned}$$
(5)

Let \(\mathcal {A}\) be a quantum algorithm making at most q queries to a mapping. Let \(\rho _b\) denote \(\mathsf {X}\) along with the final state of \(\mathcal {A}\) in the following experiment: Select \(\mathsf {X}= ({\vec {x}}_1, \dots , {\vec {x}}_M) \xleftarrow {\$} ((\{0,1\}^n)^K)^M\). Run \(\mathcal {A}^{(U_\mathsf {X})^b} ()\). Then

$$\begin{aligned} Tr\left( \rho _0,\rho _1 \right) \le 2q \sqrt{ \frac{K}{2^n} }. \end{aligned}$$
(6)

This lemma is a straightforward generalization of [18, Lemma 13]. Its proof is very similar, and can be found in Appendix A of the full version of the paper [7].

The most straightforward application of this Lemma is to upper bound any adversary’s success probability in identifying a marked item in any of a set of oracles that can be queried in superposition.

Lemma 2

Let \(H_1, \dots , H_M\) be independent random oracles with domains \(D_1, \dots , D_M\) onto a common range. Let \(U_H\) be the unitary mapping

$$\begin{aligned} U_H: \sum _{x,y,i} \alpha _{x,y,i} |x\rangle |i\rangle |y\rangle \mapsto \sum _{x,y,i} \alpha _{x,y,i} |x\rangle |i\rangle |y \oplus H_i(x)\rangle . \end{aligned}$$
(7)

Let \(S_1, \dots , S_M\) be random subsets of the respective \(D_i\), such that membership in \(S_i\) can be tested by a query to \(H_i\). We call \(S_i\) the marked items of \(H_i\). Then for any quantum adversary making q queries to \(U_H\), the probability that they find an \(x \in S_i\) for any i is at most

$$\begin{aligned} 2q \sqrt{ \max _i \left\{ \frac{|S_i|}{|D_i|} \right\} }. \end{aligned}$$
(8)

This lemma follows from Lemma 1 by noting that any adversary that is able to find a marked item can certainly distinguish whether a marked item exists. So the bounds on any adversary in Lemma 1 apply, with K being determined by the maximum fraction of marked items.
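To get a feel for the size of the bound in Eq. 8, it is convenient to evaluate it in the log domain. The helper names below are assumptions made for this illustration.

```python
def log2_search_bound(log2_q, log2_marked_fraction):
    """log2 of 2 * q * sqrt(max_i |S_i| / |D_i|), given log2 of q and of the fraction."""
    return 1 + log2_q + log2_marked_fraction / 2

def log2_queries_for_trivial_bound(log2_marked_fraction):
    """log2 of the number of queries at which the bound reaches 1 and becomes vacuous."""
    return -1 - log2_marked_fraction / 2
```

For a single marked item in a domain of size \(2^{256}\), an adversary making \(2^{64}\) queries succeeds with probability at most \(2^{-63}\), and the bound only becomes vacuous at about \(2^{127}\) queries.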

3.2 Second-Preimage Resistance with Adversary Prefixes

Also important to the analysis of LMS is a slight modification of second-preimage resistance, where the adversary is able to specify a prefix of the element whose second preimage they seek. We define this in terms of a game.

Game 1

(Second-Preimage Resistance with Adversary Prefixes).

  1. \(\mathcal {C}\) chooses a random function \(H: \{0,1\}^* \rightarrow \{0,1\}^n\) from all possible mappings, as well as a random suffix \(r' \leftarrow \{0,1\}^n\). \(\mathcal {C}\) provides \(\mathcal {A}_1\) with oracle access to H.

  2. \(\mathcal {A}_1\) makes some queries to H, and then outputs some quantum state \(\rho \) and a classical message \(M'\).

  3. \(\mathcal {C}\) runs \(\mathcal {A}_2\), with access to \(H, M', r'\), and \(\rho \).

  4. \(\mathcal {A}_2\) makes some queries to H, and then submits a pair \((M^*,r^*) \in \{0,1\}^* \times \{0,1\}^n\), with \(M' \ne M^*\).

We say that the adversary \(\mathcal {A}= (\mathcal {A}_1,\mathcal {A}_2)\) has won if \(H(M^*||r^*) = H(M'||r')\).

Classically, it is not difficult to show that an adversary does not obtain much of an advantage. In Katz’s paper [11], he tackles this issue through the use of random oracle reprogramming. Specifically, he considers the challenger that, when the adversary submits their prefix \(M'\), modifies H to \(H'\) so that \(H'(M'||r') = h'\), where \(r'\) and \(h'\) are uniformly random n-bit strings that were chosen at the beginning of the game. The adversary will only notice that \(\mathcal {C}\) isn’t playing by the ‘real’ rules of the game if they had previously queried \(M'||r'\), and since \(r'\) is not disclosed to the adversary in advance, this happens with probability \(\le \frac{q}{2^n}\). Then the probability that an adversary queries a different \(M^*||r^*\) such that \(H(M^*||r^*) = h'\) is simply \(q / 2^n\). So we upper bound the probability that the adversary wins this game by \(2q/2^n\).

It is much more difficult, however, to prove a similar statement in the quantum setting. In Katz’s proof, an essential step was to reprogram the oracle to reduce to something that more closely resembled second-preimage resistance. Since a classical adversary makes only a limited number of queries, with high probability they have no information about the point that is reprogrammed. In the quantum case this is much more challenging: since the adversary can query a quantum superposition of inputs, a single query can give them some information about the entire oracle. However, the basic approach is still sound. If \(\mathcal {C}\) selects a pair \((r',h')\) and sets \(H'(M'||r') = h'\), any adversary should be unable to notice this reprogramming.

For any oracle H, let \(H_{M'||r'\mapsto h'}\) denote the oracle identical to H except that the input \(M'||r'\) maps to \(h'\).
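In classical terms, the reprogrammed oracle is simply the original function with one overridden entry. The sketch below models this with a small wrapper class; the class name `Oracle`, the use of SHA-256 in place of a truly random function, and the override table are all assumptions for illustration.

```python
import hashlib

class Oracle:
    """Classical stand-in for H: SHA-256 plus a table of reprogrammed points."""

    def __init__(self, overrides=None):
        self.overrides = dict(overrides or {})

    def __call__(self, data: bytes) -> bytes:
        if data in self.overrides:
            return self.overrides[data]
        return hashlib.sha256(data).digest()

    def reprogram(self, point: bytes, value: bytes) -> "Oracle":
        """Return the oracle that maps point to value and otherwise agrees with this one."""
        return Oracle({**self.overrides, point: value})
```

The two oracles differ on a single input, which is why a classical adversary can only notice the change by querying exactly that input.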

Game 2

  1. \(\mathcal {C}\) chooses a random function \(H: \{0,1\}^* \rightarrow \{0,1\}^n\) from all possible mappings, as well as a random suffix \(r'\) and a random output \(h'\), with \(r',h' \leftarrow \{0,1\}^n\). \(\mathcal {C}\) provides \(\mathcal {A}_1\) with oracle access to H.

  2. \(\mathcal {A}_1\) makes some queries to H, and then outputs some quantum state \(\rho \) and a classical message \(M'\).

  3. \(\mathcal {C}\) runs \(\mathcal {A}_2\), with access to \(H_{M'||r'\mapsto h'}, M', r'\), and \(\rho \).

  4. \(\mathcal {A}_2\) makes some queries to \(H_{M'||r'\mapsto h'}\), and then submits a pair \((M^*,r^*) \in \{0,1\}^* \times \{0,1\}^n\), with \(M' \ne M^*\).

\(\mathcal {A}_2\) wins Game 2 if \(H(M^*||r^*) = h'\).

Lemma 3

For any \(\mathcal {A}= (\mathcal {A}_1,\mathcal {A}_2)\) making collectively at most q queries to a random oracle H,

$$\begin{aligned} \left| \mathop {\Pr }\limits _{\textit{Game 1}} [\mathcal {A}_2 \text { wins}] - \mathop {\Pr }\limits _{\textit{Game 2}}[\mathcal {A}_2 \text { wins}] \right| \le \frac{4q}{2^{n/2}}. \end{aligned}$$
(9)

Roughly speaking, the proof of this lemma follows a technique also seen in [18]. The idea is to introduce two subgames, and show that the difference in the adversary’s success probabilities for these games and Games 1 and 2 is at most \(2q/2^{n/2}\). This follows from Lemma 1 by showing that any adversary distinguishing between the subgames can also win the game in Lemma 1 with the same probability. The full proof can be found in Appendix B of the full version of the paper [7].

We can also imagine the situation where a single adversary \(\mathcal {A}\) plays Game 1 with multiple challengers \(\mathcal {C}_1, \dots , \mathcal {C}_U\) with access to multiple independent quantum random oracles \(H_1, \dots , H_U\). Then note that the adversary’s chances of success do not increase at all with U. This can be established by considering the same subgames in this multi-user setting. The arguments relating how close the sub-games are still apply, because Lemma 1 does not depend on the number of oracles, as long as each oracle is independent.

3.3 Random Oracle Composition

In the description of LMS, and occasionally in other constructions, a function is defined by a composition of independent random oracles. It would be convenient for this function to itself be a random oracle, or at least have certain properties of a random oracle, from the perspective of both classical and quantum adversaries. However, this is not quite the case.

Let \(\mathcal {O}_1, \dots , \mathcal {O}_E\) be independent random oracles mapping n-bit strings to n-bit strings. Consider the oracle \(\mathcal {O}= \mathcal {O}_E \circ \mathcal {O}_{E-1} \circ \dots \circ \mathcal {O}_1\), \(\mathcal {O}: \{0,1\}^n \rightarrow \{0,1\}^n\). We want to consider properties of the combined oracle \(\mathcal {O}\) with respect to standard properties such as preimage resistance.

Lemma 4

Let O be a random mapping from a domain \(\mathcal {D}\) of size N to a codomain \(\mathcal {R}\) of size M. Then the expected size of the image of \(\mathcal {D}\) under O is

$$\begin{aligned} M \left( 1 - \left( 1 - \frac{1}{M} \right) ^N \right) .\end{aligned}$$
(10)

Proof

Let \(\mathcal {R}= \{1, \dots , M \}\). For each \(1 \le i \le M\), let \(X_i\) be a binary random variable where \(X_i\) is 1 if there is an \(x \in \mathcal {D}\) such that \(O(x) = i\), and 0 otherwise. It is not hard to see that \(E [ X_i ] = 1 - (\frac{M-1}{M})^N\). Then the expected number of elements in the codomain that are hit is \(E[X_1 + X_2 + \dots + X_M] = E[X_1] + E[X_2] + \dots + E[X_M]\), from which the result follows.    \(\square \)

Writing \(N = \alpha \cdot M\), for sufficiently large N and M, Lemma 4 tells us that the fraction of the codomain that is hit is very close to

$$\begin{aligned} \left( 1 - \frac{1}{e^\alpha } \right) , \end{aligned}$$
(11)

where \(e \approx 2.71828\) is Euler’s number. So when k oracles, each of which maps to a codomain of size \(2^n\), are composed, the overall oracle maps to an image that has size roughly

$$\begin{aligned} 2^n \cdot \left( \left. 1 - \left( \frac{1}{e} \right) ^{ 1 - (1/e)^{1 - (1/e)^{...}}} \right\} k \right) .\end{aligned}$$
(12)

For example, for \(k = 256\), this tells us that after 256 applications of independent random oracles, the final range will be very close to \(2^{-7}\) times the size of the original domain. For \(k = 1024\), the size of the final range is close to \(2^{-9}\) times the original size.
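These figures can be reproduced by iterating the map \(\alpha \mapsto 1 - e^{-\alpha }\) starting from \(\alpha = 1\), which is one way to read the nested expression in Eq. 12. The function names below are assumptions, and the computation is the same heuristic as in the text: it tracks expected sizes only.

```python
from math import exp, log2

def expected_image_fraction(alpha):
    """Large-M limit of Lemma 4: fraction of the codomain hit when N = alpha * M."""
    return 1.0 - exp(-alpha)

def composed_image_fraction(k):
    """Heuristic fraction of {0,1}^n reached after composing k independent oracles."""
    alpha = 1.0  # the initial domain has the same size as the codomain
    for _ in range(k):
        alpha = expected_image_fraction(alpha)
    return alpha
```

A single oracle gives a fraction of about \(1 - 1/e \approx 0.632\), and `log2(composed_image_fraction(k))` comes out close to \(-7\) for \(k = 256\) and close to \(-9\) for \(k = 1024\), consistent with the fraction behaving roughly like \(2/k\) for large k.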

Remark 1

For the rest of this document we will assume that the image of the composed oracles in LMS does not shrink by more than four times the expected rate. We will also assume that no more than 256 oracles are used, as this is the most used in any proposed set of LMS parameters. Concretely, we will assume that the size of the range of 256 applications of an oracle is no smaller than \(2^{-10} \cdot 2^n\), which is over four times smaller than the expected size of roughly \(2^{-7} \cdot 2^n\). This amount of compression is very unlikely to actually occur, and as actually distinguishing the number of marked items in an oracle is also an exponentially difficult problem, this approach greatly overestimates the compression and the adversary’s ability to take advantage of that compression. A much more careful analysis could result in a slightly tighter bound in Theorem 1. However, as this would provide at most a few bits of security in the analysis, we leave this for future work. For further details on the compression of oracles, we refer to Appendix C in the full version of this paper [7].

3.4 Undetectability

Often in protocols with random oracles, a value y is selected by choosing a uniformly random point x in the domain of the random oracle H, and setting \(y = H(x)\). While the distribution of y is certainly uniform (as H is uniform), the joint distribution of (H, y) is not uniform. Therefore an adversary \(\mathcal {A}\) that has access to the random oracle may be able to tell if a point in the codomain was chosen uniformly at random or if it was chosen by hashing a uniform point in the domain. The infeasibility of telling these two cases apart is known as the undetectability property.

Game 3

(Undetectability).

  1. \(\mathcal {C}\) generates a random oracle \(H: \{0,1\}^n \rightarrow \{0,1\}^n\), and selects a uniformly random bit \(b \xleftarrow {\$} \{0,1\}\).

  2. Depending on b, \(\mathcal {C}\) does one of the following:

    • If \(b = 0\), \(\mathcal {C}\) sends a uniformly random \(y \in \{0,1\}^n\) to \(\mathcal {A}\) and provides oracle access to H.

    • If \(b = 1\), \(\mathcal {C}\) selects a uniformly random \(x \in \{0,1\}^n\) and sends \(y = H(x)\) to \(\mathcal {A}\), and provides oracle access to H.

  3. After some queries to H, \(\mathcal {A}\) outputs a bit \(b'\).

\(\mathcal {A}\) is said to have won Game 3 if \(b' = b\).

Lemma 5

Let \(\mathcal {A}\) be a quantum algorithm with oracle access to a random oracle H, making at most q queries. Then

$$\begin{aligned} \left| \mathop {\Pr }\limits _{\textit{Game 3}} [ \mathcal {A}\text { wins}] - 1/2 \right| \le 2q/2^{n/2}. \end{aligned}$$
(13)

Roughly speaking, this lemma is shown by establishing that the only real way to distinguish whether a point in the codomain was chosen uniformly at random or by first choosing a preimage is to actually find that preimage. Finding the preimage can then be tightly reduced to Lemma 1. Furthermore, as Lemma 1 does not depend on the number of instances of the problem, as long as each oracle is independent, the result stays the same when \(\mathcal {A}\) is playing multiple, independent instances of Game 3. The full proof can be found in Appendix D of the full version of this paper [7].

Similar to Lemma 3, we can imagine an adversary \(\mathcal {A}\) playing multiple instances of Game 3 with independent oracles. Then note that this gives no advantage to the adversary’s success probability, even if b is chosen to be the same in each game. This is because the reduction to Lemma 1 still holds, with separate marked items in separate independent oracles.

4 Scheme Proof

4.1 OTLMS Proof

Throughout this section, a variable with a \(*\) will refer to a value derived from the forgery \((M^*,\sigma ^*)\). A variable with \('\) refers to a value derived in the course of the signing query. If neither is present, it refers to a value derived in the key generation algorithm. We define security in terms of the standard notion of existential unforgeability under chosen-message attack. This standard notion of security is defined in terms of the following interaction between an adversary \(\mathcal {A}\) and a challenger \(\mathcal {C}\).

Game 4

(One-time existential-unforgeability under chosen-message attack (OTeucma)).

  1. \(\mathcal {C}\) chooses a random oracle \(H:\{0,1\}^* \rightarrow \{0,1\}^n\) from all possible mappings (considering that there is in principle an upper bound on the length of binary strings \(\mathcal {A}\) will ask for evaluation on). \(\mathcal {C}\) then creates a quantum random oracle that provides quantum access to H as in Eq. 1.

  2. \(\mathcal {C}\) runs \(\mathsf {OTKeyGen}(1^n,w,I,Q)\), obtaining (pk, sk), and sends pk to \(\mathcal {A}\).

  3. \(\mathcal {A}\) makes some queries to the quantum random oracle and then submits a message \(M'\) for signing.

  4. \(\mathcal {C}\) runs \(\mathsf {OTSign}(M',sk,I,Q)\) and sends the resulting signature, \(\sigma '\), to \(\mathcal {A}\).

  5. \(\mathcal {A}\) makes some queries to the quantum random oracle, then submits a message-signature pair, \((M^*,\sigma ^*)\), such that \(M^* \ne M'\).

We say that \(\mathcal {A}\) has won the OTeucma game if \(\mathsf {OTVrfy}(M^*,\sigma ^*,pk,I,Q) \rightarrow \mathsf {accept}\). To bound the adversary’s ability to win this, we introduce a separate game:

Game 5

(One-time Simulation).

  1. \(\mathcal {C}\) chooses a random oracle \(H: \{0,1\}^* \rightarrow \{0,1\}^n\), as well as random strings \(r',h' \in \{0,1\}^n\).

  2. \(\mathcal {C}\) computes \(c' = \mathsf {checksum}(h')\) and sets \((v'_1, \dots , v'_p) = h'||c'\). \(\mathcal {C}\) chooses p values \(x_1^{v'_1}, \dots , x_p^{v'_p}\) uniformly at random from \(\{0,1\}^n\).

  3. For \(i = 1\) to p, \(\mathcal {C}\) lets \(s = I||Q||i\) and computes \(x_i^E = F_s(x_i^{v'_i};v'_i,E)\).

  4. \(\mathcal {C}\) sends \(pk = H(x_1^E||\dots ||x_p^E||I||Q||01)\) to \(\mathcal {A}\) and provides oracle access to H.

  5. \(\mathcal {A}\) makes oracle queries and submits a message \(M'\) for signing.

  6. \(\mathcal {C}\) modifies H so that \(H(M'||r'||I||Q||02) = h'\), and sends \((r', x_1^{v'_1}, \dots , x_p^{v'_p})\) as the signature.

  7. After further oracle queries, \(\mathcal {A}\) submits a message-signature pair \((M^*, \sigma ^*)\) such that \(M^* \ne M'\).

As before, \(\mathcal {A}\) wins this game if \(\mathsf {OTLMSVrfy}(M^*,\sigma ^*,pk,I,Q) \rightarrow \mathsf {accept}\).
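The simulation can be sketched classically to see why the simulated signature still verifies. The code below fixes \(n = 256\) and \(w = 8\), uses SHA-256 with an override table as a reprogrammable stand-in for H, numbers chains from 0, and picks byte encodings for counters and tags; all of these, and every function name, are assumptions for illustration.

```python
import hashlib
import os

N_BYTES, W = 32, 8
E = (1 << W) - 1
U2 = 2                       # checksum coefficients for n = 256, w = 8
P = N_BYTES + U2             # u1 = 32 coefficients when w = 8

class Oracle:
    """SHA-256 with a table of reprogrammed points."""
    def __init__(self):
        self.overrides = {}
    def __call__(self, data):
        if data in self.overrides:
            return self.overrides[data]
        return hashlib.sha256(data).digest()

def chain(H, x, s, b, f):
    for step in range(b, f):
        x = H(x + s + bytes([step]) + b"\x00")
    return x

def positions(h):
    c = sum(E - h_i for h_i in h)
    return list(h) + [(c >> 8) & E, c & E]

def verify(H, msg, sig, pk, I, Q):
    r, xs = sig
    v = positions(H(msg + r + I + Q + b"\x02"))
    ends = [chain(H, x, I + Q + bytes([i]), v[i], E) for i, x in enumerate(xs)]
    return H(b"".join(ends) + I + Q + b"\x01") == pk

def simulate(H, I, Q):
    """Steps 1-4: fix r' and h' first, then start each chain at position v'_i."""
    r, h = os.urandom(N_BYTES), os.urandom(N_BYTES)
    v = positions(h)
    xs = [os.urandom(N_BYTES) for _ in range(P)]
    ends = [chain(H, x, I + Q + bytes([i]), v[i], E) for i, x in enumerate(xs)]
    pk = H(b"".join(ends) + I + Q + b"\x01")

    def sign(msg):
        # Step 6: reprogram H at M'||r'||I||Q||02, then reveal the chain values.
        H.overrides[msg + r + I + Q + b"\x02"] = h
        return r, xs

    return pk, sign
```

Note that the challenger never needs the chain starting points \(x_i^0\): the public key is fixed before the message is known, and reprogramming makes the digest of whatever message arrives match the positions chosen in advance.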

Lemma 6

(Simulation Difference). Let \(\mathcal {A}\) be a quantum adversary, making at most q queries to a quantum oracle H. Then

$$\begin{aligned} \left| \mathop {\Pr }\limits _{\textit{Game 4}}[ \mathcal {A}\text { wins}] - \mathop {\Pr }\limits _{\textit{Game 5}}[\mathcal {A}\text { wins}] \right| \le 516q/2^{n/2}. \end{aligned}$$
(14)

Proof

The difference between these two games is established by applications of Lemmas 3 and 5. There are two differences between Games 4 and 5. The first is that the value \(h'\) for the signing query is chosen uniformly at random, and H is later modified so that \(H(M'||r'||I||Q||02) = h'\). This introduces a difference of at most \(4q/2^{n/2}\) by Lemma 3. The second difference is that values \(x_i^{v'_i}\) are chosen uniformly at random, rather than as the output of \(F(x_i^0; 0, v'_i)\) for \(i = 1\) to p. This introduces a difference of at most \(256 \cdot 2q/2^{n/2}\). This can be seen by a game-hopping argument. In the original game, \(x_i^{v'_i}\) is chosen by computing \(F(x_i^0; 0, v'_i)\) for a uniform \(x_i^0\). In the next game, it is chosen by computing \(F(x_i^1; 1, v'_i)\) for a uniform \(x_i^1\). By Lemma 5, this only introduces a difference of \(2q/2^{n/2}\). Then we repeatedly apply this lemma until we choose \(x_i^{v'_i}\) uniformly. As E is at most 256, this needs to be applied at most 256 times, and so the difference is at most \(2\cdot 256q/2^{n/2}\). Thus the overall separation between these games is at most \((4 + 2\cdot 256)q/2^{n/2}\).    \(\square \)

Theorem 1

For any adversary \(\mathcal {A}\), making at most q quantum queries to the random oracle, the probability that they win Game 4 is at most

$$\begin{aligned} 580 q / 2^{n/2}. \end{aligned}$$
(15)

Proof

This proof is established by showing that the probability an adversary wins Game 5 is at most \(64q/2^{n/2}\) so that the result follows from Lemma 6.

To upper bound \(\mathcal {A}\)’s chances of winning Game 5, we define a few subsets of the domain of H.

  • \(S_{0,i,j} := \{ x \in \{0,1\}^* : x = x'||I||Q||i||j||00, \; F_{I||Q||i}(x;j,E) = x^E_i \}\)

  • \(S_1 := \{ x \in \{0,1\}^* : x = {x'}_1^E ||\dots ||{x'}_p^E||I||Q||01, \; H(x) = pk,\)

             \(({x'}_1^E||\dots ||{x'}_p^E) \ne (x_1^E || \dots || x_p^E) \}\)

  • \(S_2 := \{ x \in \{0,1\}^*: x = M||r||I||Q||02, \; H(x) = h', M \ne M'\}\).

Then we define the following three events that may occur over the course of the game OTeucma.

  • E0 is the event that \(\mathcal {A}\) has complete knowledge of some \(x \in S_{0,i,j}\) for some i and j where \(v'_i > j\).

  • E1 is the event that \(\mathcal {A}\) has complete knowledge of some \(x \in S_1\).

  • E2 is the event that \(\mathcal {A}\) has complete knowledge of some \(x \in S_2\).

These sets correspond to the (second-) preimages that an adversary will have to find in order to break the security of LMS. These events then represent an adversary actually finding such a preimage. Classically, an adversary finding a relevant preimage is exactly characterized by the adversary querying such a point to the random oracle. In a quantum setting, however, this equivalence fails because superposition queries are allowed. Instead we characterize the event of an adversary finding such a preimage by whether such a value is derived when running the verification algorithm \(\mathsf {OTLMSVrfy}\). This is what we mean by “complete knowledge”.

We will establish that if \((M^*,\sigma ^*)\) is a valid forgery, at least one of the three events has occurred. We do this by establishing that in the event of a forgery where events E1 and E2 did not occur, E0 must have happened.

We are assuming that \(\mathcal {A}\) has succeeded in submitting a forgery and that events E1 and E2 have not occurred. We will examine the properties of \((M^*,\sigma ^*)\) and show that E0 must have occurred.

When the adversary submits a forgery \((M^*, \sigma ^*)\), we can run the verification algorithm on this pair. Then the following values are derived in the process of running the verification algorithm:

  • \(M^* || r^* || I||Q||02\)

  • \({x}^{*E}_1 || \dots || {x}^{*E}_p ||I||Q|| 01\)

  • \(\sigma ^*_i || I || Q || i || v^*_i || 00\), for \(i = 1\) to p.

Since the verification algorithm accepts \((M^*, \sigma ^*)\), we must have that \(H( {x}_1^{*E} || \dots || {x}_p^{*E} ||I||Q|| 01 ) = pk\). As E1 did not occur, we must have that \(x^{*E}_1 || \dots || x^{*E}_p ||I||Q|| 01 \notin S_1\), and so \(x^{*E}_1||\dots ||x^{*E}_p = x^E_1 || \dots || x^E_p\).

Similarly, E2 did not occur, and since \(M^* \ne M'\), it must be the case that \(H(M^*||r^*||I||Q||02) \ne h'\).

So we know that \(h^* \ne h'\), and that \(x^{*E}_1||\dots ||x^{*E}_p = x^E_1 || \dots || x^E_p\). Note that by the construction of the checksum, when we compare \(v^*\) and \(v'\), there must be an index i for which \(v^*_i < v'_i\). But then since we have that \(x^{*E}_i = {x}^E_i\), we can see that this means that \(\sigma ^*_i || I || Q || i || v^*_i||00 \in S_{0,i,v^*_i}\) and E0 has occurred.
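The checksum property used in this step can be checked exhaustively on toy parameters. The sketch below assumes a standard Winternitz-style checksum (the sum of E minus each message digit, appended as base-\(2^w\) digits). The toy sizes and all function names are hypothetical, and this is not the exact LMS encoding.

```python
from itertools import product

W = 2              # bits per Winternitz digit (toy value, an assumption)
E = 2**W - 1       # largest digit value, i.e. the chain length
MSG_DIGITS = 3     # number of digits derived from the hash (toy value)
CSUM_DIGITS = 2    # enough digits for the largest checksum: 3 * 3 = 9 < 4**2


def encode(msg_digits):
    """Return the message digits followed by the checksum digits."""
    csum = sum(E - d for d in msg_digits)
    csum_digits = []
    for _ in range(CSUM_DIGITS):
        csum_digits.append(csum % (E + 1))
        csum //= E + 1
    # Most significant checksum digit first.
    return tuple(msg_digits) + tuple(reversed(csum_digits))


def has_smaller_digit(v_star, v_prime):
    """True if some index i has v_star[i] < v_prime[i]."""
    return any(a < b for a, b in zip(v_star, v_prime))


def check_all_pairs():
    """Check the property for every pair of distinct toy hash values."""
    all_msgs = list(product(range(E + 1), repeat=MSG_DIGITS))
    return all(
        has_smaller_digit(encode(m1), encode(m2))
        for m1 in all_msgs
        for m2 in all_msgs
        if m1 != m2
    )


print(check_all_pairs())
```

The reason the check succeeds is the one used in the proof: if no message digit decreases while the digits differ, the digit sum grows, so the checksum shrinks and one of its digits must decrease.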

All we need to do now is provide an upper bound on the probability of any of the events occurring. To do this, we establish that for any of these events to occur \(\mathcal {A}\) must solve some quantum search problem on a distinct search space.

Event \(\varvec{E}\mathbf{0}\). For event E0, we want to consider the adversary’s ability to find any new x, i, and j, with \(j < v'_i\) and \(x \in S_{0,i,j}\). Note that finding an \(x \in S_{0,i,j}\) implies complete knowledge of some \(x' \in S_{0,i,k}\), for \(j \le k < E\). In particular, it implies complete knowledge of some \(x \in S_{0, i, v'_i -1}\). So we need to upper bound the adversary’s ability to find such an x.
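To illustrate why a value at chain position j yields the values at every later position, here is a minimal Python sketch of a Winternitz chain. It assumes SHA-256 as a stand-in for H, and the byte encoding of \(I||Q||i||j||00\) is hypothetical rather than the exact format of the LMS draft.

```python
import hashlib


def H(data: bytes) -> bytes:
    """Stand-in for the n-bit hash function (SHA-256 here, an assumption)."""
    return hashlib.sha256(data).digest()


def chain(x: bytes, I: bytes, Q: bytes, i: int, start: int, end: int) -> bytes:
    """Sketch of F_{I||Q||i}(x; start, end): apply H at steps start..end-1."""
    for j in range(start, end):
        # Hypothetical encoding of x || I || Q || i || j || 00.
        x = H(x + I + Q + i.to_bytes(2, "big") + j.to_bytes(1, "big") + b"\x00")
    return x


# Anyone holding the value at position 5 can walk forward to position 255.
x0 = bytes(32)
I, Q = b"I" * 16, bytes(4)
mid = chain(x0, I, Q, 1, 0, 5)
print(chain(mid, I, Q, 1, 5, 255) == chain(x0, I, Q, 1, 0, 255))
```

Walking the chain forward is cheap, while walking it backward requires inverting one of the per-step oracles, which is the search problem bounded in the rest of this argument.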

From the signing query, the adversary knows precisely one element of \(S_{0,i,v'_i}\). However, we can imagine an adversary who knows this set entirely. We will show that finding an element of \(S_{0,i,v'_i -1}\) is still difficult.

From Sect. 3.3, we know that when considering the function F as a composition of random oracles, we have an expectation on the overall compression from the domain to the codomain, based on the number of applications of H in the construction of F. For typical parameter sets, this is less than 256 times, which corresponds to a compression of roughly \(2^7\) times. As noted in Remark 1, we will take a conservative approach and use a compression factor of four times this, \(2^{10}\). One consequence is that \(S_{0,i,v'_i}\) will have size less than \(2^{10}\) (as the remaining oracles then compress this down to a point).

So we can imagine an adversary that, for each i, knows the set \(S_{0,i,v'_i}\) entirely. The adversary then needs to find an element of \(\{0,1\}^n\) that \(H(\cdot ||I||Q||i||v'_i-1||00)\) maps to an element of \(S_{0,i,v'_i}\). As \(S_{0,i,v'_i}\) has size less than \(2^{10}\), a fraction of less than \(2^{10-n}\) of the domain maps to these points. So the adversary needs to find a marked item where the fraction of marked items is at most \(2^{10-n}\).

Event \(\varvec{E}\mathbf{1}\). Event E1 is simply the adversary’s ability to find some distinct \(x \ne x_1^E || \dots || x_p^E\) that maps to pk under \(H( \cdot || I || Q|| 01)\), when the adversary is already given such an element. This is a game of second-preimage resistance, so the adversary must find a marked item in the oracle \(H(\cdot ||I||Q||01)\), where the fraction of marked items is \(2^{-n}\).

Event \(\varvec{E}\mathbf{2}\). Event E2 refers to the adversary’s ability to find a distinct \(M^*\) and any \(r^*\) such that \(H(M^*||r^*||I||Q||02) = H(M'||r'||I||Q||02)\), where \(M'\) is chosen by the adversary and \(r'\) is chosen uniformly at random. But this is precisely the game of second-preimage resistance with adversary prefixes with respect to the random oracle \(H(\cdot || \cdot || I || Q || 02)\). So the adversary’s chances of succeeding differ by at most \(4q/\sqrt{2^n}\) from those in the challenge of finding a marked item in the oracle \(H(\cdot || \cdot || I || Q ||02)\), where \(h'\) is chosen in advance and the oracle is reprogrammed. In this case the fraction of marked items is \(2^{-n}\).

We have that the adversary’s chances of succeeding differ by at most \(4q/2^{n/2}\) from those of finding a marked item in one of the distinct oracles defined by I, Q, and \(i||v'_i -1\) for \(i = 1\) to p. As the fraction of marked items in any of these oracles is at most \(2^{10-n}\), the chances of any adversary’s success are at most

$$\begin{aligned} \mathop {\Pr }\limits _{\text {Game 5}} [ \mathcal {A}\text { wins}] \le 2q \sqrt{ 2^{10-n}} = 64q/2^{n/2}. \end{aligned}$$
(16)

And so

$$\begin{aligned} \mathop {\Pr }\limits _{\text {Game 4}} [\mathcal {A}\text { wins}] \le 516q/2^{n/2} + 64q/2^{n/2} = 580q/2^{n/2}. \end{aligned}$$
(17)

   \(\square \)
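The constants in Eqs. (16) and (17) can be reproduced with a few lines of Python. This is only an arithmetic sketch: the function names are hypothetical, and the values 516 and \(2^{10-n}\) are taken directly from Lemma 6 and the proof above.

```python
import math

LEMMA6_COEFF = 516          # simulation gap between Games 4 and 5
LOG2_MARKED_NUMERATOR = 10  # marked fraction is at most 2^(10 - n)


def search_coefficient(log2_numerator: int) -> float:
    """Return c with 2*q*sqrt(2^(a - n)) = c * q / 2^(n/2), a = log2_numerator."""
    return 2 * math.sqrt(2**log2_numerator)


def log2_game4_bound(log2_q: float, n: int) -> float:
    """log2 of the Theorem 1 bound for q = 2^log2_q queries and n-bit hashes."""
    coeff = LEMMA6_COEFF + search_coefficient(LOG2_MARKED_NUMERATOR)
    return math.log2(coeff) + log2_q - n / 2


print(search_coefficient(LOG2_MARKED_NUMERATOR))                 # 64.0
print(LEMMA6_COEFF + search_coefficient(LOG2_MARKED_NUMERATOR))  # 580.0
print(log2_game4_bound(64, 256))
```

For instance, with \(n = 256\) and \(q = 2^{64}\) quantum queries, the bound evaluates to roughly \(2^{-54.8}\).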

4.2 Security Proof for Full Version and in the Multi-user Setting

Proving the security of the full version is quite simple now that we have developed the techniques and lemmas used to prove the security of the one-time scheme. By the construction of LMS, all oracles contain different identifying information. We can thus prove security by showing that for an adversary to break the security, they must find a marked item in one of these oracles, and by calculating the largest fraction of marked items.

To do this we can use Lemmas 5 and 3 to simulate a signing algorithm similar to how we did in Game 5, but instead for each one-time instance of the signature scheme. As these lemmas can be applied in a multi-instance model without affecting the parameters, we can split up the domain by instance number and identifier information to complete the proof in the full version of the scheme and in the multi-user setting without additional theory.

Theorem 2

Let \(\mathcal {A}\) be an adversary attacking the security of the full LMS scheme in the multi-user setting. If \(\mathcal {A}\) makes at most q queries, then the probability they break the existential unforgeability of any of the instances of LMS is at most

$$\begin{aligned} 580q / 2^{n/2}. \end{aligned}$$
(18)

The complete proof of this theorem may be found in Appendices E and F of the full version of the paper [7].

5 Future Work and Discussion

Grover’s algorithm implies that any random-oracle analysis of LMS can show that there exists an adversary whose success probability after q queries is \(2q/2^{n/2}\). While the bounds in Theorems 1 and 2 asymptotically match this, there is a difference of a constant factor of 290, suggesting a possible loss of roughly 8 bits of security compared with what is expected from the most obvious attacks. However, it is not clear whether there is an attack on LMS that gives such an advantage. This loss in tightness largely comes from applying Lemma 5 a constant number of times in the proof of Lemma 6. A more careful analysis in the proof of Lemma 6 could reduce this constant factor.
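The "roughly 8 bits" figure follows from comparing the two coefficients of \(q/2^{n/2}\). A small Python sketch of that arithmetic, with hypothetical names:

```python
import math

PROVEN_COEFF = 580  # coefficient in Theorems 1 and 2
GROVER_COEFF = 2    # coefficient for the generic search attack


def tightness_gap_bits(proven: float = PROVEN_COEFF,
                       generic: float = GROVER_COEFF) -> float:
    """Number of bits separating the proven bound from the generic attack."""
    return math.log2(proven / generic)


print(tightness_gap_bits())  # log2(290), a little over 8
```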

In our proof, we also had to assume that the number of collisions in the Winternitz chains was much higher than should ever be the case, in order to compensate for the heuristic estimate of how much the chains actually compress. A better understanding of the statistics of repeated applications of independent random mappings could greatly assist in tightening this analysis and in simplifying the treatment of the Winternitz chains.

In [9], the author proved the security of LMS in a model where the compression function of a hash function is assumed to be a random oracle, rather than the entire hash function itself. This is particularly relevant when LMS is implemented with hash functions such as the SHA-2 series where the hash function does not entirely behave as a random oracle, due to the Merkle-Damgård construction. Elevating this analysis to the quantum random-oracle model would provide greater security assurance for the use of LMS with such a hash function in practice.