1 Introduction

To enable fast operations on encrypted databases, several variants of encryption have been suggested that trade security or efficiency for processing functionality on the server. Amongst the suggested constructions, order-revealing encryption (ORE) and its special case order-preserving encryption (OPE) [1, 3, 4] have seen deployments in products and usage in applied research [12, 13, 15]. ORE schemes are symmetric key encryption schemes \(\mathcal {E}\) such that, given ciphertexts \(\mathcal {E}_K(x),\mathcal {E}_K(y)\) for messages x, y, one can decide whether \(x<y\) without the decryption key. OPE schemes are the subset of ORE schemes for which the ciphertexts themselves are numbers that can be compared (so \(\mathcal {E}_K(x)< \mathcal {E}_K(y) \iff x < y\)).

A typical application of ORE is in databases, where one party encrypts numeric columns of a database table. Later, to issue a range query on the column, that party encrypts the endpoints of the range and requests all ciphertexts between them, an operation that can be processed by anyone who holds the encrypted column. In these settings, OPE is preferable because it can more easily be added to a database application, as the server can be oblivious to the fact that encryption is used at all. With more general ORE schemes, one needs to implement the specialized comparison operation in the database, which can be inconvenient (e.g. in a slow SQL implementation) or impossible, for instance when adding encryption to legacy systems.

This work studies the ciphertext length of any OPE construction achieving a security notion recently introduced by Chenette et al. [6] (we refer to this work as CLWW below). This notion is currently the best known security property for OPE that can be implemented and deployed. In particular, it results in strictly better security when combined with prior OPE via double-encryption. It seems likely that deployments using OPE (like those mentioned above) will be extended to use CLWW OPE if possible. And although recent attacks have shown that existing OPE is insecure in many contexts [8, 11], it will likely continue to be used in practice in scenarios where the attacks do not apply.

CLWW constructed ORE with their security notion that has ciphertext length \(\log _2(3) m \approx 1.58m\) bits, where m is the plaintext length, and showed how to convert their scheme to the more convenient OPE, but at the cost of increasing the ciphertext length to \(\lambda m\), where \(\lambda \) is the security parameter. This means that achieving OPE comes at the cost of increasing storage of the column by a factor typically in the range of 80 to 256, compared to the 1.58 expansion of ORE. Achieving smaller OPE ciphertexts with the same security would be highly desirable if possible, as large plaintext data sizes are often the motivating factor for outsourcing data to an untrusted server in the first place. (We note that a different, incomparable ORE security notion of [3] can be achieved with \(\approx m\) bit ciphertexts, although this fact will not be used in our work below.)

Below we give evidence that the large ciphertext size of the OPE in CLWW is inherent, by proving that any scheme meeting the information-theoretic version of their security notion must have ciphertexts of length at least

$$\begin{aligned} \lambda m - m \log m + m \log e , \end{aligned}$$

where again m is the message length, logarithms are base 2, and e is the base of the natural logarithm. This bound shows that CLWW has almost optimal ciphertext size, as it has leading term \(\lambda m\) instead of \((\lambda -\log m)m\).

In the remainder of this section we describe the prior work on ORE in more detail, and then sketch our results.

ORE security. It is immediate that an ORE scheme cannot be semantically secure against passive attacks, because one can compute information about plaintexts. But meaningful and formally-defined security targets for ORE have been suggested, starting with the work of [3]. This work defined two notions, one of which was an ideal ORE security notion that requires all plaintext information except order to be hidden. They also showed that no efficient OPE scheme (in particular, one with \(\mathrm {poly}(\lambda ,m)\)-size ciphertexts) could achieve ideal security. However, it was later shown [5] that ideal security for ORE is achievable using cryptographic multilinear pairings [9] or indistinguishability obfuscation [10]. This was apparently the first separation of OPE and ORE as primitives.

Motivated by the lack of a practical ideal construction, Boldyreva et al. [3] investigated a particular weaker notion called ROPF. It was later shown [4] that ROPF-secure ciphers allow a passive adversary to compute the most-significant half of the bits of a random message with high probability, which may be too weak for some applications. The notion was however instantiated with fast blockcipher-based constructions under standard assumptions.

The recent CLWW work [6] introduced a different notion of security for ORE and demonstrated that it is stronger than ROPF-security by certain measures. In particular, that work gave a construction of ORE that could provably hide all but a logarithmic number of bits of a random plaintext. Moreover, the construction is simple to implement and uses only a blockcipher and standard assumptions. The CLWW security notion allows an adversary, given ciphertexts \(\mathcal {E}_K(x),\mathcal {E}_K(y)\), to learn the index of the most significant bit on which x and y differ. As mentioned above, the ORE version of their construction has ciphertext size \(\approx 1.58m\) while the OPE version has ciphertext size \(\lambda m\).

Our result. For technical reasons discussed below, we consider an information theoretic version of CLWW security, which requires the same security but against unbounded adversaries. The CLWW construction achieves this notion in the random oracle model, and we show that their construction is essentially optimal in terms of ciphertext length. Thus their large overhead in converting their construction from ORE to OPE is inherent, and should OPE with lower storage overhead be required, one will have to investigate other security notions for OPE.

We also generalize our lower bound to apply to any OPE with a new security notion that we call inner-distance indistinguishability. While not necessarily interesting as a security goal on its own (one would prefer something stronger), it encapsulates a property that must be avoided in order to build OPE with O(m) size ciphertexts.

Our techniques start from first principles regarding when relations between random variables force their distributions to have large statistical distance. We sketch our proof in Sect. 4. We note that the big-jump attack of Boldyreva et al. [4] proves an exponential lower bound on ideal OPE, and bears some resemblance to our attack. But our attack treats a different and weaker security notion and obtains a fine-grained, polynomial lower bound.

Information-theoretic versus computational security. We attempted to prove our result for any computationally-CLWW-secure ORE scheme, but our techniques do not seem suited to this case. An information-theoretic bound, however, applies to any construction secure in the random-oracle model and includes the CLWW construction. Moreover, if a scheme uses a PRF as its only cryptographic component, then our lower bound applies to a version of that scheme that uses a random-oracle in place of the PRF and thus to the original as well. We are unaware of any technique for building computationally-CLWW-secure OPE that circumvents our bound, and we conjecture that a ciphertext length lower bound also holds in the computational case.

Comparison to concurrent work. Recently, Segev and Shahaf [14] extended our result to the computational setting. Their lower bound and ours are identical in terms of the attacker’s success probability, which implies that the ciphertext expansion is inherent. Concretely, Segev and Shahaf prove their lower bound by presenting a non-uniform polynomial-time adversary, whereas we prove ours by analyzing the statistical distance between ciphertext distributions, which requires an unbounded adversary. In their proof, Segev and Shahaf show that, if the lower bound on N does not hold, then there exist a value \(t \in [N]\) and an index \(1\le j \le m-1\) such that, for ciphertexts \((c_0, c_1) \in \{ (\mathcal {E}(0), \mathcal {E}(2^{j+1}-1)), (\mathcal {E}(2^j -1), \mathcal {E}(2^j)) \}\), the test \(c_1-c_0 \ge t\) distinguishes the two cases. Since the two cases have the same leakage profile, this yields a non-uniform polynomial-time adversary.

Why is it hard for a uniform adversary? Our result only applies to unbounded adversaries, and Segev and Shahaf improve it only to the non-uniform computational setting. It would be preferable to prove a tight lower bound via a uniform polynomial-time adversary. We observe that in both our result and [14], the distinguishing algorithm is a simple comparison, \(\mathbf 1 (c_1 - c_0 \ge t)\), and locating t takes super-polynomial time. One hope is to extract a more involved but still polynomial-time testing algorithm; we leave this as an open problem.

Organization. In Sect. 2 we recall definitions for ORE/OPE syntax and security and in Sect. 3 we recall the specific security notion that we study. In Sect. 4 we state our lower bound and sketch its proof, which is given in Sects. 5, 6 and 7. Finally in Sect. 8 we show how to generalize our result to an abstract security property.

2 Preliminaries

Notation and basic results. We always use \(\lambda \) to denote the security parameter. For non-negative integers \(a\le b\) we write [a, b] for the set \(\{a,a+1,\ldots ,b\}\), [n] for the set \(\{1,\ldots ,n\}\), and \([n]'\) for the set \(\{0,1,\ldots ,n\}\). We use boldface to denote vectors, e.g. \(\varvec{m}\), and we write \(\varvec{m}[i]\) for the i-th component of \(\varvec{m}\). If \(X_1,X_2\) are r.v.s, we let

$$ \varDelta (X_1,X_2) = \frac{1}{2}\sum _{k}|\Pr [X_1 =k] - \Pr [X_2 =k]| $$

denote their statistical distance. We will use the following well-known data processing lemma (cf. [7]) in our proof.

Lemma 1

Let X and Y be r.v.s, and f be any function that includes the support of X and Y in its domain. Then \(\varDelta (f(X),f(Y)) \le \varDelta (X,Y)\).
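As an illustration of the definition of \(\varDelta \) and of Lemma 1, the following Python sketch computes the statistical distance of two small distributions before and after applying a function. The distributions `X`, `Y` and the function `mod3` are toy choices made for this example only and do not come from the construction studied here.

```python
from collections import defaultdict
from fractions import Fraction


def statistical_distance(p, q):
    """Half the L1 distance between two finite distributions (dicts value -> probability)."""
    support = set(p) | set(q)
    return sum(abs(p.get(k, 0) - q.get(k, 0)) for k in support) / 2


def push_forward(p, f):
    """Distribution of f(X) when X is distributed according to p."""
    out = defaultdict(Fraction)
    for value, prob in p.items():
        out[f(value)] += prob
    return dict(out)


# Toy distributions on {0, 1, 2, 3}; the probabilities are illustrative only.
X = {0: Fraction(1, 2), 1: Fraction(1, 4), 2: Fraction(1, 4)}
Y = {1: Fraction(1, 4), 2: Fraction(1, 4), 3: Fraction(1, 2)}


def mod3(k):
    """An arbitrary post-processing function f."""
    return k % 3


dist_before = statistical_distance(X, Y)
dist_after = statistical_distance(push_forward(X, mod3), push_forward(Y, mod3))
assert dist_after <= dist_before  # Lemma 1 for this particular f
```

Exact rational arithmetic via `Fraction` is used so that the comparison is not affected by floating-point rounding.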

For a randomized algorithm \(\mathcal {A}\) we write \(y \xleftarrow{\$} \mathcal {A}(w)\) to denote running \(\mathcal {A}\) on input w and letting y be the random variable denoting its output. If \(\mathcal {A}\) is deterministic, we write \(y \leftarrow \mathcal {A}(w)\) to denote running \(\mathcal {A}\) on input w and letting y be its output.

We write \(\mathbf {1}(x < y)\) to mean 1 if \(x<y\) and 0 otherwise.

ORE and OPE. An ORE scheme \(\varPi \) is a tuple of algorithms \((\mathcal {K},\mathcal {E},\mathcal {C})\) for key generation, encryption, and comparison respectively, and always has an associated message space \(\{0,1\}^m\) and ciphertext space \(\{0,1\}^n\). The key generation algorithm \(\mathcal {K}\) is randomized, and on input \(1^\lambda \), outputs a key K. The encryption algorithm \(\mathcal {E}\) is deterministic and takes as input a key K and message \(x\in \{0,1\}^m\) and outputs a ciphertext \(c \leftarrow \mathcal {E}_K(x)\). The comparison algorithm takes as input two ciphertexts \(c_1, c_2\) generated with the same K on messages \(x_1,x_2\) and outputs a bit b.

We assume that all ORE schemes in this paper are correct, meaning that for all \(\lambda \), keys K in the support of \(\mathcal {K}(1^\lambda )\), and all \(x,y\in \{0,1\}^m\), \(\mathcal {C}(\mathcal {E}_K(x),\mathcal {E}_K(y))\) outputs \(\mathbf {1}(x<y)\). Note that this allows testing if \(x=y\) by running the comparison algorithm twice.

When an ORE scheme \(\varPi \) has a canonical comparison algorithm \(\mathcal {C}\) that directly compares its inputs as numbers in \([2^n-1]'\), we say that the scheme is an order-preserving encryption (OPE) scheme. In this case we omit the comparison algorithm and write \(\varPi = (\mathcal {K},\mathcal {E})\).

ORE security. Chenette et al. [6] gave a simulation-based definition for ORE security that used a leakage profile \(\mathcal {L}\) as a parameter, where \(\mathcal {L}\) is an efficient algorithm. We will use a weaker non-interactive indistinguishability-based version of their definition for our lower bounds (which makes our result stronger).

For an ORE scheme \(\varPi = (\mathcal {K},\mathcal {E},\mathcal {C})\), leakage profile \(\mathcal {L}\), and adversary \(\mathcal {A}\) we consider the following game:

[Figure a: the security game \(\mathrm {ORE}_{\varPi ,\mathcal {L},\mathcal {A}}(\lambda )\).]

We define the \(\mathcal {L}\)-advantage of \(\mathcal {A}\) against \(\varPi \) to be

$$\begin{aligned} \mathbf {Adv}^{{\mathrm {ore}}}_{\varPi ,\mathcal {L},\mathcal {A}}(\lambda ) = 2\Pr [\mathrm {ORE}_{\varPi ,\mathcal {L},\mathcal {A}}(\lambda ) = 1] - 1. \end{aligned}$$

We say that \(\varPi \) is \(\mathcal {L}\)-computationally secure if for all efficient \(\mathcal {A}\), \(\mathbf {Adv}^{{\mathrm {ore}}}_{\varPi ,\mathcal {L},\mathcal {A}}(\lambda )\) is a negligible function i.e. is \(o(1/\mathrm {poly}(\lambda ))\). We say that \(\varPi \) is \(\mathcal {L}\)-statistically-secure if the same condition holds for all (unbounded, wlog deterministic) adversaries \(\mathcal {A}\); more specifically, we say \(\varPi \) is \(\epsilon \)-\(\mathcal {L}\)-statistically-secure if for all unbounded adversaries, the advantage is at most \(\epsilon \) (below we typically take \(\epsilon = 2^{-\lambda }\)).

We recall, as an example, that the ideal leakage profile only leaks order. Formally, this is

$$\begin{aligned} \mathcal {L}_{\mathrm {ideal}}(m_1,\ldots ,m_q) = \{(i,j,\mathbf {1}(m_i< m_j)) \ : \ 1\le i < j \le q\}. \end{aligned}$$

3 CLWW Security and Constructions

In this section we recall and discuss the CLWW leakage profile and constructions.

CLWW leakage. CLWW considered the following leakage profile \(\mathcal {L}_{\mathrm {clww}}\). On input \(\mathbf {x}= (x_1,\ldots ,x_q)\in (\{0,1\}^m)^q\), the leakage profile is defined by

$$\begin{aligned} \mathcal {L}_{\mathrm {clww}}(x_1,\ldots ,x_q) := \{(i,j,\mathsf {ind}_{\mathsf {diff}}(x_i,x_j), \mathbf {1}(x_i< x_j)) \ : \ 1 \le i < j \le q\}, \end{aligned}$$

where \(\mathsf {ind}_{\mathsf {diff}}(x_i,x_j)\in \{1,\ldots ,m+1\}\) is the index of the left-most bit on which \(x_i\) and \(x_j\) differ, or \(m+1\) if they are equal. Compared to the ideal profile, only the \(\mathsf {ind}_{\mathsf {diff}}(x_i,x_j)\) indices are extra leakage.

The intuition for the leakage is that, when comparing two numbers, an adversary will learn the length of the longest common prefix, and also which is larger. This information combines to reveal one bit of each of the plaintexts.
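The two leakage profiles can be sketched in a few lines of Python. This is only an illustration: messages are represented as non-negative integers below \(2^m\), and the function names `ind_diff`, `leak_ideal` and `leak_clww` are our own labels for \(\mathsf {ind}_{\mathsf {diff}}\), \(\mathcal {L}_{\mathrm {ideal}}\) and \(\mathcal {L}_{\mathrm {clww}}\).

```python
from itertools import combinations


def ind_diff(x, y, m):
    """1-based index, counted from the most significant bit, of the first bit
    on which the m-bit integers x and y differ; m + 1 if x == y."""
    diff = x ^ y
    if diff == 0:
        return m + 1
    return m - diff.bit_length() + 1


def leak_ideal(msgs):
    """Ideal leakage: only the order of every pair of messages."""
    return {(i + 1, j + 1, int(msgs[i] < msgs[j]))
            for i, j in combinations(range(len(msgs)), 2)}


def leak_clww(msgs, m):
    """CLWW leakage: the order plus the index of the first differing bit."""
    return {(i + 1, j + 1, ind_diff(msgs[i], msgs[j], m), int(msgs[i] < msgs[j]))
            for i, j in combinations(range(len(msgs)), 2)}
```

For example, with \(m=4\) the messages 0101 and 0111 first differ in the third bit from the left, so `ind_diff(0b0101, 0b0111, 4)` evaluates to 3.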

The CLWW ORE and OPE constructions. Our results will not need the CLWW construction, but it provides intuition for the lower bound and we recall it now, starting with a basic ORE construction \(\varPi _\mathrm {clww{\text {-}}ore}\) and then describing an ORE variant with shorter ciphertexts, and how to build OPE \(\varPi _\mathrm {clww{\text {-}}ope}\). We recall a version that is slightly different from theirs in that it is perfectly correct.

The scheme \(\varPi _\mathrm {clww{\text {-}}ore}= (\mathcal {K}^\mathrm {ore},\mathcal {E}^\mathrm {ore},\mathcal {C}^\mathrm {ore})\) uses a PRF

$$\begin{aligned} F:\{0,1\}^\lambda \times ([m]\times \{0,1\}^m)\rightarrow (\{0,1\}^\lambda \setminus \{1^\lambda \}). \end{aligned}$$

Thus the input domain of F is \([m]\times \{0,1\}^m\), and it outputs a \(\lambda \)-bit string that is assumed to never be \(1^\lambda \) (of course we can modify any PRF so that this is true without affecting asymptotic security).

  • Key generation \(\mathcal {K}^\mathrm {ore}(1^\lambda )\) outputs a random PRF key \(K \xleftarrow{\$} \{0,1\}^\lambda \).

  • Encryption \(\mathcal {E}^\mathrm {ore}_K(x)\), on input a message \(x\in \{0,1\}^m\), computes for each \(i=1,\ldots ,m\) the value

    $$\begin{aligned} u_i = F(K, i{\,\Vert \,}x[1,\ldots ,i-1] {\,\Vert \,}0^{m-i+1}) + x[i], \end{aligned}$$
    (1)

    where the addition is done by interpreting the bitstrings as members of \(\{0,\ldots ,2^\lambda -1\}\). Encryption outputs \((u_1,\ldots ,u_m)\).

  • The comparison algorithm \(\mathcal {C}^\mathrm {ore}((u_1,\ldots ,u_m), (u'_1,\ldots ,u'_m))\) takes as input two ciphertexts. It finds the smallest i such that \(u_i \ne u'_i\) and outputs \(\mathbf {1}(u_i < u'_i)\); if no such i exists, it outputs 0.

Correctness follows by observing that the \(u_i\) will be equal until the \(u_i,u'_i\) corresponding to the first differing bit in the plaintexts. At that position, \(u_i\) and \(u'_i\) will differ by 1 (additively) and the smaller plaintext has the smaller value. CLWW proved that \(\varPi _\mathrm {clww{\text {-}}ore}\) (and the variants below) are \(\mathcal {L}_{\mathrm {clww}}\)-secure, assuming that F is a PRF. It is straightforward to derive from their proof that \(\varPi _\mathrm {clww{\text {-}}ore}\) is also statistically-secure with the same leakage profile in the random-oracle model.
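The basic construction recalled above can be sketched as follows. This is a minimal illustration and not the CLWW reference implementation: the PRF F is replaced by HMAC-SHA256 reduced modulo \(2^\lambda -1\) (an assumption made here so that the all-ones output never occurs, at the price of a small bias), messages are integers below \(2^m\), and the names `keygen`, `prf`, `encrypt` and `compare` are hypothetical.

```python
import hashlib
import hmac
import secrets

LAMBDA = 128  # security parameter in bits (illustrative choice)


def keygen():
    """Sample a random LAMBDA-bit PRF key."""
    return secrets.token_bytes(LAMBDA // 8)


def prf(key, i, prefix_bits, m):
    """Stand-in for F(K, i || x[1..i-1] || 0^(m-i+1)).

    The output lies in {0, ..., 2^LAMBDA - 2}, so it is never the all-ones string.
    """
    padded = prefix_bits + "0" * (m - len(prefix_bits))
    digest = hmac.new(key, f"{i}|{padded}".encode(), hashlib.sha256).digest()
    return int.from_bytes(digest, "big") % (2**LAMBDA - 1)


def encrypt(key, x, m):
    """Return (u_1, ..., u_m) with u_i = F(K, i || prefix) + x[i], as in Eq. (1)."""
    bits = format(x, f"0{m}b")
    return tuple(prf(key, i, bits[:i - 1], m) + int(bits[i - 1])
                 for i in range(1, m + 1))


def compare(c1, c2):
    """Output 1 if the first differing component of c1 is smaller, else 0."""
    for u, v in zip(c1, c2):
        if u != v:
            return int(u < v)
    return 0  # identical ciphertexts encrypt equal messages
```

Because the PRF input at position i depends only on the first \(i-1\) message bits, two ciphertexts agree up to the first differing plaintext bit and differ by exactly 1 there, which is what `compare` relies on.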

Conversion to OPE. Chenette et al. showed how to convert this construction to an OPE scheme \(\varPi _\mathrm {clww{\text {-}}ope}\) by simply concatenating the members of a ciphertext to form a bitstring in \(\{0,1\}^{\lambda m}\) that is interpreted as a number for comparison. This scheme is perfectly correct because of our assumption that F never outputs the all-ones string, and thus the addition in (1) will never wrap modulo \(2^\lambda \).
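The concatenation step can be illustrated by the following Python sketch. It assumes a ciphertext is given as a tuple of non-negative integers, each fitting in `lam` bits; the function name `ore_to_ope` is our own label for this example.

```python
def ore_to_ope(components, lam):
    """Concatenate the lam-bit components (u_1, ..., u_m) into one integer in [0, 2^(lam*m))."""
    value = 0
    for u in components:
        if not 0 <= u < 2**lam:
            raise ValueError("component does not fit in lam bits")
        value = (value << lam) | u  # append u as the next lam-bit block
    return value


# With fixed-width blocks, numeric order of the packed values matches
# lexicographic order of the component tuples.
a = ore_to_ope((1, 255), 8)
b = ore_to_ope((2, 0), 8)
assert a < b
```

The range check mirrors the role of the assumption that F never outputs \(1^\lambda \): if a component overflowed \(\lambda \) bits, the blocks would overlap and the order could be destroyed.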

Compressing ORE ciphertexts. Chenette et al. showed that one can modify \(\varPi _\mathrm {clww{\text {-}}ore}\) to a new ORE scheme which has shorter ciphertexts. More precisely, the new scheme uses a PRF \(F'\) with range only \(\{0,1,2\}\) instead of F, where

$$\begin{aligned} F':\{0,1\}^\lambda \times ([m]\times \{0,1\}^m)\rightarrow \{0,1,2\}. \end{aligned}$$

Now encryption uses \(F'\), and for \(i=1,\ldots ,m\) computes

$$\begin{aligned} u_i = F'(K, i{\,\Vert \,}x[1,\ldots ,i-1] {\,\Vert \,}0^{m-i+1}) + x[i] \mod 3. \end{aligned}$$
(2)

It outputs the vector \((u_1,\ldots ,u_m)\in \{0,1,2\}^m\).

Comparison now takes as input \((u_1,\ldots ,u_m)\) and \((u'_1,\ldots ,u'_m)\). As before, it finds the first i such that \(u_i\ne u'_i\). But now it outputs 1 if \(u'_i = u_i +1 \mod 3\), and otherwise it outputs 0.

A ciphertext for an m-bit input is now a vector in \(\{0,1,2\}^m\), which can be represented using \(\log _2(3)m + O(1) \approx 1.58m\) bits.
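The compressed variant can be sketched in the same style. As before this is only an illustration under stated assumptions: \(F'\) is replaced by HMAC-SHA256 reduced modulo 3 (which introduces a tiny bias), messages are integers below \(2^m\), and the names `prf3`, `encrypt3`, `compare3` and `packed_length_bits` are hypothetical.

```python
import hashlib
import hmac
import math
import secrets


def prf3(key, i, prefix_bits, m):
    """Stand-in for F'(K, i || x[1..i-1] || 0^(m-i+1)) with range {0, 1, 2}."""
    padded = prefix_bits + "0" * (m - len(prefix_bits))
    digest = hmac.new(key, f"{i}|{padded}".encode(), hashlib.sha256).digest()
    return int.from_bytes(digest, "big") % 3


def encrypt3(key, x, m):
    """Return (u_1, ..., u_m) with u_i = F'(K, i || prefix) + x[i] mod 3, as in Eq. (2)."""
    bits = format(x, f"0{m}b")
    return tuple((prf3(key, i, bits[:i - 1], m) + int(bits[i - 1])) % 3
                 for i in range(1, m + 1))


def compare3(c1, c2):
    """Output 1 iff, at the first differing position, u' = u + 1 mod 3."""
    for u, v in zip(c1, c2):
        if u != v:
            return int(v == (u + 1) % 3)
    return 0


def packed_length_bits(m):
    """Number of bits needed to store a vector in {0,1,2}^m as one base-3 number."""
    return math.ceil(m * math.log2(3))


key = secrets.token_bytes(16)
assert compare3(encrypt3(key, 3, 8), encrypt3(key, 200, 8)) == 1
```

Note that the components can no longer be compared as numbers (the addition wraps modulo 3), which is why this variant is an ORE scheme but not an OPE scheme.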

4 Lower Bound Statement and Proof Sketch

We can now state our lower bound formally.

Theorem 2

Suppose \(\varPi = (\mathcal {K},\mathcal {E},\mathcal {C})\) is an order-preserving encryption scheme with associated message space \(\{0,1\}^m\) and ciphertext space \(\{0,1\}^n\), and that \(\varPi \) is \(2^{-\lambda }\)-\(\mathcal {L}_{\mathrm {clww}}\)-statistically-secure. Then we have

$$\begin{aligned} n \ge \lambda m - m\log m + m \log e \end{aligned}$$

In any practical OPE scenario we are aware of, we have \(\log m - \log e < \lambda \) and thus our bound is nontrivial. For example, for a 40-byte message space (\(m = 320\)), \(\log m - \log e= \log (320/e) < 7\), while in real-world deployments the security parameter is always set to 80 or larger.
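To make the comparison concrete, the following Python sketch evaluates the right-hand side of Theorem 2 next to the \(\lambda m\) ciphertext length of the CLWW OPE. The parameter choice \(\lambda = 80\), \(m = 320\) is taken from the example above; the function names are our own.

```python
import math


def ope_lower_bound_bits(lam, m):
    """Right-hand side of Theorem 2: lam*m - m*log2(m) + m*log2(e)."""
    return lam * m - m * math.log2(m) + m * math.log2(math.e)


def clww_ope_bits(lam, m):
    """Ciphertext length of the CLWW OPE construction."""
    return lam * m


lam, m = 80, 320
bound = ope_lower_bound_bits(lam, m)
upper = clww_ope_bits(lam, m)
print(f"lower bound: {bound:.0f} bits, CLWW OPE: {upper} bits, ratio: {bound / upper:.3f}")
```

For these parameters the bound is roughly 23,400 bits against 25,600 bits for the construction, so the gap is below ten percent.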

Notation for the proof. To explain why this theorem is true we start with a change of notation that is more convenient for the underlying statistical problem. We will freely treat a string \(i\in \{0,1\}^m\) as a member of \([2^m-1]' = \{0,\ldots ,2^m-1\}\) when convenient (and similarly for strings in \(\{0,1\}^n\)). For each \(i\in \{0,1\}^m\) we define a random variable \(X_i\) by \(X_i = \mathcal {E}_K(i)\), where \(K \xleftarrow{\$} \mathcal {K}(1^\lambda )\). These random variables are dependent, and perfect correctness implies that \(X_0< X_1< \cdots < X_{2^m-1}\) with probability one (here we are treating the \(X_i\) as numbers).

Now we consider what the \(\epsilon \)-\(\mathcal {L}_{\mathrm {clww}}\)-statistical security implies about our r.v.s \(X_0,\ldots ,X_{2^m-1}\). For every possible pair of vectors of messages \(\mathbf {m}_0,\mathbf {m}_1\) that does not automatically lose the game because of the leakage requirement, we get a condition about the statistical distance of the distributions of two tuples of random variables. For instance, if the adversary requests singleton vectors \(\mathbf {m}_0 = i\) and \(\mathbf {m}_1 = j\in \{0,1\}^m\) then the leakage \(\mathcal {L}_{\mathrm {clww}}(i) = \mathcal {L}_{\mathrm {clww}}(j) = \emptyset \), so we must have that

$$\begin{aligned} \varDelta (X_i,X_j) \le \varepsilon \end{aligned}$$

for every i, j. More generally, for any two vectors \(\mathbf {i}= (i_1,\ldots ,i_q)\) and \(\mathbf {j}= (j_1,\ldots ,j_q)\) in \((\{0,1\}^m)^q\) with \(\mathcal {L}_{\mathrm {clww}}(\mathbf {i}) = \mathcal {L}_{\mathrm {clww}}(\mathbf {j})\), we must have

$$\begin{aligned} \varDelta ((X_{i_1},\ldots ,X_{i_q}),(X_{j_1},\ldots ,X_{j_q})) \le \epsilon . \end{aligned}$$

Thus we need to understand which \(\mathbf {i},\mathbf {j}\) satisfy \(\mathcal {L}_{\mathrm {clww}}(\mathbf {i}) = \mathcal {L}_{\mathrm {clww}}(\mathbf {j})\). Fortunately, our proof will only require inputs of a particular structure. We observe that the following qualify for \(t=0,\ldots ,m-1\):

$$\begin{aligned} \mathbf {i}= (0,2^{t+1}-1) \quad \text {and} \quad \mathbf {j}= (2^t-1,2^t). \end{aligned}$$

In binary, \(\mathbf {i}\) is \((0^m, 0^{m-t-1}1^{t+1})\) and \(\mathbf {j}\) is \((0^{m-t}1^{t},0^{m-t-1}10^{t})\). In both cases, the most significant differing bit is in the \((t+1)\)-st least significant position (and the messages are in the same order), so the leakage is the same.
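The claim that these two pairs have identical leakage is easy to check mechanically. The Python sketch below does so for every \(t\) and a small illustrative message length; the helper names `ind_diff`, `pair_leakage` and `extreme_pairs` are our own.

```python
def ind_diff(x, y, m):
    """1-based index from the most significant bit of the first differing bit; m + 1 if equal."""
    diff = x ^ y
    return m + 1 if diff == 0 else m - diff.bit_length() + 1


def pair_leakage(pair, m):
    """CLWW leakage of a two-message vector: (index of first differing bit, order bit)."""
    x, y = pair
    return (ind_diff(x, y, m), int(x < y))


def extreme_pairs(t):
    """The 'distant' pair i and the 'close' pair j used in the proof, for a given t."""
    return (0, 2**(t + 1) - 1), (2**t - 1, 2**t)


m = 8  # illustrative message length
for t in range(m):
    i_pair, j_pair = extreme_pairs(t)
    # Both pairs first differ at index m - t from the left, with the smaller message first.
    assert pair_leakage(i_pair, m) == pair_leakage(j_pair, m) == (m - t, 1)
```

Index \(m-t\) from the left is the same position as the \((t+1)\)-st least significant bit.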

But why should this choice be useful? It represents the most extreme cases of two “distant” plaintexts and two “close” plaintexts that must appear indistinguishable. At a very high level, the scheme must “waste” a lot of its ciphertext space in order to make pairs like this appear indistinguishable. This is because the \(\mathbf {i}\) side must have ciphertexts that are far apart (by roughly \(2^{t+1}\)) simply because correctness forces many ciphertexts to be between \(X_0\) and \(X_{2^{t+1}-1}\), namely \(X_1,X_2,\ldots ,X_{2^{t+1}-2}\). In order to appear indistinguishable, \(X_{2^t-1}\) and \(X_{2^t}\) must also be far apart, with no other ciphertexts between them (again by correctness). Moreover, as t grows we get a nested sequence of pairs, where the space wasted by the previous pair forces the next to waste even more.

Fig. 1. Two pairs of r.v.s that are required to be indistinguishable by the security definition. The top arc represents the gap \(G_1\) and the bottom arc represents the gap \(G_2\).

Our proof will argue that this wasted space grows to the quoted bound. We consider the nested sequence of these tuples above, and then proceed by induction to show that a large ciphertext-space is needed for security. The key step in our induction is that, since the tuples \((X_0,X_{2^{t+1}-1})\) and \((X_{2^t-1},X_{2^t})\) must have statistical distance at most \(\epsilon \), then their gaps

$$\begin{aligned} G_1 = X_{2^{t+1}-1} - X_0 \quad \text {and} \quad G_2 = X_{2^{t}} - X_{2^t-1} \end{aligned}$$

must also satisfy \(\varDelta (G_1,G_2) \le \varepsilon \) by the data processing inequality. But the gap measured by \(G_2\) is a subset of the gap measured by \(G_1\), so \(G_2 < G_1\). In fact, as we show via induction on t, \(G_2\) must often be much less than \(G_1\) (since \(G_1\) contains the gap between \(X_0\) and \(X_{2^t-1}\), which is the previous step of the induction). Using this fact, we apply the following lemma that is proved in Sect. 6 (Fig. 1).

Lemma 3

For any two random variables \(X \ge Y \in [N-1]'\), and distinct positive integers \(d_1,\ldots , d_k\) such that \(\text{ Pr }[X = Y +d_i] = p_i\), we have

$$\begin{aligned} \varDelta (X,Y) \ge \frac{\sum _{i=1}^k p_i \cdot d_i }{ N-1 }. \end{aligned}$$

Intuitively, this lemma says that if one of the random variables is often much bigger than the other, then they must have large statistical distance.
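As a sanity check of the statement (not a proof), the following Python sketch evaluates both sides of Lemma 3 on a small hand-picked joint distribution and on randomly generated ones. The joint distributions, the value of N and all function names are assumptions made for this example.

```python
import random
from collections import defaultdict
from fractions import Fraction


def marginals(joint):
    """Marginal distributions of X and Y from a joint distribution {(x, y): prob}."""
    px, py = defaultdict(Fraction), defaultdict(Fraction)
    for (x, y), prob in joint.items():
        px[x] += prob
        py[y] += prob
    return px, py


def statistical_distance(p, q):
    return sum(abs(p.get(k, 0) - q.get(k, 0)) for k in set(p) | set(q)) / 2


def lemma3_bound(joint, N):
    """(sum over d of Pr[X = Y + d] * d) / (N - 1); the sum equals E[X - Y] when X >= Y."""
    return sum(prob * (x - y) for (x, y), prob in joint.items()) / (N - 1)


def random_joint(N, pairs, rng):
    """A random joint distribution supported on pairs (x, y) with N > x >= y >= 0."""
    weights = {}
    for _ in range(pairs):
        y = rng.randrange(N)
        x = rng.randrange(y, N)
        weights[(x, y)] = weights.get((x, y), 0) + rng.randint(1, 10)
    total = sum(weights.values())
    return {pair: Fraction(w, total) for pair, w in weights.items()}


# Hand-picked example with N = 8: X exceeds Y by 7 or by 1 with probability 1/4 each.
example = {(7, 0): Fraction(1, 4), (3, 2): Fraction(1, 4), (5, 5): Fraction(1, 2)}
px, py = marginals(example)
assert statistical_distance(px, py) >= lemma3_bound(example, 8)

rng = random.Random(0)
for _ in range(100):
    joint = random_joint(16, 6, rng)
    px, py = marginals(joint)
    assert statistical_distance(px, py) >= lemma3_bound(joint, 16)
```

In the hand-picked example the statistical distance is 1/2 while the bound evaluates to 2/7, so the inequality holds with some slack.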

Contrast with big jump. The big jump attack of [4] gave a ciphertext-size lower bound for any ideal OPE. With ideal security, any two pairs of random variables \(X_{i_1} < X_{i_2}\) and \(X_{j_1} < X_{j_2}\) must be indistinguishable, which gives the attack more flexibility and results in an exponential bound (without resorting to recursion). In contrast, our bound works with a particular nested set of m pairs, with each step using a pair to increase the bound by roughly \(\lambda \) bits.

5 Proof of Theorem 2

We start with an additional technical lemma (proved in Sect. 7), and then give the proof.

Lemma 4

Let \(X > Y \in [N-1]'\) be random variables such that \(\varDelta (X,Y) \le \delta \). Let \(i\ge 1\) and assume that for all \(q \in [0,1]\), \(\Pr [X> Y + \frac{(1-q)^i}{ \delta ^i \cdot i! }] \ge q \). Then for all \(q\in [0,1]\) we have

$$\begin{aligned} \Pr [X > \frac{(1-q)^{i+1}}{ \delta ^{i+1} (i+1)! } ] \ge q. \end{aligned}$$

This lemma says that if X is often much larger than Y, but also has small statistical distance, then the support of X must include some very large elements, and in fact X concentrates a significant portion of its mass on those large elements. The proof of this lemma (and the proof of the theorem) depends on Lemma 3 from above. We remark that it is crucial that we have the same probability q in the lemma assumption and conclusion, and achieving this requires a delicate argument. A weaker conclusion, where q changes, is more easily achieved using a Markov-type argument (and indeed earlier versions of this paper did exactly this, resulting in a weaker bound).

5.1 Proof

Let \(\varPi =(\mathcal {K},\mathcal {E})\) be an OPE scheme with associated message space \(\{0,1\}^m\) and ciphertext space \(\{0,1\}^n\), and assume \(\varPi \) is \(2^{-\lambda }\)-\(\mathcal {L}_{\mathrm {clww}}\)-statistically-secure.

Below, for \(i\in [2^m-1]'\), we let \(X_i = \mathcal {E}_K(i)\) where \(K \xleftarrow{\$} \mathcal {K}(1^\lambda )\), as in the proof sketch. That is, the \(X_i\) are dependent random variables that represent the encryption of message i under a random key. Note that \(X_0< X_1< \cdots < X_{2^m-1}\).

We will prove the theorem using the following lemma. Here, we let \(\varepsilon = 2^{-\lambda }\).

Lemma 5

For \(i\in [2^m-1]'\), let \(X_i\) be defined as above. Then for \(1\le j \le m\) and \(q \in [0,1]\),

$$\begin{aligned} \Pr [X_{2^j-1} - X_0 \ge \frac{(1-q)^{j-1}}{ \varepsilon ^{j-1} \cdot (j-1)! } ] \ge q \end{aligned}$$

Proof

(of Lemma 5). The proof is by induction on j.

Case \(j=1\). This case reduces to \(\Pr [X_1 - X_0 \ge 1] = 1\), which is true by the correctness of the scheme.

Case \(j \implies j+1\). We need to show that for any \(q \in [0,1]\)

$$\begin{aligned} \Pr [X_{2^{j+1}-1} - X_0 \ge \frac{(1-q)^{j}}{ \varepsilon ^{j} \cdot (j)! } ] \ge q. \end{aligned}$$

By the correctness of the scheme, we have that

$$\begin{aligned} X_{2^{j+1}-1} - X_0 \ge (X_{2^j} - X_{2^j-1}) + (X_{2^j-1} - X_0) +1 \end{aligned}$$
(3)

Now define “gap” random variables \(G_1 = X_{2^{j+1}-1} - X_0\) and \(G_2 = (X_{2^j} - X_{2^j-1})\). By induction we know that for any \(q \in [0,1]\)

$$\begin{aligned} \Pr [X_{2^j-1} - X_0 \ge \frac{(1-q)^{j-1}}{ \varepsilon ^{j-1} \cdot (j-1)! } ] \ge q. \end{aligned}$$

Plugging this, and the definitions of \(G_1,G_2\) into (3), we have

$$\begin{aligned} \Pr [G_1 > G_2 + \frac{(1-q)^{j-1}}{ \varepsilon ^{j-1} \cdot (j-1)! } ] \ge q. \end{aligned}$$

Moreover, we know by the \(\varepsilon \)-\(\mathcal {L}_{\mathrm {clww}}\)-statistical security of \(\varPi \) and Lemma 1 that \(\varDelta (G_1,G_2) \le \varepsilon \).

We now want to apply Lemma 4 to \(G_1\) and \(G_2\), to show that \(G_1\) must be large and then conclude the induction. In the lemma, we set \(X = G_1, Y = G_2, i=j-1\), and \(\delta = \varepsilon \). The lemma gives

$$\begin{aligned} \Pr [ G_1 > \frac{(1-q)^{j}}{ \varepsilon ^{j} \cdot (j)! } ] \ge q, \end{aligned}$$

obtaining the induction step.

We can now complete the proof of Theorem 2. The above lemma with \(j=m\) tells us that for any \(q \in [0,1]\)

$$\begin{aligned} \Pr [X_{2^m-1} > X_0 + \frac{(1-q)^{m-1}}{ \varepsilon ^{m-1} \cdot (m-1)! } ] \ge q, \end{aligned}$$

and thus for any \(j \le D = 1/(\varepsilon ^{m-1} (m-1)!)\), taking \(q = 1- ((m-1)! \cdot j)^{\frac{1}{m-1}} \varepsilon \),

$$\begin{aligned} \Pr [ X_{2^m-1} > X_0 + j ] \ge 1- ((m-1)! \cdot j)^{\frac{1}{m-1}} \varepsilon \end{aligned}$$

and

$$\begin{aligned} \sum _{\ell =1}^j \Pr [X_{2^m-1} = X_0 + \ell ] \le ((m-1)! \cdot j)^{\frac{1}{m-1}} \varepsilon . \end{aligned}$$

Moreover, we claim \(D \le N-1\), where \(N = 2^n\) is the size of the ciphertext space. If not, then there exists \(q >0\) such that

$$\begin{aligned} N-1 = \frac{(1-q)^{m-1}}{ \varepsilon ^{m-1} \cdot (m-1)! } \end{aligned}$$

which gives

$$\begin{aligned} \Pr [ X_{2^m-1}> X_0 + N-1 ] \ge q > 0 \end{aligned}$$

which contradicts \(X_i \in [N-1]'\).

Now we denote \(p_\ell = \Pr [ X_{2^m-1} = X_0 + \ell ]\), and according to Lemma 3, we get that

$$\begin{aligned} \varepsilon \ge \varDelta (X_{2^m-1},X_0) \ge \frac{\sum _{\ell =1}^{N-1} p_{\ell } \cdot \ell }{N-1} \end{aligned}$$
(4)

and

$$\begin{aligned} \sum _{\ell =1}^{N-1} p_{\ell } \cdot \ell&= (p_1 + \cdots + p_{N-1}) + (p_2 + \cdots + p_{N-1}) + \cdots + p_{N-1}\\&\ge 1 + (1-p_1) + (1-p_1-p_2) + \cdots + (1- p_1 -\cdots - p_{D-1})\\&\ge 1+ \sum _{\ell =1}^{D-1} (1 - ((m-1)! \ell )^{\frac{1}{m-1}}\cdot \varepsilon ) \\&= D - (m-1)!^{\frac{1}{m-1}}\cdot \varepsilon \sum _{\ell =1}^{D-1} \ell ^{\frac{1}{m-1}} \\&\ge D - (m-1)!^{\frac{1}{m-1}}\cdot \varepsilon \cdot \int _0^D \! x^{\frac{1}{m-1}} dx \\&= \frac{1}{\varepsilon ^{m-1}(m-1)!} \cdot \frac{1}{m} = \frac{1}{\varepsilon ^{m-1} m!}. \end{aligned}$$
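For completeness, the evaluation of the integral in the last step can be spelled out as follows. This is a worked computation assuming \(m \ge 2\) and using \(D = 1/(\varepsilon ^{m-1}(m-1)!)\), so that \(D^{\frac{1}{m-1}} = 1/(\varepsilon \cdot (m-1)!^{\frac{1}{m-1}})\):

$$\begin{aligned} (m-1)!^{\frac{1}{m-1}}\cdot \varepsilon \cdot \int _0^D \! x^{\frac{1}{m-1}} dx&= (m-1)!^{\frac{1}{m-1}}\cdot \varepsilon \cdot \frac{m-1}{m} \cdot D \cdot D^{\frac{1}{m-1}} = \frac{m-1}{m}\cdot D, \\ D - \frac{m-1}{m}\cdot D&= \frac{D}{m} = \frac{1}{\varepsilon ^{m-1} m!}. \end{aligned}$$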

Returning to (4), we have

$$\begin{aligned} N-1 \ge \frac{1}{\varepsilon ^{m} m!}. \end{aligned}$$

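By setting \(\varepsilon = 2^{-\lambda }\) and using \(N = 2^n\), we get

$$\begin{aligned} n > \log (N-1) \ge \lambda m - \log (m!) = \lambda m - m\log m + m\log e - O(\log m), \end{aligned}$$

where the last step uses Stirling's approximation \(\log (m!) = m\log m - m\log e + O(\log m)\).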

   \(\square \)

6 Proof of Lemma 3

We recall the lemma.

Lemma 3

For any two random variables \(X \ge Y \in [N-1]'\), and distinct positive integers \(d_1,\ldots , d_k\) such that \(\text{ Pr }[X = Y +d_i] = p_i\), we have

$$\begin{aligned} \varDelta (X,Y) \ge \frac{\sum _{i=1}^k p_i \cdot d_i }{ N-1 }. \end{aligned}$$

Proof

We will show that one of the distinguishers \(\mathcal {D}_i\), \(i \in [N-1]\), has the needed advantage, where \(\mathcal {D}_i\) is defined as follows: Given input \(T \in [N-1]'\), \(\mathcal {D}_i\) outputs 1 if and only if \(T \ge i\).

The advantage of \(\mathcal {D}_i\) is \(\delta _i = \Pr [ X\ge i ] - \Pr [ Y \ge i]\). We have that

$$\begin{aligned} \sum _{i=1}^{N-1} \delta _i&= \sum _{i=1}^{N-1} \Pr [X\ge i] - \sum _{i=1}^{N-1} \Pr [Y\ge i] \\&= \sum _{i=0}^{N-1} \Pr [X\ge i] - \sum _{i=0}^{N-1} \Pr [Y\ge i] = E(X - Y) \ge \sum _{i=1}^k p_i d_i. \end{aligned}$$

Thus some \(\delta _i\) must be at least this sum divided by \(N-1\).   \(\square \)
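The averaging argument can be traced on a small example. The Python sketch below computes the advantages \(\delta _i\) of the threshold distinguishers \(\mathcal {D}_i\) for two hand-picked marginal distributions (they are the marginals of a joint distribution with \(X \ge Y\), \(N = 8\) and \(E(X-Y) = 2\)); the data and the function names are illustrative assumptions.

```python
from fractions import Fraction


def tail(p, i):
    """Pr[T >= i] for a distribution p given as {value: probability}."""
    return sum(prob for value, prob in p.items() if value >= i)


def distinguisher_advantages(px, py, N):
    """delta_i = Pr[X >= i] - Pr[Y >= i] for the thresholds i = 1, ..., N - 1."""
    return [tail(px, i) - tail(py, i) for i in range(1, N)]


N = 8
# Marginals of the joint distribution {(7,0): 1/4, (3,2): 1/4, (5,5): 1/2}.
px = {7: Fraction(1, 4), 3: Fraction(1, 4), 5: Fraction(1, 2)}
py = {0: Fraction(1, 4), 2: Fraction(1, 4), 5: Fraction(1, 2)}

deltas = distinguisher_advantages(px, py, N)
expected_gap = Fraction(1, 4) * 7 + Fraction(1, 4) * 1  # E[X - Y]

assert sum(deltas) == expected_gap            # the advantages sum to E[X - Y]
assert max(deltas) >= expected_gap / (N - 1)  # so the best one is at least the average
```

Here the best threshold is \(i = 3\), whose advantage 1/2 exceeds the average 2/7 guaranteed by the lemma.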

7 Proof of Lemma 4

We first recall the lemma.

Lemma 4

Let \(X > Y \in [N-1]'\) be random variables such that \(\varDelta (X,Y) \le \delta \). Let \(i\ge 1\) and assume that for all \(q \in [0,1]\), \(\Pr [X> Y + \frac{(1-q)^i}{ \delta ^i \cdot i! }] \ge q \). Then for all \(q\in [0,1]\) we have

$$\begin{aligned} \Pr [X > \frac{(1-q)^{i+1}}{ \delta ^{i+1} (i+1)! } ] \ge q. \end{aligned}$$

Proof

Suppose for contradiction that there exists \(q^*\in [0,1]\) such that

$$\begin{aligned} \hat{q}:= \Pr [X > t] < q^*, \end{aligned}$$

where \(t = \frac{(1-q^*)^{i+1}}{\delta ^{i+1} (i+1)!}\).

We will show that \(\varDelta (X,Y) > \delta \), violating the assumption in the lemma. We will prove this by showing the following “truncated” r.v.s WZ satisfy \(\varDelta (X,Y) \ge \varDelta (W,Z) > \delta \), where WZ are defined via the joint distribution

$$\begin{aligned} \Pr [W = a, Z = b] = {\left\{ \begin{array}{ll} \Pr [X=a,Y=b] &{} \text {if} \ (a,b)\in [t]^2 \setminus (0,0), \\ \hat{q}&{} \text {if} \ (a, b) = (0,0)\\ 0 &{} \text{ otherwise } \end{array}\right. }. \end{aligned}$$

Using the definition of (W, Z), we first show \(\varDelta (X,Y) \ge \varDelta (W,Z)\). For simplicity, we denote

$$\begin{aligned} p_{i,j} = \Pr [X= i, Y=j ] ; \ p_j = \sum _{k=0}^t p_{k,j} ;\ p_j^* = \sum _{k= t+1}^{N-1} p_{k,j}; \ \forall i, j \in [t] \end{aligned}$$

and it is easy to check that for \(j \in [t]\): (1) \(\Pr [ X=j ] = \Pr [ W=j ] \); (2) \(\Pr [ Z = j ] = p_j\); (3) \( \Pr [ Y=j ] = p_j + p_j^*\); and, summing over j, (4) \(\sum _{j=0}^t p_j^* = \sum _{k=t+1}^{N-1} (\Pr [X=k]- \Pr [Y=k])\). Hence:

$$\begin{aligned} 2\varDelta (X,Y)&= \sum _{j=0}^{N-1} | \Pr [X=j] -\Pr [Y=j] | \\&= \sum _{j=0}^t | \Pr [X=j] -\Pr [Y=j] | + \sum _{j=t+1}^{N-1} | \Pr [X=j ]- \Pr [Y=j] | \\&\ge \sum _{j=0}^t | \Pr [X=j] -\Pr [Y=j] | + \sum _{j=t+1}^{N-1} (\Pr [X=j ]- \Pr [Y=j] )\\&= \sum _{j=0}^t | \Pr [X=j] -\Pr [Y=j] | + \sum _{j=0}^{t} p_j^*\\&= \sum _{j=0}^t | \Pr [W=j] -\Pr [Z=j] - p_j^* | + \sum _{j=0}^{t} p_j^*\\&\ge \sum _{j=0}^t | \Pr [W= j] - \Pr [Z=j] | = 2\varDelta (W,Z) \end{aligned}$$

In the following, it suffices to show that \(\varDelta (W,Z) > \delta \). We denote \(d_j = \Pr [W = Z+ j]\). Applying Lemma 3,

$$\begin{aligned} \varDelta (W,Z) \ge \frac{\sum _{\ell =1}^{t} d_{\ell } \cdot \ell }{t}. \end{aligned}$$

We now show that \(\sum _{\ell =1}^{t} d_{\ell } \cdot \ell > \delta t\), completing the proof. We use the following technical claim, which we establish afterwards:

Claim

In the notation of the proof, we have the following:

  1. \(\sum _{\ell =1}^{t} d_\ell = 1-\hat{q}\),

  2. For each j, \(\sum _{\ell =1}^{j} d_\ell \le (i! \cdot j)^{1/i}\delta \),

  3. \(t \ge \hat{t}\), where \(\hat{t}= (1-\hat{q})^i/(\delta ^i i!)\).

Using the claim, we have

$$\begin{aligned} \sum _{\ell =1}^{t} d_{\ell } \cdot \ell&\ge \sum _{\ell =1}^{\hat{t}} d_{\ell } \cdot \ell = (d_1 + \ldots + d_{\hat{t}}) + ( d_2 + \ldots + d_{\hat{t}}) + \ldots + (d_{\hat{t}}) \\&\ge (1-\hat{q}) + ((1-\hat{q})- d_1) + ((1-\hat{q})- d_1- d_2) + \ldots + ((1-\hat{q})- d_1-\ldots - d_{\hat{t}-1})\\&\ge (1-\hat{q}) \hat{t}- \sum _{\ell =1}^{\hat{t}-1} (\ell i!)^{1/i} \delta \\&\ge (1-\hat{q}) \hat{t}- (i!)^{1/i} \delta \int _0^{\hat{t}} x^{1/i} dx\\&= (1-\hat{q})\hat{t}- (i!)^{1/i} \delta \cdot \frac{i}{i+1} \hat{t}^{\frac{i+1}{i}} = \frac{(1-\hat{q})^{i+1}}{\delta ^i(i+1)!} > \delta t. \end{aligned}$$
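To spell out the last two steps of the chain: by the definition \(\hat{t}= (1-\hat{q})^i/(\delta ^i i!)\) we have \((i!)^{1/i}\,\delta \,\hat{t}^{1/i} = 1-\hat{q}\), so the integral term \((i!)^{1/i} \delta \int _0^{\hat{t}} x^{1/i} dx = (i!)^{1/i} \delta \cdot \frac{i}{i+1} \hat{t}^{\frac{i+1}{i}}\) equals \(\frac{i}{i+1}(1-\hat{q})\hat{t}\), and therefore

$$\begin{aligned} (1-\hat{q})\hat{t}- \frac{i}{i+1}(1-\hat{q})\hat{t}= \frac{(1-\hat{q})\hat{t}}{i+1} = \frac{(1-\hat{q})^{i+1}}{\delta ^i (i+1)!}. \end{aligned}$$

The final strict inequality uses \(\hat{q}< q^*\), which gives \((1-\hat{q})^{i+1} > (1-q^*)^{i+1}\), together with \(\delta t = (1-q^*)^{i+1}/(\delta ^i (i+1)!)\).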

We now prove the claim. The first part follows easily from the definition of W, Z. For the second part, we have

$$\begin{aligned} \sum _{\ell =1}^j d_{\ell } \le \sum _{\ell =1}^j \Pr [X = Y + \ell ] = 1 -\Pr [X > Y + j] \le (i!j)^{1/i}\delta , \end{aligned}$$

where the last inequality follows since \(\Pr [X>Y + (1-q)^i/(\delta ^i i!)] \ge q\) holds for all \(q\in [0,1]\), and in particular for \(q=1-(i!j)^{1/i}\delta \), for which the threshold \((1-q)^i/(\delta ^i i!)\) equals j.

For the third part of the claim, suppose for contradiction that \(t < \hat{t}\). Then

$$\begin{aligned} \Pr [X> t] \ge \Pr [X> Y+ t] \ge 1 - (i!t)^{1/i}\delta > 1 - (i!\hat{t})^{1/i}\delta = \hat{q}. \end{aligned}$$

(The second inequality is another application of the condition in the lemma, similar to the proof of the second part.) But this contradicts the definition \(\hat{q}= \Pr [X > t]\) and proves the third part of the claim.    \(\square \)

8 Extensions of the Lower Bound

Our lower bound applies to the specific definition achieved by Chenette et al., and it is possible to circumvent the bound by targeting a different, but hopefully satisfactory, notion of security. In this section we identify an abstract property, which we term inner-distance-indistinguishability, for which a similar lower bound applies. Thus, to avoid the bound for OPE with another definition, one must avoid this property, and the authors are not aware of an approach for doing so.

We also show how to apply our proof technique to give an essentially-tight lower bound on the ciphertext length of the “base-d” OPE variants suggested by Chenette et al., which achieve a weakened version of security with shorter ciphertexts.

Inner-distance-indistinguishability. The following property seems mostly useful as a tool for understanding and generalizing the lower bound, and not as a stand-alone target for OPE security in practice.

Definition 6

Let \(\varPi = (\mathcal {K},\mathcal {E},\mathcal {C})\) be an OPE scheme with associated message space M, let \(d\ge 1\) be an integer, and let \(\varepsilon >0\). We say that \(\varPi \) is (statistically) \(\varepsilon \)-inner-distance-indistinguishable for width d (denoted \(\varepsilon \)-\(\mathrm {IDI}_d\)) if for all \(i<j \in M\) such that \(j - i > d\), there exist \(k,\ell \in M\) such that

  1. \(i \le k < \ell \le j\),

  2. \(\ell - k \le d\),

  3. \(\varDelta (D_1, D_2) \le \varepsilon \), where \(D_1 = \mathcal {E}_K(j) - \mathcal {E}_K(i)\) and \(D_2 = \mathcal {E}_K(\ell ) - \mathcal {E}_K(k)\), and K is a random key.

Intuitively, \(\varepsilon \)-\(\mathrm {IDI}_d\) says that the distance between every encrypted pair of messages must be indistinguishable from the gap between two encrypted messages which both lie between them, and moreover the latter gap is required to be small, namely d or less.

The CLWW notion implies \(\varepsilon \)-\(\mathrm {IDI}_{1}\) security. That is, for every pair \(i<j\) with \(j - i > 1\), \(\mathcal {E}_K(j) - \mathcal {E}_K(i)\) is indistinguishable from \(\mathcal {E}_K(k+1) - \mathcal {E}_K(k)\) for some k between i and j (when \(d=1\), we must have \(\ell = k+1\) in the definition).

To see this, fix some i, j with \(j > i + 1\), and consider their binary expansions. We may write i in the form \(p{\,\Vert \,}0 {\,\Vert \,}x\) and j in the form \(p{\,\Vert \,}1 {\,\Vert \,}y\), where p is the longest common prefix of i and j, and \(x,y\in \{0,1\}^L\) for some \(L\ge 1\). Then consider

$$\begin{aligned} k = p{\,\Vert \,}0{\,\Vert \,}1^{L} \quad \text {and} \quad \ell = p{\,\Vert \,}1{\,\Vert \,}0^L. \end{aligned}$$

We have that \(\ell = k+1\) (treating \(\ell ,k\) as numbers), and that either \(k\ne i\) or \(\ell \ne j\). Moreover the CLWW security notion ensures that the condition of \(\mathrm {IDI}_{1}\) security holds for this choice of \(k,\ell \).
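The choice of \(k,\ell \) above can be computed directly from the bit representations. The following is an illustrative sketch only; the function name `idi1_witness` and the convention that messages are m-bit non-negative integers are assumptions made for this example.

```python
from typing import Tuple


def idi1_witness(i: int, j: int, m: int) -> Tuple[int, int]:
    """Return (k, l) = (p || 0 || 1^L, p || 1 || 0^L) for m-bit messages i < j.

    p is the longest common prefix of i and j, and L is the number of bits
    that follow the first differing bit.
    """
    if not (0 <= i < j < 2 ** m):
        raise ValueError("need 0 <= i < j < 2**m")
    # Position (from the least significant end) of the first differing bit,
    # which equals the suffix length L.
    L = (i ^ j).bit_length() - 1
    prefix = i >> (L + 1)                      # common prefix p
    k = (prefix << (L + 1)) | ((1 << L) - 1)   # p || 0 || 1^L
    l = (prefix << (L + 1)) | (1 << L)         # p || 1 || 0^L
    return k, l


if __name__ == "__main__":
    # i = 0010, j = 1101: empty common prefix, L = 3.
    print(idi1_witness(0b0010, 0b1101, 4))  # (7, 8)
```

For \(j > i+1\) the returned pair always satisfies \(i \le k < \ell = k+1 \le j\) and differs from \((i,j)\), matching conditions 1 and 2 of the definition with \(d=1\).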

The following theorem generalizes Theorem 2.

Theorem 7

Suppose \(\varPi = (\mathcal {K}, \mathcal {E}, \mathcal {C})\) is an order-preserving encryption scheme with security parameter \(\lambda \) and associated message space \(\{0,1\}^m\) and ciphertext space \(\{0,1\}^n\), and \(\varPi \) is \(2^{-\lambda }\)-\(\mathrm {IDI}_d\) secure for some \(d\ge 1\). Let \(m' = m - \lceil \log d \rceil \). Then we have

$$\begin{aligned} n \ge \lambda m' - m' \log m' + m' \log e \end{aligned}$$

Proof

Let \(\varPi = (\mathcal {K},\mathcal {E},\mathcal {C})\) be an OPE scheme with the syntax and conditions in the theorem. Below, for \(i \in \{0,1\}^m\), we write \(X_i = \mathcal {E}_K(i)\), and let \(m'\) be as defined in the theorem.

We will show how to carry out the same strategy used in the proof of Theorem 2. We will prove a version of Lemma 5 for a different nested sequence of pairs of messages \((i^L_j,i^R_j)_{j=1}^{m'}\), which we now define inductively from \(j = m'\) down to \(j = 1\).

  • Base: \(i^L_{m'} = 0, i^R_{m'} = 2^m-1\).

  • Step: Given \((i^L_j, i^R_j)\), let \(k<\ell \) be the pair between \(i^L_j\) and \(i^R_j\) guaranteed by \(\mathrm {IDI}_d\) security. We distinguish two cases:

    1. If \(k - i^L_j > i^R_j - \ell \) then set \((i^L_{j-1},i^R_{j-1})\) to be \((i^L_j, k)\).

    2. Otherwise, set \((i^L_{j-1},i^R_{j-1})\) to \((\ell ,i^R_j)\).

Intuitively, we use the \(\mathrm {IDI}_d\) security property to find a nested sequence by moving to the “larger” gap at each step, and this continues for at least \(m'\) steps. Using this sequence, the rest of the proof of Lemma 5 can be carried out. Finally, the rest of the proof of Theorem 2 can be applied exactly as before.   \(\square \)
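The inductive definition can be sketched in code. This is an illustration under stated assumptions, not part of the proof: the callback `witness` is a hypothetical stand-in for the pair \((k,\ell )\) guaranteed by \(\mathrm {IDI}_d\) security, and `binary_witness` instantiates it for \(d=1\) with the binary-prefix choice used for CLWW above.

```python
from typing import Callable, List, Tuple

Pair = Tuple[int, int]


def nested_sequence(m: int, d: int,
                    witness: Callable[[int, int], Pair]) -> List[Pair]:
    """Build the nested pairs, starting from (0, 2^m - 1).

    At each step the pair (k, l) returned by `witness` splits the current
    interval, and we keep the side with the larger gap, as in the proof.
    """
    left, right = 0, 2 ** m - 1
    seq = [(left, right)]
    while right - left > d:          # IDI_d only applies when the gap exceeds d
        k, l = witness(left, right)
        assert left <= k < l <= right and l - k <= d
        if k - left > right - l:
            right = k                # case 1: keep (left, k)
        else:
            left = l                 # case 2: keep (l, right)
        seq.append((left, right))
    return seq


def binary_witness(i: int, j: int) -> Pair:
    """Witness for d = 1: (p || 0 || 1^L, p || 1 || 0^L)."""
    L = (i ^ j).bit_length() - 1
    prefix = i >> (L + 1)
    return (prefix << (L + 1)) | ((1 << L) - 1), (prefix << (L + 1)) | (1 << L)


if __name__ == "__main__":
    print(nested_sequence(4, 1, binary_witness))
    # [(0, 15), (8, 15), (12, 15), (14, 15)]
```

In this instantiation the gap roughly halves at each step, so for \(d=1\) the sequence contains m pairs, consistent with \(m' = m - \lceil \log d \rceil = m\).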

Extension to OPE variants. We can also extend our proof of Theorem 2 to the “d-ary” variants of Chenette et al. That construction saved a modest amount of space over the main CLWW construction via additional leakage, which is described via the following leakage profile \(\mathcal {L}^d_{\mathrm {clww}}\):

$$\begin{aligned} \mathcal {L}^d_{\mathrm {clww}}(x_1,\ldots ,x_q) := \{(i,j,\mathsf {ind}_{\mathsf {diff}}^{(d)}(x_i,x_j), \mathbf {1}(x_i< x_j)) \ : \ 1 \le i < j \le q\}, \end{aligned}$$

where \(\mathsf {ind}_{\mathsf {diff}}^{(d)}(a,b)\) writes its inputs in base d as \(a = (a[1],\ldots ,a[m])\) and \(b = (b[1],\ldots ,b[m])\), and outputs \((k, |b[k] - a[k]|)\), where k is the smallest index such that \(b[k] \ne a[k]\). If there is no such index (i.e. \(a=b\)) then it outputs \((m+1,0)\).

Intuitively, this leakage outputs the index of the first base-d digit where each pair of messages differ, and additionally outputs the absolute difference in that digit. (When \(d=2\) the additional output is trivial, since it is always 1.)
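The leakage function can be written out as a short sketch. The names `to_base_d`, `ind_diff` and `leakage` are hypothetical, and two conventions are assumed for this example: index 1 refers to the most significant base-d digit, and the parameter `m` counts base-d digits.

```python
from typing import List, Set, Tuple


def to_base_d(x: int, d: int, m: int) -> List[int]:
    """Write x as exactly m base-d digits, most significant digit first."""
    digits = []
    for _ in range(m):
        digits.append(x % d)
        x //= d
    if x:
        raise ValueError("x does not fit in m base-d digits")
    return digits[::-1]


def ind_diff(a: int, b: int, d: int, m: int) -> Tuple[int, int]:
    """Return (k, |b[k] - a[k]|) for the first differing digit, or (m+1, 0)."""
    for k, (x, y) in enumerate(zip(to_base_d(a, d, m), to_base_d(b, d, m)),
                               start=1):
        if x != y:
            return (k, abs(y - x))
    return (m + 1, 0)


def leakage(msgs: List[int], d: int, m: int) -> Set[tuple]:
    """Leakage profile: one tuple per pair of positions i < j (1-indexed)."""
    return {
        (i + 1, j + 1, ind_diff(msgs[i], msgs[j], d, m), msgs[i] < msgs[j])
        for i in range(len(msgs))
        for j in range(i + 1, len(msgs))
    }


if __name__ == "__main__":
    print(ind_diff(21, 29, 10, 2))  # (2, 8): digits (2,1) vs (2,9)
    print(ind_diff(5, 7, 2, 3))     # (2, 1): digits (1,0,1) vs (1,1,1)
```

As noted above, for \(d=2\) the second component is always 1 whenever the inputs differ, so the extra leakage only matters for \(d>2\).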

We will show how to carry out the same strategy used in the proof of Theorem 2. Here we denote \(m^* = m/ \log d -1 \), and we will prove a version of Lemma 5 for a different nested sequence of pairs of messages \((i^L_j,i^R_j)_{j=1}^{m^*}\) that we define as follows:

$$\begin{aligned} i_{j}^{L } = 0, \quad i_{j}^{R} = 0^{m^*-j}|| 1 || (d-1)^j \end{aligned}$$

And we define the pair \(\hat{i}_j^L, \hat{i}_j^R\) as:

$$\begin{aligned} \hat{i}_j^L = 0^{m^*-j} || 0 || (d-1)^{j},\quad \hat{i}_j^R = 0^{m^*-j} || 1 || 0^j \end{aligned}$$

According to the leakage profile, the pairs (\( \mathcal {E}_K ( i^L_j), \mathcal {E}_K(i^R_j) \)) and (\(\mathcal {E}_K(\hat{i}_j^L), \mathcal {E}_K(\hat{i}_j^R) \)) are statistically indistinguishable. Using the sequence \((i^L_j,i^R_j)_{j=1}^{m^*}\), the rest of the proof of Lemma 5 can be carried out. Finally, the rest of the proof of Theorem 2 can be applied exactly as before. Hence we have the lower bound:

$$\begin{aligned} n \ge \lambda (m/ \log d) - (m/ \log d) \log ( m/ \log d ) \end{aligned}$$

which shows that the ciphertext length of the d-ary CLWW construction is also almost optimal.
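To get a feel for the two bounds in this section, they can be evaluated numerically. The sketch below assumes that all logarithms are base 2 and that m is the message length in bits; the function names are hypothetical.

```python
import math


def idi_lower_bound(lam: float, m: int, d: int) -> float:
    """Bound of Theorem 7: lam*m' - m'*log(m') + m'*log(e), m' = m - ceil(log d).

    Assumes base-2 logarithms.
    """
    m_prime = m - math.ceil(math.log2(d))
    if m_prime < 1:
        raise ValueError("need m - ceil(log d) >= 1")
    return (lam * m_prime - m_prime * math.log2(m_prime)
            + m_prime * math.log2(math.e))


def dary_lower_bound(lam: float, m: int, d: int) -> float:
    """Bound for the d-ary variant: lam*(m/log d) - (m/log d)*log(m/log d)."""
    blocks = m / math.log2(d)
    return lam * blocks - blocks * math.log2(blocks)


if __name__ == "__main__":
    # Example parameters chosen only for illustration.
    print(dary_lower_bound(80, 32, 2))  # 2400.0
    print(dary_lower_bound(80, 32, 4))  # 1216.0
```

With these illustrative parameters, moving from \(d=2\) to \(d=4\) roughly halves the bound, reflecting the shorter ciphertexts obtained in exchange for the additional leakage.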