Abstract
We introduce Targeted Ciphers, which typically encipher points on domain \(\mathcal {X}\), but can be easily modified to instead encipher points on some subset \(\mathcal{S}\subseteq \mathcal {X}\). Ciphers that can directly support this domain targeting are useful in Format-Preserving Encryption, where one wishes to encipher points on a potentially complex domain \(\mathcal{S}\). We propose two targeted ciphers and analyze their security. The first, Targeted Swap-or-Not, is a modification of the Swap-or-Not cipher proposed by Hoang, Morris, and Rogaway (CRYPTO 2012). The second, a new cipher we call Mix-Swap-Unmix, achieves the stronger notion of full security. Our targeted ciphers perform domain targeting more efficiently than the recently proposed Cycle Slicer algorithm of Miracle and Yilek (ASIACRYPT 2017).
1 Introduction
In this era of “big data,” where organizations regularly harvest and store large amounts of customer data, the need to secure personal information in the face of data breaches has become essential. Encrypting sensitive personal and financial data like credit card numbers, social security numbers, and birth dates is an obvious way to defend against data breaches, but how to encrypt these diverse types of data is not always obvious. Practitioners are faced with the challenge of introducing encryption into large databases that interact with a potentially complex system of hardware and legacy software, while trying not to break anything. Given these challenges and constraints, it is easy to see the appeal of Format-Preserving Encryption (FPE) schemes, in which ciphertexts have the same format as plaintexts. For example, if one encrypts a 9 decimal digit US social security number with an FPE scheme, the resulting ciphertext would also be a 9 digit number. Such FPE schemes can often be “dropped in” to existing systems with little disruption.
Early attempts at constructing and analyzing FPE schemes were conducted by Brightwell and Smith [8] and later Spies [25]. The increasing practical interest in the problem, especially related to credit card encryption, has led to a recent surge in academic research on FPE and related problems [1,2,3, 6, 11, 14,15,16,17, 19,20,22, 24]. There are even FPE standards NIST SP 800-38G [12] and ANSI ASC X9.124 that include Feistel-based FPE schemes like FF1 [4] and FF3 [7].
There are a variety of techniques known for constructing a format-preserving encryption scheme to encipher points in domain \(\mathcal{S}\). Since block ciphers have traditionally been designed for bitstring domains, we cannot use an existing cipher (e.g., AES) without modification. Instead, there are generally three main strategies for constructing the desired encryption scheme. First, we could try to construct a cipher that is customized to work directly on domain \(\mathcal{S}\). This works best when \(\mathcal{S}\) has a relatively simple structure, like integers in the range \(\{0,\ldots ,N-1\}\), and many ciphers used in FPE are designed to work on this domain. (For example, such a cipher would work well for our social security number example; we would just need a cipher on \(\{0,\ldots ,N-1\}\) with \(N=10^9\)).
If the domain \(\mathcal{S}\) is more complicated, then a second option for building an FPE scheme is to try to find a way to rank the elements of the domain, then employ a cipher that works on \(\{0,\ldots ,N-1\}\) with \(N=|\mathcal{S}|\), and then unrank. Ranking the elements of \(\mathcal{S}\) means finding an efficient way to map (and unmap) each element \(m \in \mathcal{S}\) to a unique element \(x \in \{0,\ldots ,|\mathcal{S}|-1\}\). The FPE scheme just described is called rank-encipher-unrank [3]. Rank-encipher-unrank only works on domains for which efficient ranking and unranking algorithms are known. Thus, practitioners, when faced with the task of enciphering points in some domain \(\mathcal{S}\), must either invent a custom ranking and unranking procedure, or, if \(\mathcal{S}\) can be specified with a DFA or regular expression, apply known algorithms to rank regular languages [3]. In this latter case, there are toolkits written to aid practitioners [16], though there can still be some subtle efficiency issues depending on whether one starts with a regular expression or a DFA.
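As a rough illustration of rank-encipher-unrank, the following sketch uses a hypothetical toy domain (one uppercase letter followed by two digits) with a hand-written ranking. The names `rank`, `unrank`, `fpe_encrypt`, and `fpe_decrypt` are ours, and the shuffled lookup table is only a stand-in for a real keyed cipher on \(\{0,\ldots ,N-1\}\); it is not secure.

```python
import random
import string

LETTERS = string.ascii_uppercase
N = 26 * 100  # size of the toy domain 'A00' .. 'Z99'

def rank(s):
    """Map a string such as 'C07' to a unique integer in {0, ..., N-1}."""
    return LETTERS.index(s[0]) * 100 + int(s[1:])

def unrank(x):
    """Inverse of rank: map an integer in {0, ..., N-1} back to a string."""
    return LETTERS[x // 100] + f"{x % 100:02d}"

# Stand-in for a keyed cipher on {0, ..., N-1}: a seeded random permutation table.
table = list(range(N))
random.Random(1).shuffle(table)
inverse_table = [0] * N
for x, y in enumerate(table):
    inverse_table[y] = x

def fpe_encrypt(s):
    """Rank, encipher on {0, ..., N-1}, then unrank."""
    return unrank(table[rank(s)])

def fpe_decrypt(c):
    """Rank, decipher on {0, ..., N-1}, then unrank."""
    return unrank(inverse_table[rank(c)])

ciphertext = fpe_encrypt("C07")
print(ciphertext, fpe_decrypt(ciphertext))
```

The ciphertext has the same letter-digit-digit format as the plaintext because the unranking step can only produce strings of that form.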
Finally, a third option that only assumes the ability to test membership in \(\mathcal{S}\) is to find a larger domain \(\mathcal {X}\) for which an efficient cipher already exists, and then try to somehow use or modify this cipher to get a new cipher on the target domain \(\mathcal{S}\). For example, if we need a cipher on valid social security numbers (e.g., those that do not start with 000), we could try to take a cipher on \(\{0,\ldots ,10^9-1\}\) and somehow cleverly use it to get a cipher on our desired domain. Black and Rogaway [6] were the first to analyze a folklore technique for doing this, called Cycle Walking, in which the cipher on the larger set is applied repeatedly to a point \(m \in \mathcal{S}\) until the resulting ciphertext also is an element of \(\mathcal{S}\). If the size of \(\mathcal {X}\) is not too large relative to the size of \(\mathcal{S}\), then we can expect this procedure to terminate quickly, though the running time can vary across different inputs. Recent work [19, 20] by Miracle and Yilek has explored ways to make this task of transforming a cipher on \(\mathcal {X}\) into a cipher on \(\mathcal{S}\subseteq \mathcal {X}\), which they refer to as domain targeting, possible in constant time, meaning the running time does not depend on the input. Looking ahead, our results can be seen as bringing the very theoretical results of [19, 20] closer to practice.
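To make the Cycle Walking idea concrete, here is a minimal sketch. The function name `cycle_walk`, the toy domain \(\{0,\ldots ,15\}\), the target set, and the shuffled table standing in for a keyed cipher on \(\mathcal {X}\) are all assumptions made for illustration.

```python
import random

def cycle_walk(perm, in_target, m):
    """Apply perm repeatedly to m until the result lands back in the target set."""
    if not in_target(m):
        raise ValueError("input must lie in the target set")
    c = perm(m)
    while not in_target(c):   # number of iterations depends on the input m
        c = perm(c)
    return c

# Toy stand-in for a keyed cipher on X = {0, ..., 15}: a seeded random permutation.
table = list(range(16))
random.Random(2018).shuffle(table)

target = {x for x in range(16) if x % 3 != 0}   # hypothetical target set S

def encipher(m):
    return cycle_walk(table.__getitem__, target.__contains__, m)

images = sorted(encipher(m) for m in target)
print(images == sorted(target))  # True: the walk induces a permutation on S
```

The loop always terminates because following a permutation from a point of \(\mathcal{S}\) eventually returns to that point. The data-dependent iteration count is the timing variation that constant-time domain targeting is meant to avoid.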
We emphasize that while ranking/unranking and domain targeting might seem like two distinct ways to build FPE schemes on domains \(\mathcal{S}\), they can actually be complementary techniques. For example, a practitioner might have a very complicated domain \(\mathcal{S}\) for which they wish to do FPE. Perhaps the domain is specified by a complex regular expression, and so the general techniques for ranking/unranking are impractical. An alternative option may be to find a larger, simpler set \(\mathcal {X}\supseteq \mathcal{S}\) that is easier to rank. Then, the rank-encipher-unrank algorithm would need to apply something like domain targeting before unranking, since applying rank-encipher-unrank might yield an element in \(\mathcal {X}-\mathcal{S}\).
Constant-Time Domain Targeting. Our main goal in this paper is to make constant-time domain targeting more efficient. Before getting to our new results, we first give an overview of previous techniques.
The constructions provided by Miracle and Yilek for domain targeting from set \(\mathcal {X}\) to set \(\mathcal{S}\), called Reverse Cycle Walking in [19] and Cycle Slicer in [20], are both based on the same underlying idea: take a cipher on \(\mathcal {X}\) and use it to construct a random matching (i.e., permutation with only 2-cycles or transpositions) on \(\mathcal{S}\subseteq \mathcal {X}\); then swap some of the points that are paired together based on bit flips. Said another way, both the Reverse Cycle Walking (RCW) and Cycle Slicer (CS) constructions give a way to build matchings on the target set \(\mathcal{S}\) out of arbitrary permutations on the larger set \(\mathcal {X}\). Once a matching on \(\mathcal{S}\) is formed, pairs of points in the matchings are swapped based on additional bit flips. This procedure, called a matching exchange process, is repeated over many rounds, and Miracle and Yilek use a result of Czumaj and Kutyłowski [9] to argue the resulting ciphers are secure. Further, the number of rounds needed for security does not depend on the specific inputs, so constant-time implementations that do not leak timing information are possible.
Unfortunately, RCW and CS are both rather inefficient, requiring many rounds for security. For example, the Cycle Slicer paper uses social security numbers as an example, with \(\mathcal {X}= \{0,1\}^{30}\) and \(\mathcal{S}= \{0,\ldots ,10^9-1\}\), and claims that about 12,000 rounds of Cycle Slicer are needed for security. If we plug an existing, provably-secure cipher like Swap-or-Not (SN) [15] into the construction, we would end up with hundreds of rounds of Swap-or-Not times 12,000 rounds of Cycle Slicer, meaning overall we need millions of Swap-or-Not rounds. If full security [13, 24] is desired, in which ciphers are required to be indistinguishable from random permutations even when adversaries can query all domain points, the situation is even worse. The key idea in this paper is that instead of applying a general transformation to convert any cipher into one that supports domain targeting, perhaps we can instead specifically design ciphers (or slightly modify existing ones) to directly support domain targeting.
Our Results. We take a step toward bringing constant-time domain targeting closer to practice. We propose using what we refer to as targeted ciphers for the task. The idea is to design new ciphers (or find existing ones) that already can support domain targeting with only small modification. Informally, a targeted cipher will proceed in rounds to encipher points in some domain \(\mathcal {X}\), yet can be slightly modified to have the property that after every round, every point \(x \in \mathcal{S}\subseteq \mathcal {X}\) is still mapped to another point in \(\mathcal{S}\). In other words, over the entire course of the algorithm, elements of the target set \(\mathcal{S}\) never “leave” the target set, and every additional round of the cipher further mixes up these elements.
With this informal idea in mind, we present two targeted ciphers and formally analyze their security. Our first targeted cipher, Targeted Swap-or-Not (TSN), is a modification of the Swap-or-Not cipher, proposed by Hoang, Morris, and Rogaway. The second, which achieves the stronger notion of full security, is a new cipher we design and analyze called Mix-Swap-Unmix (MSU). With both ciphers, we achieve a substantial increase in efficiency when compared to constructions that achieve a similar level of security by using a general transformation like Cycle Slicer, bringing domain targeting closer to practicality.
Techniques. Like previous work on domain targeting, both of our targeted ciphers are matching-based, or swap-based, meaning that every round pairs up points and then swaps some of them. To construct a cipher on \(\mathcal{S}\subseteq \mathcal {X}\), previous work, specifically Cycle Slicer, in each round builds a random matching on the larger set \(\mathcal {X}\) and then, for each pair of points \(x,x'\) paired together in the matching, only swaps x and \(x'\) if both points are in the target set \(\mathcal{S}\) and an additional bit flip is 1. The security analysis heavily relies on the fact that the matchings are random, which allows [20] to apply an existing result of Czumaj and Kutyłowski [9].
Our first targeted cipher, Targeted Swap-or-Not, stems from the observation that the Swap-or-Not cipher is already matching-based: focusing on the version of SN for domain \(\mathcal {X}= [N]\), in round i of SN, point x is paired up with point \(x' = K_i-x \mod N\), where \(K_i\) is the random round key. The points x and \(x'\) are then swapped if a random function applied to them is 1. This operation clearly results in a matching on [N], so our targeted version adds the constraint that points should only be swapped if they are both in the target set \(\mathcal{S}\subseteq [N]\).
Since the high-level idea in TSN is the same as in Cycle Slicer, it might appear that the same security analysis should follow. But, there is a key difference: in Cycle Slicer, each round is a random matching, while in TSN we get a very non-random matching completely determined by the round key (which can be computed from any known pair \(x,x'\)). Thus, for TSN’s analysis we cannot rely on the matching exchange process results. Instead, we modify the original Swap-or-Not security proof of [15], using a recent refinement by Dai, Hoang, and Tessaro [10]. Our final security bounds show that TSN needs only a modest increase in rounds over SN to support targeting. As an example, our bounds show that if TSN is applied to domain [N] for \(N=2^{30}\) and targeted to a target set of size \(|\mathcal{S}|=10^9\), and if we allow a CCA adversary \(q=|\mathcal{S}|/2\) queries, then we need just under 600 rounds of Swap-or-Not to get advantage less than \(10^{-9}\). Using Cycle Slicer and Swap-or-Not for the same parameters would require hundreds of thousands of rounds.
For our second targeted cipher, Mix-Swap-Unmix (MSU), we aim to build a targeted cipher that can achieve full security. A fully-secure cipher is one that is indistinguishable from a random permutation by an adversary who can query all N domain points. Only a few fully-secure ciphers are known, and they tend to be inefficient; for example, the Mix-and-Cut cipher of [24] uses about 10,000 rounds of Swap-or-Not to encipher 30-bit inputs. If one wishes to do domain targeting and still maintain full security, the efficiency problem gets even worse. Combining a fully-secure cipher like Mix-and-Cut with a general domain targeting transformation like Cycle Slicer can result in hundreds of millions of rounds of Swap-or-Not. Thus, we aim to build a new fully-secure cipher that directly supports domain targeting.
Like previous fully-secure ciphers [21, 24], our new cipher MSU is built from Swap-or-Not. At the same time, since we want to support targeting, we need each round of MSU to give a matching on the larger domain \(\mathcal {X}= [N]\), and then we can only swap elements that are both in the target set \(\mathcal{S}\). To build this matching, we use an idea from Naor and Reingold [23]. They used the fact that for permutations \(\pi \) and \(\sigma \), the cycle structure of \(\pi \circ \sigma \circ \pi ^{-1}\) is the same as the cycle structure of the inner permutation \(\sigma \), to build permutations with particular cycle structures. Since we want a matching, or a permutation made up of just 2-cycles, we let \(\pi \) (the outer permutation) be Swap-or-Not, and then \(\sigma \) (the inner permutation) simply be the permutation that swaps adjacent elements. This is one round of MSU. While this gives us a targeted cipher, we still need to argue full security. In Sect. 4, we show that this construction boosts the security of Swap-or-Not and gives us full security. The final construction is also much more efficient than using an existing fully-secure cipher with Cycle Slicer, requiring about 100 times fewer rounds of Swap-or-Not.
Extensions and Future Work. We mention a few other related results we have included in the paper. First, the MSU construction described above uses an additional bit flip for each pair of points in each round. This bit flip seems unnecessary and leads to an increase in the number of rounds in the case where \(\mathcal{S}\) is much smaller than \(\mathcal {X}\). In Appendix A, we show that in this setting, the bit flip can in fact be eliminated. The proof involves finding an equivalent underlying matching exchange process that mimics MSU without the bit flips, and the techniques may be of independent interest. We also show in Appendix A that if domain targeting is not needed and one simply wants to use the MSU cipher on domain [N], then we can prove that significantly fewer rounds are needed by applying a recent result of Bernstein [5]. In short, MSU without targeting results in something called an involution walk, and techniques from representation theory can be applied.
One last extension of our results is that our targeted ciphers can be used in a straightforward way to solve the domain completion problem, recently introduced in [14] and further studied in [20], in which we wish to construct a cipher that stays consistent with a table of existing input-output mappings that were manually chosen. Specifically, our constructions can take the place of Cycle Slicer in the CSDC algorithm of [20], resulting in efficiency gains in that setting.
Looking forward, an obvious question is whether other well-known cipher design techniques can be modified to directly support targeting. For example, Feistel-based ciphers are widely used and, in fact, the standardized FPE schemes are Feistel-based, so it would be convenient if they could be made to support targeting with simple modifications. Unfortunately, this seems unlikely. A card-shuffling view of Feistel is that the input points are cut into many piles, and then the bottom cards are dropped from the piles in different orderings depending on the internal random round function. Imagine some of the cards at the bottom of the cut piles are initially in positions in the target set \(\mathcal{S}\). These cards will end up near the bottom of the deck after one round of Feistel, but the positions near the bottom of the deck might not correspond to positions in \(\mathcal{S}\). Thus, we immediately lose our desired property of targeted ciphers that points in \(\mathcal{S}\) always stay in \(\mathcal{S}\) after each round.
Finally, though we used the MSU construction to build a (rather slow) fully-secure cipher by applying in each round Swap-or-Not, a swap, and then Swap-or-Not inverse, we believe Swap-or-Not could be replaced by something much faster (e.g., a few rounds of Feistel) in the MSU construction, and the resulting (targeted) cipher could provide strong security with a modest number of rounds.
2 Preliminaries
Notation. If x is a bitstring with length n, then we denote by \(x \oplus 1\) the bitwise exclusive-OR of the n bits of x with the bitstring \(0^{n-1}1\) (\(n-1\) zeroes followed by a single one). If S is a set, then \(x \xleftarrow{\$} S\) means we choose an element of S uniformly at random and assign it to x. If S is instead an algorithm, then the same notation represents running S with uniformly random coins and assigning the output to x. For permutations \(\pi ,\sigma : \mathcal{M}\rightarrow \mathcal{M}\) with \(\pi \) having inverse \(\pi ^{-1}\), we denote by \(\pi \circ \sigma \circ \pi ^{-1}\) the permutation that computes \(\pi ^{-1}(\sigma (\pi (x)))\) on \(x \in \mathcal{M}\). We let [N] denote the set \(\{0,\ldots ,N-1\}\). For \(X \in [N]\), we let \(X \oplus 1\) denote the result of taking the binary representation of X and applying a bitwise-XOR with the binary representation of 1; in other words, if X is even (resp. odd), then \(X \oplus 1\) will be the next (resp. previous) number. Let \(\mathsf {odd}(N)\) denote the odd elements of [N].
Block Ciphers. We say that \(E: \mathcal{K}\times \mathcal{M}\rightarrow \mathcal{M}\) for finite sets \(\mathcal{K}\) and \(\mathcal{M}\) (sometimes referred to as the key space and domain, respectively) is a block cipher if \(E_K(\cdot ) = E(K, \cdot )\) is a permutation on \(\mathcal{M}\) for every \(K \in \mathcal{K}\). Let \(E^{-1}\) be the inverse block cipher of \(E\).
The standard notion of security for block ciphers is security against adaptive chosen-ciphertext attack (CCA), sometimes called Strong PRP Security. To define this security notion, we describe the security games \(\mathsf {SPRP1}\) and \(\mathsf {SPRP0}\). In \(\mathsf {SPRP1}\), the game starts with a \(\mathbf{main }\) procedure that chooses a random key for the cipher and then runs the adversary with oracles for procedures \(\mathsf {Enc}\) and \(\mathsf {Dec}\), which answer queries using the cipher and the chosen key. The final output of the game is the bit the adversary outputs. The game \(\mathsf {SPRP0}\) works the same, but with \(\mathbf{main }\) choosing a random permutation from \(\textsf {Perm}(\mathcal{M})\), defined as the set of all permutations \(\pi : \mathcal{M}\rightarrow \mathcal{M}\), and using that to answer oracle queries to \(\mathsf {Enc}\) and \(\mathsf {Dec}\). We can then define the CCA advantage of an adversary A against \(E\) by \( \mathbf {Adv}^{\mathrm {cca}}_{E}(A) = |{\Pr \left[ \,{\mathsf {SPRP1}^A_E\Rightarrow 1}\,\right] } - {\Pr \left[ \,{\mathsf {SPRP0}^A_E\Rightarrow 1}\,\right] }| \) where the probabilities are over the random coins used in the security games. If the adversary A is non-adaptive (meaning it makes the same queries every run) and only makes queries to \(\mathsf {Enc}\) in the SPRP security games above, then we say it is an NCPA (short for non-adaptive chosen-plaintext attack) adversary and we refer to its advantage in the games against block cipher \(E\) as \(\mathbf {Adv}^{\mathrm {ncpa}}_{E}(A)\).
As has become standard, we overload notation and denote by \(\mathbf {Adv}^{\mathrm {cca}}_{E}(q)\) the maximum CCA advantage over all adversaries making at most q adaptive oracle queries. Similarly, the maximum advantage over all adversaries making at most q non-adaptive oracle queries to only the forward direction subroutine \(\mathsf {Enc}\) we denote by \(\mathbf {Adv}^{\mathrm {ncpa}}_{E}(q)\). We will be interested in full security or fully-secure ciphers, meaning \(\mathbf {Adv}^{\mathrm {cca}}_{E}(N)\) is low, where \(N = |\mathcal{M}|\). Said another way, a fully-secure block cipher will be one for which the CCA advantage is low despite the adversary being able to query every domain point. As explained in the introduction, such fully-secure ciphers have been the target of a number of recent papers [13, 21, 24].
Chernoff Bound. Later in the paper, we will need to upper bound the probability that among t independent coin flips there are more than (3/4)t heads.
Proposition 1
Let \(X_1,\ldots ,X_t\) be independent random variables such that each \(X_i = 1\) with probability 1/2 and \(X_i = 0\) with probability 1/2. Let \(X = \sum _{i=1}^t X_i\). Then, \( {\Pr \left[ \,{X \ge (3/4)t}\,\right] } \le e^{-t/20}. \)
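As a quick numerical sanity check of Proposition 1 (an illustration, not a proof), the exact binomial tail can be compared against \(e^{-t/20}\) for a few values of t. The helper name `tail_at_least_three_quarters` and the chosen values of t are ours.

```python
from math import comb, exp

def tail_at_least_three_quarters(t):
    """Exact Pr[X >= (3/4) t] for X the number of heads in t fair coin flips."""
    k0 = (3 * t + 3) // 4                      # smallest integer k with k >= 3t/4
    return sum(comb(t, k) for k in range(k0, t + 1)) / 2 ** t

for t in (4, 20, 100, 400):
    exact = tail_at_least_three_quarters(t)
    bound = exp(-t / 20)
    print(t, exact, bound, exact <= bound)
```

For small t the bound is far from tight (for t = 4 the exact tail is 5/16), which is consistent with the proposition being used only as a convenient upper bound.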
Matchings. In this paper we use the term matching on \(\mathcal{M}\) to refer to a permutation \(\tau : \mathcal{M}\rightarrow \mathcal{M}\) made up of only transpositions, also called 2-cycles or swaps. A matching is an involution, so \(\tau (\tau (x))=x\) for all \(x \in \mathcal{M}\). Let \(\textsf {Match}(\mathcal{M}, k)\) be the set of all matchings on \(\mathcal{M}\) that are made up of exactly k transpositions. For a set \(\mathcal{M}\) with an even number N of elements, we use the term perfect matching to refer to a matching on \(\mathcal{M}\) with exactly N/2 transpositions, meaning every point is swapped with another distinct point. Thus, \(\textsf {Match}(\mathcal{M}, |\mathcal{M}|/2)\) is the set of such perfect matchings when \(|\mathcal{M}|\) is even.
A matching exchange process on \(\mathcal{M}\) proceeds in rounds. In each round, k is sampled from some probability distribution on \(\{0,\ldots ,|\mathcal{M}|/2\}\), then \(\tau \) is chosen randomly from \(\textsf {Match}(\mathcal{M}, k)\). Finally, for each pair of points \(x, \tau (x) \in \mathcal{M}\) such that \(x \ne \tau (x)\), we flip a random bit \(b_{\{x,\tau (x)\}}\) and define a new matching \(\bar{\tau }\) by \(\bar{\tau }(x) = \tau (x)\) if \(b_{\{x,\tau (x)\}}=1\), and \(\bar{\tau }(x)=x\) otherwise. We then apply this new matching \(\bar{\tau }\) to each point in \(\mathcal{M}\). This process may repeat for many rounds (with independently chosen k and matchings). We will also consider a special case of a matching exchange process called an involution walk. Here the matching \(\tau \) generated at each step is always a perfect matching (i.e., k is always \(|\mathcal{M}|/2 \)). In Appendix A we bound the number of rounds needed for \(\textsf {MSU}\) in part by relying on previous bounds for matching exchange processes and involution walks.
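One round of a matching exchange process on \(\mathcal{M}= [n]\) can be sketched as follows. The function name `matching_exchange_round` is ours, and the uniform choice of k in the driver loop is an assumption, since the definition allows any distribution on k.

```python
import random

def matching_exchange_round(perm, k, rng):
    """Apply one round to perm, where perm[i] is the current image of point i."""
    n = len(perm)
    # 2k distinct points in random order, paired consecutively, give a uniformly
    # random matching tau with exactly k transpositions.
    points = rng.sample(range(n), 2 * k)
    pairs = [(points[2 * j], points[2 * j + 1]) for j in range(k)]
    tau_bar = list(range(n))                 # start from the identity
    for x, y in pairs:
        if rng.getrandbits(1):               # bit b_{x, tau(x)}: keep this pair or drop it
            tau_bar[x], tau_bar[y] = y, x
    return [tau_bar[p] for p in perm]        # apply tau_bar to every point

rng = random.Random(42)
state = list(range(8))
for _ in range(50):
    k = rng.randint(0, len(state) // 2)      # assumed distribution on k: uniform
    state = matching_exchange_round(state, k, rng)
print(sorted(state) == list(range(8)))  # True: each round composes with a matching
```

Fixing k to `len(state) // 2` in every round would correspond to the involution walk special case described above.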
Total Variation Distance. In order to determine how many rounds of \(\textsf {MSU}\) are needed, we will bound the total variation distance for the underlying matching exchange process. Let \(x,y \in \varOmega \), \(P^r(x,y)\) be the probability of going from x to y in r steps and \(\mu \) be another distribution on \(\varOmega \). In our case \(\varOmega \) will be the set of all permutations of a given size, \(P^r(x,y)\) will be the probability of going between particular permutations x and y with r rounds of \(\textsf {MSU}\) and \(\mu \) will be the uniform distribution on permutations. Specifically, for a permutation y, \(\mu (y) = 1/|\varOmega |\). Then the total variation distance is defined as \( \Vert P^r - \mu \Vert = \max _{x \in \varOmega } \frac{1}{2} \sum _{y \in \varOmega } |P^r(x,y) - \mu (y)|. \)
Composition and Cycle Structure. We will use the following well-known fact from group theory, which was used in the cryptographic realm by Naor and Reingold [23].
Proposition 2
For any permutations \(\pi \) and \(\sigma \), the cycle structures of permutations \(\sigma \) and \(\pi \circ \sigma \circ \pi ^{-1}\) are the same. Thus, if \(\tau \) is a matching, then \(\pi \circ \tau \circ \pi ^{-1}\) is also a matching with the same number of transpositions.
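Proposition 2 can be checked on a small example. In the sketch below, the helper names `conjugate` and `cycle_type` are ours, \(\pi \) is an arbitrary seeded random permutation, and \(\sigma \) is the adjacent-swap permutation \(x \mapsto x \oplus 1\). Composition follows the convention from the Notation paragraph, so the result maps x to \(\pi ^{-1}(\sigma (\pi (x)))\).

```python
import random

def conjugate(pi, sigma):
    """Return the permutation x -> pi^{-1}(sigma(pi(x))) as a list."""
    n = len(pi)
    pi_inv = [0] * n
    for x, y in enumerate(pi):
        pi_inv[y] = x
    return [pi_inv[sigma[pi[x]]] for x in range(n)]

def cycle_type(perm):
    """Sorted list of the cycle lengths of perm."""
    seen, lengths = set(), []
    for start in range(len(perm)):
        if start in seen:
            continue
        length, x = 0, start
        while x not in seen:
            seen.add(x)
            x = perm[x]
            length += 1
        lengths.append(length)
    return sorted(lengths)

n = 16
pi = list(range(n))
random.Random(7).shuffle(pi)
sigma = [x ^ 1 for x in range(n)]            # perfect matching: swap adjacent elements
tau = conjugate(pi, sigma)

print(cycle_type(tau) == cycle_type(sigma))      # True: same cycle structure
print(all(tau[tau[x]] == x for x in range(n)))   # True: tau is again a matching
```

Although \(\sigma \) is fixed and public, which points end up paired in `tau` depends on \(\pi \).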
3 Targeted Swap-or-Not
We begin by describing the Swap-or-Not cipher introduced by Hoang, Morris, and Rogaway [15] and then present our new Targeted Swap-or-Not cipher (\(\textsf {TSN}\)).
Swap-or-Not. Hoang, Morris, and Rogaway [15] showed that the Swap-or-Not (SN) cipher provides CCA security against adversaries who make up to \(q=(1-\epsilon )N\) queries, where N is the size of the domain. In words, for domain \(\mathcal{M}= [N]\), the r-round SN cipher has key \(\mathrm {KF}\) specifying round keys \(K_i \in \mathcal{M}\) and round functions \(F_i: \mathcal{M}\rightarrow \{0,1\}\). In round i, point X is paired with a “buddy” point \(K_i - X \mod N\) (which could be the same point, i.e., \(K_i - X \mod N = X\)), and the output bit of \(F_i\) determines whether X swaps positions with its buddy point.
Hoang, Morris, and Rogaway analyzed the security of Swap-or-Not and provided bounds on both the NCPA and CCA advantages of adversaries attacking the scheme. Recently, Dai, Hoang and Tessaro [10] improved these bounds using a technique they named the chi-squared method. We will need their bound
where again N is the size of the domain, r is the number of SN rounds, and q is the number of adversarial queries.
Our Algorithm. In each round i of \(\textsf {TSN}\), point X is again paired with a “buddy” point \(K_i - X \mod N\). However, if either X or its buddy point is not in the target set \(\mathcal{S}\), then the points do not swap positions, regardless of the result of the round function \(F_i\). If both points are in \(\mathcal{S}\), then whether they swap is again determined by the round function \(F_i\). The detailed description of how to encipher a single point using \(\textsf {TSN}\) can be found in Fig. 1. Note that if we let \(\mathcal{S}= [N]\) then \(\textsf {TSN}\) becomes the original Swap-or-Not cipher for the domain \(\{0,\ldots ,N-1\}\).
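The round structure just described might be sketched as follows. This is a minimal illustration under several assumptions, not a reproduction of Fig. 1: the round keys and round functions are derived from HMAC-SHA256 (our choice of instantiation), the round function is evaluated on the larger of the two paired points so that both points see the same bit (the usual Swap-or-Not convention), and the names `_prf`, `_tsn_round`, `tsn_encipher`, and `tsn_decipher` are hypothetical.

```python
import hashlib
import hmac

def _prf(key, label, i, x=0):
    """Assumed PRF: HMAC-SHA256 over (label, round index, point), read as an integer."""
    msg = label + i.to_bytes(4, "big") + x.to_bytes(16, "big")
    return int.from_bytes(hmac.new(key, msg, hashlib.sha256).digest(), "big")

def _tsn_round(key, i, x, n, in_target):
    k_i = _prf(key, b"K", i) % n           # round key K_i in [N]; modulo bias ignored here
    buddy = (k_i - x) % n                  # buddy point K_i - X mod N
    x_hat = max(x, buddy)                  # both points of a pair query F_i on the same input
    swap = _prf(key, b"F", i, x_hat) & 1   # one-bit output of the round function F_i
    if swap and in_target(x) and in_target(buddy):
        return buddy                       # swap only when both points lie in S
    return x

def tsn_encipher(key, x, n, in_target, rounds):
    for i in range(rounds):
        x = _tsn_round(key, i, x, n, in_target)
    return x

def tsn_decipher(key, y, n, in_target, rounds):
    for i in reversed(range(rounds)):      # every round is an involution, so undo in reverse
        y = _tsn_round(key, i, y, n, in_target)
    return y

key = b"demo key"
in_s = lambda x: x % 7 != 0                # hypothetical target set S inside [1000]
c = tsn_encipher(key, 123, 1000, in_s, rounds=40)
print(in_s(c), tsn_decipher(key, c, 1000, in_s, rounds=40) == 123)  # True True
```

Points outside the target set are never moved, and passing `in_target = lambda x: True` gives an untargeted Swap-or-Not-style cipher on \(\{0,\ldots ,N-1\}\) under the same assumptions. The round count of 40 is arbitrary and not chosen according to the security bounds.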
Security Analysis. Our analysis of \(\textsf {TSN}\) relies heavily on the original analysis by Hoang, Morris and Rogaway bounding the NCPA security of the Swap-or-Not algorithm [15], later improved by Dai, Hoang and Tessaro [10] using the \(\chi ^2\) method. Our main contribution here lies in the application of this algorithm to the targeting setting; the analysis, while quite technical, is a generalization of the ideas and techniques used in the previous analysis.
Our goal is to bound the CCA security of \(\textsf {TSN}\) but, as in [10], we will begin by bounding the weaker NCPA security using the \(\chi ^2\) method and then use a result of Maurer, Pietrzak, and Renner [18] to derive a bound on the CCA security. Specifically we adapt Lemma 3 from [10] to the \(\textsf {TSN}\) algorithm. Combining this lemma with the techniques from the proof of Lemma 5 from [10] and applying to \(\textsf {TSN}\) immediately gives the following lemma which shows that in order to bound the NCPA security of \(\textsf {TSN}\) it suffices to bound the \(\chi ^2\)-divergence.
Lemma 1
(adapted from Dai, Hoang, Tessaro [10]). Let \(\textsf {TSN}\) represent the permutation generated by r rounds of Targeted Swap-Or-Not and \(\textsf {UN}\) represent a random permutation. Additionally, let \(p_{\textsf {TSN},r}(\cdot |Q_i)\) be the distribution on query \(i+1\) by an NCPA adversary A to \(\textsf {TSN}\) with r rounds, conditioned on the output of the previous i queries represented by \(Q_i = \{q_1,q_2, \ldots , q_i\}\), and similarly let \(p_{\textsf {UN}}(\cdot |Q_{i})\) be the distribution on query \(i+1\) to the uniformly random permutation (i.e., the uniform distribution on the remaining \(|\mathcal{S}| - i\) elements). Given this, the NCPA advantage of an NCPA adversary A making at most q non-adaptive queries is
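where the expectation is taken over a vector \(Q_i = \{q_1,q_2, \ldots , q_i\}\) sampled according to the interaction with \(\textsf {TSN}\), and the \(\chi ^2\) divergence between \(p_{\textsf {TSN},r}(\cdot |Q_{i})\) and \(p_{\textsf {UN}}(\cdot |Q_{i})\) is defined as \( \chi ^2(Q_i) = \sum _{x \in \mathcal{S}- Q_i} \frac{(p_{\textsf {TSN},r}(x|Q_i) - p_{\textsf {UN}}(x|Q_i))^2}{p_{\textsf {UN}}(x|Q_i)}. \)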
In order to bound \({{\mathbf {E}}\left[ \,{\chi ^2(Q_{i})}\,\right] }\), we prove the following lemma which generalizes Eq. 5 from [15].
Lemma 2
Let \(|\mathcal{S}|\) be the number of elements in the target set \(\mathcal{S}\) and \(|\mathcal {X}|\) be the number of elements in the larger domain set \(\mathcal {X}\). Then we have,
where the expectation is taken over a vector \(Q_i = \{q_1,q_2, \ldots q_i\}\) sampled according to the interaction with Targeted Swap-Or-Not.
Proof
Again we point out that the following proof uses the same techniques and is a relatively straightforward generalization of the proof of Eq. 5 from [15]. Our proof proceeds by induction on r. We let \(r=0\) be our base case (the proof here follows directly from [15]). When \(r=0\) the elements are in their initial deterministic location and
Next we assume inductively that the lemma holds for r and prove that it holds for \(r+1\). In order to analyze this case we will need to use some additional terminology. For clarity we will use the same terminology as in [15] and [10] and redefine it here for readability. Let \(K_1,\ldots , K_{r+1}\) be the random keys for the first \(r+1\) rounds. Let \(S_r = \mathcal{S}-Q_{i,r}\) be the set of available positions for the \(i+1\) query, where \(Q_{i,r}\) is the set of positions for the first i queries given r rounds of \(\textsf {TSN}\). We will abbreviate \(p_r(x)\) to mean \(p_{\textsf {TSN},r}(x|Q_{i})\) (i.e., the probability the \(i+1\) query is x given r rounds of \(\textsf {TSN}\)) and define \(s_r = \sum _{x \in S_r }(p_r(x) -1/(|\mathcal{S}|- i))^2\).
Definition 1
(Hoang, Morris, Rogaway [15]). Let f be a bijection from \(S_r\) to \(S_{r+1}\) given by
Given this, Hoang, Morris and Rogaway [15] point out the following.
The size of \(S_r\) is \(|\mathcal{S}|- i\) and thus in our targeted setting, the probability that \(K_{r+1}-x \notin S_r\) is \((|\mathcal {X}|- (|\mathcal{S}|- i))/|\mathcal {X}|\). Combining these and letting \(\mathcal {Q} = {{\mathbf {E}}\left[ \,{(p_{r+1}(f(x)) - (1/(|\mathcal{S}|-i)))^2| s_r}\,\right] }\) gives the following.
Note that the expansion of the first sum uses the definition of \(s_r\) and the fact that \(\sum _{y \in S_r}(p_r(y) - \frac{1}{|\mathcal{S}|-i}) = 0\). Details can be found in [10]. Using the fact that f gives a bijection from \(S_r\) to \(S_{r+1}\) and the equation above, we have the following,
Using the law of iterated expectations and our inductive hypothesis we have,
\(\square \)
Next, we use Lemma 2 to bound the \(\chi ^2\) divergence and subsequently to bound the NCPA security of our Targeted Swap-or-Not Cipher.
Theorem 1
Let \(\textsf {TSN}_{r}\) represent the permutation generated by r rounds of Targeted Swap-Or-Not. The NCPA advantage of an NCPA adversary A making at most q non-adaptive queries is
Proof
Using the definition of the \(\chi ^2\) divergence given in Lemma 1, together with the bound from Lemma 2, we have the following.
Next we substitute this result into Lemma 1 and bound the subsequent summation with an integral (similar to what was done in [10]) to get the following, which implies our theorem.
Finally, to bound the CCA security of \(\textsf {TSN}\) we will use a well-known result of Maurer, Pietrzak, and Renner [18]. As in the analysis by Dai, Hoang and Tessaro [10] we note that the inverse of r rounds of \(\textsf {TSN}\) is also r rounds of \(\textsf {TSN}\) and thus applying [18] allows us to amplify our NCPA security bound to CCA security and gives the following corollary.
Corollary 1
Let \(\textsf {TSN}\) represent the permutation generated by r rounds of Targeted Swap-Or-Not. The CCA advantage of a CCA adversary A making at most q queries is
4 Mix-Swap-Unmix
Motivation. In the previous section, we saw a way to modify Swap-or-Not to get a targeted cipher, and the resulting cipher is indistinguishable from a random permutation when the adversary queries at most a constant fraction of the points. Recent papers [21, 24] have introduced small-domain ciphers that provide full security, meaning the ciphers are indistinguishable from random permutations even to an adversary allowed to query all domain points. This leaves the question of whether we can build fully-secure small-domain ciphers that support targeting without too much loss in efficiency.
We could certainly take an existing fully-secure cipher and apply a general transformation like Reverse Cycle Walking or Cycle Slicer to get a matching, and then only swap points that both lie in the target set. Unfortunately, fully-secure ciphers are already significantly less efficient than partially-secure counterparts like Swap-or-Not (which itself is far less efficient than the Feistel-based, standardized schemes), so using many rounds of a general transformation like Cycle Slicer is simply too slow to ever be practical. To be more concrete, if we start with the fully-secure cipher Mix-and-Cut [24] on domain \(\mathcal {X}=\{0,1\}^{30}\), that cipher internally needs 10,000 rounds of Swap-or-Not to achieve full security. If we then apply Cycle Slicer to target the set of bitstrings that represent 9-digit numbers, then [20] states we need 12,000 rounds of Cycle Slicer, with each of those 12,000 rounds applying the 10,000 rounds of Swap-or-Not inside of Mix-and-Cut. Thus, to get a targeted, fully-secure cipher with this method, we would need \(10000 \times 12000 = \) 120 million rounds of Swap-or-Not!
Clearly, there is a lot of efficiency loss in using a general transformation like Cycle Slicer on an existing fully-secure cipher. Thus, we instead turn to a different approach: directly constructing a fully-secure cipher that is matching-based and thus supports domain targeting. Like the existing fully-secure ciphers Mix-and-Cut and Sometimes-Recurse, we build our new fully-secure cipher from Swap-or-Not. We call our new algorithm Mix-Swap-Unmix (MSU). MSU, by default, enciphers points in the general domain \([N] = \{0,\ldots ,N-1\}\) for even N, and can support targeting to any domain \(\mathcal{S}\subseteq [N]\).
The Algorithm. Let \(\textsf {SN}_\mathrm {KF}\) denote the Swap-or-Not cipher with domain [N], with key \(\mathrm {KF}\) consisting of round keys \(K_1,\ldots ,K_r\) and round functions \(F_i:[N]\rightarrow \{0,1\}\). Our new cipher \(\textsf {MSU}\) will have domain \(\mathcal{S}\subseteq [N]\) and keys \((\mathbf {KF},\mathbf {G})\) consisting of m Swap-or-Not keys \(\mathbf {KF}=\{\mathrm {KF}_1,\ldots ,\mathrm {KF}_m\}\) and m round functions \(\mathbf {G}=\{G_1,\ldots ,G_m\}\) with each \(G_j: [N] \rightarrow \{0,1\}\). The code is shown in Fig. 2. The boxed statements are for domain targeting; if one’s desired domain is simply [N], the boxed portion can be excluded.
In words, to encipher a point \(X \in \mathcal{S}\subseteq [N]\) with \(\textsf {MSU}\), we first apply r rounds of the Swap-or-Not cipher to get a new point Z. If Z is even, it is swapped with \(Z+1\), otherwise it is swapped with \(Z-1\). We then apply the inverse of the Swap-or-Not cipher applied earlier in the round to get a new point \(X'\). If X and \(X'\) are both in \(\mathcal{S}\) and an additional bit flip is 1, then the swap of X and \(X'\) becomes official; otherwise, if either the bit flip is 0 or one or both of the points is not in \(\mathcal{S}\), then X and \(X'\) are simply mapped to themselves for this round of MSU. Thus, in one round of MSU, a point X is either mapped to \(\textsf {SN}^{-1}_{\mathrm {KF}_j}(\textsf {SN}_{\mathrm {KF}_j}(X) \oplus 1)\) or it is simply mapped back to itself.
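The description above can be turned into a small toy sketch. It follows the prose only (Fig. 2 is not reproduced here), and several pieces are assumptions made for illustration: the helper names are hypothetical, SHA-256 stands in for the random round functions \(F_i\) and \(G_j\), round keys are derived by a hash reduced modulo N (which is slightly biased), and \(G_j\) is evaluated on \(\max (X, X')\). None of this is a secure or recommended instantiation.

```python
import hashlib

def prf_bit(key: bytes, label: str, j: int, i: int, x: int) -> int:
    """Hypothetical stand-in for a random round function: one hash-derived bit."""
    data = f"{label}|{j}|{i}|{x}".encode()
    return hashlib.sha256(key + data).digest()[0] & 1

def round_key(key: bytes, j: int, i: int, N: int) -> int:
    """Hypothetical derivation of Swap-or-Not round key K_i for MSU round j."""
    digest = hashlib.sha256(key + f"K|{j}|{i}".encode()).digest()
    return int.from_bytes(digest, "big") % N  # toy only: slight modulo bias

def sn(key, j, x, N, r, inverse=False):
    """r rounds of Swap-or-Not on [N]. Each round is an involution, so the
    inverse simply runs the same rounds in reverse order."""
    rounds = range(r - 1, -1, -1) if inverse else range(r)
    for i in rounds:
        partner = (round_key(key, j, i, N) - x) % N
        if prf_bit(key, "F", j, i, max(x, partner)) == 1:
            x = partner
    return x

def msu_round(key, j, x, N, r, target):
    """One MSU round: mix with SN, swap Z with Z xor 1, unmix with SN^{-1}."""
    z = sn(key, j, x, N, r)
    x_prime = sn(key, j, z ^ 1, N, r, inverse=True)  # N even keeps z ^ 1 in [N]
    both_in_target = x in target and x_prime in target
    if both_in_target and prf_bit(key, "G", j, 0, max(x, x_prime)) == 1:
        return x_prime
    return x

def msu(key, x, N, r, m, target):
    """m rounds of MSU restricted to the target set."""
    for j in range(m):
        x = msu_round(key, j, x, N, r, target)
    return x

# Toy usage: N = 16, target set {0, ..., 9}.
key, N, r, m = b"demo key", 16, 8, 6
target = set(range(10))
images = [msu(key, x, N, r, m, target) for x in sorted(target)]
assert sorted(images) == sorted(target)  # a permutation of the target set
```

Because every round is an involution, deciphering in this sketch would apply the same rounds in the opposite order.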
Each round of \(\textsf {MSU}\) gives a matching, a permutation on \(\mathcal {X}\) made up of only transpositions. This follows from Proposition 2 that states that if \(\pi \) and \(\sigma \) are permutations, then the permutation \(\pi \circ \sigma \circ \pi ^{-1}\) has the same cycle structure as \(\sigma \). Since in \(\textsf {MSU}\) the “inner” permutation \(\sigma \) simply consists of swaps of points Z with \(Z \oplus 1\), the overall cycle structure of \(\textsf {MSU}\) will also be made up of just swaps/transpositions.
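The cycle-structure argument can be illustrated numerically. The sketch below, with hypothetical helper names, conjugates the inner permutation \(\sigma (Z) = Z \oplus 1\) by a random permutation \(\pi \) and checks that the result still consists of \(N/2\) transpositions; it is an illustration of the proposition, not a proof.

```python
import random
from collections import Counter

def cycle_type(perm):
    """Return a Counter mapping cycle length -> number of cycles of that length."""
    seen, lengths = set(), []
    for start in range(len(perm)):
        if start in seen:
            continue
        length, x = 0, start
        while x not in seen:
            seen.add(x)
            x = perm[x]
            length += 1
        lengths.append(length)
    return Counter(lengths)

def conjugate(pi, sigma):
    """Return the permutation pi o sigma o pi^{-1} as a list."""
    inv = [0] * len(pi)
    for a, b in enumerate(pi):
        inv[b] = a
    return [pi[sigma[inv[x]]] for x in range(len(pi))]

N = 10
sigma = [z ^ 1 for z in range(N)]   # swaps each even Z with Z + 1
rng = random.Random(1)
pi = list(range(N))
rng.shuffle(pi)

# Conjugation preserves the cycle type: still N/2 disjoint transpositions.
assert cycle_type(sigma) == Counter({2: N // 2})
assert cycle_type(conjugate(pi, sigma)) == Counter({2: N // 2})
```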
Security. We now formally show this construction gives a fully-secure cipher on \(\mathcal{S}\), meaning it is indistinguishable from a random permutation even to an adversary that can see all \(|\mathcal{S}|\) input-output mappings.
Theorem 2
Let \(\textsf {MSU}\) be described as above, with m rounds, each of which uses r rounds of Swap-or-Not. Then,
where \(\varDelta _1 = \frac{2N}{\sqrt{r/2+1}} \left( \frac{7}{8} \right) ^{(r/2+1)/2} + e^{\frac{-N}{40}}\) and \( \varDelta _2 = |\mathcal{S}|^{1-(2m/T)}, \) where
Before proving the theorem, we note that the presence of the \(e^{\frac{-N}{40}}\) term means that MSU does not provide good security for very small domains. Yet, this term is not problematic for domains like those discussed in the Introduction where N is, say, \(2^{30}\).
Proof
Let \(\mathcal{S}\subseteq [N]\) and let \(\textsf {MSU}: \mathcal{K}\times \mathcal{S}\rightarrow \mathcal{S}\) be the m-round Mix-Swap-Unmix algorithm as defined in Sect. 4 with randomly chosen round keys \(\mathbf {KF}\), randomly chosen round functions \(\mathbf {G}\), and using the r-round Swap-or-Not cipher on domain [N]. Let A be a CCA adversary against MSU that queries every point in \(\mathcal{S}\). We wish to bound the following advantage
To do so, we will use a sequence of game transitions, starting with \(\mathsf {Gm}_0 = \mathsf {SPRP1}\) and making small changes to the games until we have \(\mathsf {SPRP0}\).
For the rest of the proof, we will write \({\Pr \left[ \,{\mathsf {Gm}}\,\right] }\) instead of \({\Pr \left[ \,{\mathsf {Gm}^A\Rightarrow 1}\,\right] }\) for brevity. For our first game transition, we will modify the \(\mathsf {Enc}\) procedure to apply the round functions \(\mathbf {G}\) to the maximum of Z and \(Z'\), instead of to the max of X and \(X'\). Let the resulting game be \(\mathsf {Gm}_1\). Regardless of this change, the round function still just associates a random bit flip with the pairs of points that are matched by this round, so \({\Pr \left[ \,{\mathsf {Gm}_0}\,\right] } = {\Pr \left[ \,{\mathsf {Gm}_1}\,\right] }\).
Our next game, \(\mathsf {Gm}_2\), is the same as \(\mathsf {Gm}_1\) but with the random round functions \(\mathbf {G}\) replaced by bit flips that take place in the \(\mathbf{main }\) function and are associated to every possible odd Z value in [N]; there are separate sets of bit flips for each round of MSU (placed into a table \(\mathtt {B}\)), just as there are separate round functions \(G_j\) for each round. The \(\mathsf {Enc}\) procedure then uses the table \(\mathtt {B}\) with these bit flips in place of \(G_j\) in each round to determine if swaps take place. Detailed code for game \(\mathsf {Gm}_2\) is given in Appendix B. Since random round functions with 0/1 outputs have just been replaced by random bit flips, \({\Pr \left[ \,{\mathsf {Gm}_2}\,\right] } = {\Pr \left[ \,{\mathsf {Gm}_1}\,\right] }\).
Notice that if, for any round j, too many bit flips are 1, then a bad flag \(\mathsf {bad}_j\) is set. This will be needed later in the proof, but we point out here that the bad events only depend on the sum of independent bit flips in \(\mathbf{main }\), so we will be able to easily bound the probability of these events with a Chernoff bound.
Our next sequence of game transitions will replace Swap-or-Not in each round of MSU with a randomly chosen permutation on [N]. But, care must be taken, since our adversary A against MSU may query all domain points, yet Swap-or-Not is only proven secure against adversaries that query a constant fraction of the domain points. Intuitively, we will be able to overcome this “gap” by only making queries to Swap-or-Not when the round bits in the table \(\mathtt {B}\) are 1.
More formally, we define a sequence of hybrid games \(\mathsf {H}_0,\ldots ,\mathsf {H}_m\). The first hybrid game, \(\mathsf {H}_0\), is identical to \(\mathsf {Gm}_2\), meaning it uses the bit table \(\mathtt {B}\) in place of random round functions. In game \(\mathsf {H}_\ell \), the first \(\ell \) rounds of MSU use a completely random permutation, while the remaining rounds use Swap-or-Not. This means that the last hybrid game, \(\mathsf {H}_m\), is identical to \(\mathsf {Gm}_2\) but with every round of MSU using a random permutation on [N] in place of Swap-or-Not. We now claim that for every \(i \in \{1,\ldots ,m\}\), \({\Pr \left[ \,{\mathsf {H}_{i-1}}\,\right] } - {\Pr \left[ \,{\mathsf {H}_{i}}\,\right] } \le \mathbf {Adv}^{\mathrm {cca}}_{\textsf {SN}}(3N/4) + e^{\frac{-N}{40}}\).
To prove this, we provide a CCA adversary B against Swap-or-Not that makes at most \(3N/4\) oracle queries. These queries will all be to the decryption oracle on the elements of \(\mathsf {odd}(N)\) for which a bit flip is 1. Adversary B will run adversary A, answering its queries using its own oracles. If adversary B has an SN oracle, then it will end up simulating \(\mathsf {H}_{i-1}\) for A, while if it has a random permutation oracle, it will end up simulating \(\mathsf {H}_i\).
Before we get to the exact details of this adversary B, we expand the equation in the above claim to take into account the event that \(\mathsf {bad}_i\) is set to true. In the following equations, let \(\mathsf {bad}_i\) denote the event that the flag \(\mathsf {bad}_i\) (which, recall, means that too many bit flips are 1 in the part of the table \(\mathtt {B}\) used in round i of MSU) is set to \(\mathsf {true}\) during the execution of the game. Note that the probability of \(\mathsf {bad}_i\) being set to \(\mathsf {true}\) is the same in any hybrid game, since they all have identical \(\mathbf{main }\) procedures. Now,
We are now ready to specify our adversary B against Swap-or-Not. Adversary B is given a Swap-or-Not oracle and will run adversary A and try to simulate its environment to match the hybrid games \(\mathsf {H}_{i-1}\) and \(\mathsf {H}_{i}\). If B has a real Swap-or-Not oracle, then it will end up simulating \(\mathsf {H}_{i-1}\), while if it has a random permutation oracle it will end up simulating \(\mathsf {H}_{i}\). To simulate round i of the MSU algorithm, B first flips coins just like in the \(\mathbf{main }\) procedure of the hybrid games to populate the \(\mathtt {B}\) table. If \(\mathsf {bad}_i\) (the bad flag for round i) gets set to \(\mathsf {true}\), meaning too many coin flips ended up as 1 for that round of MSU, then adversary B needs to stop and simply output a random 0/1 guess. If \(\mathsf {bad}_i\) is not set, then B proceeds by querying its own SN oracle on all z and \(z \oplus 1\) for which \(\mathtt {B}[i][z]=1\). B now runs A and can properly complete round i of MSU for A on any query, since the only way round i can affect a point X is if the corresponding bit in \(\mathtt {B}\) is 1. Because B queried every such point, it will know what to do with any given X or \(X'\). Thus, as long as the \(\mathsf {bad}_i\) flag is not set, B will perfectly simulate the hybrid game for A.
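The \(e^{-N/40}\) term can be related to the query budget with a short calculation. The sketch below makes an assumption that is not stated in this section: the flag \(\mathsf {bad}_i\) is set when more than \(3N/8\) of the \(N/2\) bit flips are 1, which is the threshold that keeps B within \(3N/4\) queries (two per flip equal to 1). Under that assumption, the multiplicative Chernoff bound \(\Pr [X \ge (1+\delta )\mu ] \le \exp (-\delta ^2\mu /(2+\delta ))\) with \(\mu = N/4\) and \(\delta = 1/2\) evaluates to \(e^{-N/40}\). The code compares this bound with the exact binomial tail for a few small N; the function names are hypothetical.

```python
from math import comb, exp

def bad_probability(N):
    """Exact probability that more than 3N/8 of N/2 fair bit flips are 1
    (assumed threshold for the bad flag; N should be a multiple of 8)."""
    flips = N // 2
    threshold = 3 * N // 8
    tail = sum(comb(flips, k) for k in range(threshold + 1, flips + 1))
    return tail / 2 ** flips

def chernoff_bound(N):
    """exp(-delta^2 * mu / (2 + delta)) with mu = N/4 and delta = 1/2."""
    mu, delta = N / 4, 0.5
    return exp(-delta ** 2 * mu / (2 + delta))  # equals exp(-N / 40)

for N in (16, 64, 256, 1024):
    assert bad_probability(N) <= chernoff_bound(N)
```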
In the equations below, let \(\mathsf {S1}\) be short for \(\mathsf {SPRP1}^B \Rightarrow 1\) and \(\mathsf {S0}\) be short for \(\mathsf {SPRP0}^B \Rightarrow 1\). We can now see adversary B’s advantage
When the \(\mathsf {bad}_i\) flag is not set, adversary B running in the \(\mathsf {SPRP1}\) game is perfectly simulating the hybrid game \(\mathsf {H}_{i-1}\) and B running in \(\mathsf {SPRP0}\) is perfectly simulating the hybrid game \(\mathsf {H}_{i}\). Thus, combining the equations above gives
where adversary B makes at most \(q = (3/4)N\) queries to its oracle. Applying the bound from Eq. (1) in Sect. 3 and Proposition 1 to our hybrid argument over m rounds gives us the \(\varDelta _1\) bound in our theorem statement.
Now, continuing with our game transitions, let \(\mathsf {Gm}_3\) be the same as \(\mathsf {H}_m\), but with the bit flips moved into the \(\mathsf {Enc}\) procedure and taking place at the time they are needed (in the if). This syntactic change has no effect on the output of the game. Next, we will transition from \(\mathsf {Gm}_3\) to a game \(\mathsf {Gm}_4\) in which each round of MSU now applies a randomly chosen perfect matching to X to get \(X'\) instead of computing \(X' \leftarrow \pi ^{-1}(\pi (X)\oplus 1)\).
We now claim that the new version of MSU in \(\mathsf {Gm}_4\) is actually a matching exchange process. This specific matching exchange process, where a perfect matching on [N] is then restricted to a subset \(\mathcal{S}\) (i.e., matchings that do not pair up points in \(\mathcal{S}\) are thrown out), is analyzed in Appendix A. We can apply Theorem 5 in that appendix to show that \( {\Pr \left[ \,{\mathsf {Gm}_4^A \Rightarrow 1}\,\right] } - {\Pr \left[ \,{\mathsf {SPRP0}^A_\textsf {MSU}\Rightarrow 1}\,\right] } \le \varDelta _2 \) where \(\varDelta _2\) is the bound from Theorem 5. Combining all of our bounds on the above game transitions completes the proof of Theorem 2. \(\square \)
Discussion and Extensions. Using the \(\varDelta _1\) and \(\varDelta _2\) bounds above, we can see that we need a few hundred rounds of Swap-or-Not within each of about 5000 rounds of MSU, to get low adversarial advantage. While this is still a lot of rounds, it is substantially less than the hundreds of millions of rounds needed in previous work.
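To make the comparison concrete, the following sketch counts Swap-or-Not round evaluations per enciphered point for the two approaches. The function names are hypothetical, and the value r = 300 is only a placeholder for "a few hundred"; it is not a recommended parameter. The factor of 2 reflects that each MSU round applies Swap-or-Not once forward and once in the inverse direction.

```python
def composed_cost(inner_sn_rounds: int, outer_rounds: int) -> int:
    """Swap-or-Not rounds when a generic transformation wraps a fully-secure
    cipher: every outer round re-runs all inner rounds."""
    return inner_sn_rounds * outer_rounds

def msu_cost(r: int, m: int) -> int:
    """Swap-or-Not rounds for m rounds of MSU, each using SN and SN^{-1}."""
    return 2 * r * m

# Figures from the text: 10,000 inner rounds and 12,000 Cycle Slicer rounds.
generic = composed_cost(10_000, 12_000)
# Placeholder r = 300 with about 5000 rounds of MSU.
direct = msu_cost(300, 5_000)
print(generic, direct, generic / direct)
```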
Additionally, we mention two extensions of this result. First, we have presented MSU as a cipher on [N] that can be targeted to a domain \(\mathcal{S}\subseteq [N]\). If we are only interested in a cipher on [N] and do not need targeting, then we can improve the full security bound in Theorem 2 by applying a recent result of Bernstein [5] on the mixing time of involution walks, which are essentially one type of matching exchange process. More details can be found in Appendix A, but our \(\varDelta _2\) term in the above theorem will become the value in Corollary 2. Then, for the case where \(N=10^9\), we will only need about 220 rounds of MSU to get the \(\varDelta _2\) term less than \(10^{-9}\).
Second, our MSU algorithm as described and analyzed above works best when \(|\mathcal{S}| \ge |\mathcal {X}|/2\). If the target set is smaller than that, we can show the round function \(\mathbf {G}\) (which essentially does bit flips that determine if a swap should take place) can be removed, which speeds up mixing. Showing this is non-trivial, since the resulting algorithm no longer appears to be a matching exchange process. We analyze the resulting process in more detail in Appendix A, Corollary 3.
Notes
1. For example, if one must encipher dates in the form mm/dd, then a custom ranking might map each date to a day numbered 0–365 in the obvious way.
References
Bellare, M., Hoang, V.T.: Identity-based format-preserving encryption. In: Thuraisingham, B.M., Evans, D., Malkin, T., Xu, D. (eds.) ACM CCS 2017, pp. 1515–1532. ACM Press, October/November 2017
Bellare, M., Hoang, V.T., Tessaro, S.: Message-recovery attacks on Feistel-based format preserving encryption. In: Weippl, E.R., Katzenbeisser, S., Kruegel, C., Myers, A.C., Halevi, S. (eds.) ACM CCS 2016, pp. 444–455. ACM Press, October 2016
Bellare, M., Ristenpart, T., Rogaway, P., Stegers, T.: Format-preserving encryption. In: Jacobson, M.J., Rijmen, V., Safavi-Naini, R. (eds.) SAC 2009. LNCS, vol. 5867, pp. 295–312. Springer, Heidelberg (2009). https://doi.org/10.1007/978-3-642-05445-7_19
Bellare, M., Rogaway, P., Spies, T.: The FFX mode of operation for format-preserving encryption, February 2010. http://csrc.nist.gov/groups/ST/toolkit/BCM/documents/proposedmodes/ffx/ffx-spec.pdf
Bernstein, M.: The mixing time for a random walk on the symmetric group generated by random involutions. In: Proceedings of the 28th International Conference on Formal Power Series and Algebraic Combinatorics (FPSAC) (2016)
Black, J., Rogaway, P.: Ciphers with arbitrary finite domains. In: Preneel, B. (ed.) CT-RSA 2002. LNCS, vol. 2271, pp. 114–130. Springer, Heidelberg (2002). https://doi.org/10.1007/3-540-45760-7_9
Brier, E., Peyrin, T., Stern, J.: BPS: a format-preserving encryption proposal. http://csrc.nist.gov/groups/ST/toolkit/BCM/documents/proposedmodes/bps/bps-spec.pdf
Brightwell, M., Smith, H.: Using datatype-preserving encryption to enhance data warehouse security. In: National Information Systems Security Conference (NISSC) (1997)
Czumaj, A., Kutylowski, M.: Delayed path coupling and generating random permutations. Random Struct. Algorithms 17, 238–259 (2000)
Dai, W., Hoang, V.T., Tessaro, S.: Information-theoretic indistinguishability via the chi-squared method. In: Katz, J., Shacham, H. (eds.) CRYPTO 2017. LNCS, vol. 10403, pp. 497–523. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-63697-9_17
Durak, F.B., Vaudenay, S.: Breaking the FF3 format-preserving encryption standard over small domains. In: Katz, J., Shacham, H. (eds.) CRYPTO 2017. LNCS, vol. 10402, pp. 679–707. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-63715-0_23
Dworkin, M.: Recommendation for block cipher modes of operation: methods for format preserving-encryption. NIST Special Publication 800–38G (2016). http://dx.doi.org/10.6028/NIST.SP.800-38G
Granboulan, L., Pornin, T.: Perfect block ciphers with small blocks. In: Biryukov, A. (ed.) FSE 2007. LNCS, vol. 4593, pp. 452–465. Springer, Heidelberg (2007). https://doi.org/10.1007/978-3-540-74619-5_28
Grubbs, P., Ristenpart, T., Yarom, Y.: Modifying an enciphering scheme after deployment. In: Coron, J.-S., Nielsen, J.B. (eds.) EUROCRYPT 2017. LNCS, vol. 10211, pp. 499–527. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-56614-6_17
Hoang, V.T., Morris, B., Rogaway, P.: An enciphering scheme based on a card shuffle. In: Safavi-Naini, R., Canetti, R. (eds.) CRYPTO 2012. LNCS, vol. 7417, pp. 1–13. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-32009-5_1
Luchaup, D., Dyer, K.P., Jha, S., Ristenpart, T., Shrimpton, T.: LibFTE: a toolkit for constructing practical, format-abiding encryption schemes. In: Proceedings of the 23rd USENIX Security Symposium, pp. 877–891 (2014)
Luchaup, D., Shrimpton, T., Ristenpart, T., Jha, S.: Formatted encryption beyond regular languages. In: Ahn, G.J., Yung, M., Li, N. (eds.) ACM CCS 2014, pp. 1292–1303. ACM Press, November 2014
Maurer, U., Pietrzak, K., Renner, R.: Indistinguishability amplification. In: Menezes, A. (ed.) CRYPTO 2007. LNCS, vol. 4622, pp. 130–149. Springer, Heidelberg (2007). https://doi.org/10.1007/978-3-540-74143-5_8
Miracle, S., Yilek, S.: Reverse cycle walking and its applications. In: Cheon, J.H., Takagi, T. (eds.) ASIACRYPT 2016. LNCS, vol. 10031, pp. 679–700. Springer, Heidelberg (2016). https://doi.org/10.1007/978-3-662-53887-6_25
Miracle, S., Yilek, S.: Cycle slicer: an algorithm for building permutations on special domains. In: Takagi, T., Peyrin, T. (eds.) ASIACRYPT 2017. LNCS, vol. 10626, pp. 392–416. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-70700-6_14
Morris, B., Rogaway, P.: Sometimes-recurse shuffle - almost-random permutations in logarithmic expected time. In: Nguyen, P.Q., Oswald, E. (eds.) EUROCRYPT 2014. LNCS, vol. 8441, pp. 311–326. Springer, Heidelberg (2014). https://doi.org/10.1007/978-3-642-55220-5_18
Morris, B., Rogaway, P., Stegers, T.: How to encipher messages on a small domain. In: Halevi, S. (ed.) CRYPTO 2009. LNCS, vol. 5677, pp. 286–302. Springer, Heidelberg (2009). https://doi.org/10.1007/978-3-642-03356-8_17
Naor, M., Reingold, O.: Constructing pseudo-random permutations with a prescribed structure. J. Cryptol. 15(2), 97–102 (2002)
Ristenpart, T., Yilek, S.: The mix-and-cut shuffle: small-domain encryption secure against N queries. In: Canetti, R., Garay, J.A. (eds.) CRYPTO 2013. LNCS, vol. 8042, pp. 392–409. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-40041-4_22
Spies, T.: Format-preserving encryption. Unpublished whitepaper (2008). https://www.voltage.com/wp-content/uploads/Voltage-Security-WhitePaper-Format-Preserving-Encryption.pdf
Acknowledgements
We thank the SAC 2018 anonymous reviewers for their detailed and helpful comments.
Appendices
A Analyzing a Matching Exchange Process
In order to bound the number of rounds of \(\textsf {MSU}\) that are needed, we analyze the underlying matching exchange process. We obtain three different bounds depending on how the size of the domain \(\mathcal {X}\) relates to the size of the target set \(\mathcal{S}\). Our best bound is for the case \(|\mathcal{S}| = |\mathcal {X}|\), where we show that the process is an involution walk and rely on a recent result of Bernstein [5]. When \(|\mathcal{S}| < |\mathcal {X}|\) we rely on previous work [9, 20] to bound the variation distance of a general matching exchange process. In the case where \(|\mathcal{S}| \ge |\mathcal {X}|/2\), in order for \(\textsf {MSU}\) to be a matching exchange process, we have added an additional bit flip to each pair selected (the round function G). When \(|\mathcal{S}| < |\mathcal {X}|/2\) we prove that there exists a matching exchange process that results in the identical distribution on matchings generated by \(\textsf {MSU}\), and thus we do not need to add the additional bit flip. By eliminating this extra bit flip we improve the parameters of the matching exchange process and provide a tighter bound on the variation distance.
Recall that at each step of a matching exchange process a parameter \(\kappa \le |S|/2\) is selected according to some distribution. Next a matching of size \(\kappa \) on the set S is selected uniformly at random. Finally for each pair in the matching a bit is flipped independently to determine whether that particular pair is kept in the matching. For the purposes of this section, we will view \(\textsf {MSU}\) as generating a perfect matching on \(\mathcal {X}\) and then ignoring all pairs in the matching except for those where both points are in our target set \(\mathcal{S}\subseteq \mathcal {X}\). We consider the ideal scenario where each round of \(\textsf {MSU}\) generates a uniformly random perfect matching on \(\mathcal {X}\).
An Involution Walk. An involution walk is defined as a random walk on the symmetric group \(S_n\) for n even where at each step a uniformly random perfect matching on \(S_n\) is generated and then each pair in the matching is applied with probability \(1-p\) and discarded with probability p. It is straightforward to see that as intended, when \(|\mathcal {X}| = |\mathcal{S}|\), \(\textsf {MSU}\) is indeed an involution walk on the set \(\mathcal {X}\). Bernstein proves the following theorem for any involution walk.
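One step of the walk just defined can be sketched as follows. The sketch samples a uniformly random perfect matching by shuffling the n points and pairing consecutive entries, then applies each pair as a transposition with probability \(1-p\). The function name and the list representation of a permutation are hypothetical choices for illustration.

```python
import random

def involution_step(perm, p, rng):
    """Apply one involution-walk step to `perm` (a list of length n, n even).
    Each pair of a uniformly random perfect matching is applied with
    probability 1 - p and discarded with probability p."""
    n = len(perm)
    points = list(range(n))
    rng.shuffle(points)                    # uniform order of the points
    out = list(perm)
    for a, b in zip(points[0::2], points[1::2]):
        if rng.random() >= p:              # happens with probability 1 - p
            out[a], out[b] = out[b], out[a]
    return out

rng = random.Random(0)
state = list(range(8))
for _ in range(5):
    state = involution_step(state, 0.5, rng)
assert sorted(state) == list(range(8))     # still a permutation
```

With \(p=1/2\) this matches the ideal behavior of one round of \(\textsf {MSU}\) on the full domain, where the round function G supplies the fair bit flip.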
Theorem 3
(Bernstein [5]). Let \(t = \log _{\frac{2}{1+p}}(n) + \frac{c}{\ln (\frac{2}{1+p})}\) and let n be such that \(\frac{10\ln (n+2)}{\sqrt{(n+2)/2}-1}\le \ln \left( \frac{2}{1+p}\right) \) and \(n-1 > \sqrt{n/2}(1+\ln (n))\). Then \(||P^{*t} - U||_{TV} \le e^{-c/2}\).
In order to apply this theorem to \(\textsf {MSU}\) we will require \(n \ge 2^{19}\) and let \(p=1/2\) which gives the following corollary.
Corollary 2
For \(n \ge 2^{19}\) the involution walk with parameter \(p=1/2\) satisfies \(||P^{*t} - U||_{TV} \le n^{1/2}e^{-t\ln (4/3)/2}\).
Proof
Solving for c in the expression \(t = \log _{\frac{2}{1+p}}(n) + \frac{c}{\ln (\frac{2}{1+p})}\) and then simplifying gives \(c = t\ln (\frac{2}{1+p}) - \ln n\). Substituting this into the equation for variation distance and simplifying gives \(||P^{*t} - U||_{TV} \le n^{1/2}e^{-t\ln (\frac{2}{1+p})/2}\). Fixing \(p=1/2\) gives the desired result \(||P^{*t} - U||_{TV} \le n^{1/2}e^{-t\ln (4/3)/2}\). Requiring that \(n \ge 2^{19}\) satisfies the requirements \(\frac{10\ln (n+2)}{\sqrt{(n+2)/2}-1}\le \ln (4/3)\) and \(n-1 > \sqrt{n/2}(1+\ln (n))\), and completes the proof. \(\square \)
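The bound derived in this proof, \(n^{1/2}e^{-t\ln (2/(1+p))/2}\), is easy to evaluate numerically. The sketch below, with hypothetical function names, checks the two preconditions of Theorem 3 and solves for the smallest number of rounds t that pushes the bound below a chosen threshold.

```python
from math import ceil, exp, log, sqrt

def bernstein_conditions_hold(n, p=0.5):
    """Check the two preconditions on n from Theorem 3."""
    gap = log(2 / (1 + p))
    first = 10 * log(n + 2) / (sqrt((n + 2) / 2) - 1) <= gap
    second = n - 1 > sqrt(n / 2) * (1 + log(n))
    return first and second

def tv_bound(n, t, p=0.5):
    """Variation-distance bound n^(1/2) * exp(-t * ln(2/(1+p)) / 2)."""
    return sqrt(n) * exp(-t * log(2 / (1 + p)) / 2)

def rounds_needed(n, eps, p=0.5):
    """Smallest integer t with tv_bound(n, t, p) <= eps."""
    return ceil(2 * log(sqrt(n) / eps) / log(2 / (1 + p)))

n = 10 ** 9
assert bernstein_conditions_hold(n)
t = rounds_needed(n, 1e-9)
assert tv_bound(n, t) <= 1e-9
print(t)
```

For \(n = 10^9\) and a threshold of \(10^{-9}\) this evaluates to a value just under 220, in line with the round count quoted in the discussion of extensions.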
General Matching Exchange Processes. When \(|\mathcal{S}| < |\mathcal {X}|\), we will use the following result of Miracle and Yilek [20] which bounds the variation distance of a matching exchange process.
Theorem 4
(Miracle, Yilek [20]).
Let \(T = \max \left( 40\ln (2n^2), \frac{10\ln (n/9)}{\ln (1 + p_1p_2(7/36)((7/9)n^2 -n))} \right) + \frac{72\ln (2n^2)}{p_1n}\), then
\[ ||\nu _{\text {ME}^r} - \mu _{\textsf {UN}}||_{TV} \le n^{1-2r/T}, \]
where \(\nu _{\text {ME}^r}\) is the distribution after r rounds of a matching exchange process on n elements and \(\mu _{\textsf {UN}}\) is the uniform distribution on permutations of n elements.
In order to apply the theorem we need to bound two parameters \(p_1\) and \(p_2\) of the associated matching exchange process which are defined below.
Definition 2
(Miracle, Yilek [20]).
-
1.
For any points x, y the probability that a pair (x, y) is part of a matching is at least \(p_1\).
-
2.
For any points x, y, z, and w conditioned on (x, y) being a pair in the matching, the probability that (z, w) is also in the matching is at least \(p_2\).
We begin by considering the \(\textsf {MSU}\) process as defined in Fig. 2 and prove the following. We will use this bound for the case when \(|\mathcal {X}| > |\mathcal{S}| \ge |\mathcal {X}|/2\).
Theorem 5
Let \(T = \max \left( 40\ln (2|\mathcal{S}|^2),\frac{10\ln (|\mathcal{S}|/9)}{\ln (1 + (7/36|\mathcal {X}|^2)((7/9)|\mathcal{S}|^2 -|\mathcal{S}|))} \right) + \frac{72|\mathcal {X}|\ln (2|\mathcal{S}|^2)}{|\mathcal{S}|}\), then
\[ ||\nu _{{\textsf {MSU}}^r} - \mu _s||_{TV} \le |\mathcal{S}|^{1-2r/T}, \]
where \(\nu _{{\textsf {MSU}}^r}\) is the distribution after r rounds of \(\textsf {MSU}\), \(|\mathcal{S}|\) is the size of the target set \(\mathcal{S}\), \(|\mathcal {X}|\) is the size of the larger domain set \(\mathcal {X}\), and \(\mu _s\) is the uniform distribution on permutations of \(|\mathcal{S}|\) elements.
Proof
In order to apply Theorem 4 we first bound the parameters \(p_1\) and \(p_2\). In \(\textsf {MSU}\) the probability that we select a pair (x, y) with \(x,y \in \mathcal{S}\) is \(1/(|\mathcal {X}| -1)\) since there are \(|\mathcal {X}| -1\) choices for a particular point to get mapped to and each is equally likely. Thus \(p_1 = 1/(|\mathcal {X}| -1) > 1/|\mathcal {X}|\). Given that a pair (x, y) is already included in the matching, the probability that a second pair (z, w) is also included is \(1/(|\mathcal {X}| - 3)\) since there are \(|\mathcal {X}| -3\) remaining choices for z to get mapped to and each is equally likely. Thus \(p_2 = 1/(|\mathcal {X}| - 3) > 1/|\mathcal {X}|. \) Directly substituting these parameters into Theorem 4 completes the proof. \(\square \)
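The quantity T from Theorem 5 and the resulting \(\varDelta _2 = |\mathcal{S}|^{1-(2m/T)}\) term from Theorem 2 can be evaluated numerically. The sketch below assumes that the T appearing in \(\varDelta _2\) is the T of Theorem 5, and it uses the example from the main text: \(|\mathcal {X}| = 2^{30}\) with the target set of 9-digit numbers, so \(|\mathcal{S}| = 10^9\). The function names are hypothetical.

```python
from math import log, log1p

def theorem5_T(S: int, X: int) -> float:
    """T from Theorem 5 for target-set size S and domain size X."""
    first = 40 * log(2 * S * S)
    second = 10 * log(S / 9) / log1p((7 / (36 * X * X)) * ((7 / 9) * S * S - S))
    third = 72 * X * log(2 * S * S) / S
    return max(first, second) + third

def delta2(S: int, X: int, m: int) -> float:
    """Delta_2 = S^(1 - 2m/T) for m rounds of MSU."""
    return S ** (1 - 2 * m / theorem5_T(S, X))

S, X = 10 ** 9, 2 ** 30
print(theorem5_T(S, X))        # T for this example
print(delta2(S, X, 5000))      # Delta_2 at about 5000 rounds of MSU
```

For these sizes T comes out a little below 5000, so the exponent \(1-2m/T\) drops to about \(-1\) once m reaches roughly 5000, which agrees with the round count mentioned in the discussion of Sect. 4.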
Eliminating the Bit Flip. When \(|\mathcal{S}| < |\mathcal {X}|/2\) we are able to show that \(\textsf {MSU}\) is a matching exchange process without adding an additional bit flip for each pair in the matching and thus we can remove the round function G from Fig. 2. We prove the following.
Theorem 6
The distribution on matchings on the target set \(\mathcal{S}\) generated by \(\textsf {MSU}\) without the round function G is identical to the final distribution generated by a matching exchange process on \(\mathcal{S}\) with parameters \(p_1 = 2/|\mathcal {X}|\) and \(p_2 = 2/|\mathcal {X}|\), where \(|\mathcal {X}|\) is the size of the domain set \(\mathcal {X}\) and \(|\mathcal{S}| < |\mathcal {X}|/2\).
Proof
Our proof begins by giving a particular matching exchange process \(\mathcal {P}\) and associated distribution on \(\kappa \) and then proving that the distribution on matchings that results from this process is identical to the distribution that results from the \(\textsf {MSU}\) process. We then bound the matching exchange process parameters \(p_1\) and \(p_2\) for our given process.
Let \(\mathcal {P}\) be a matching exchange process where the probability that \(\kappa = i\) is given by \(p_i\). Let \(G_i\) be the probability that a particular matching of size i on \(\mathcal{S}\) (i.e. a matching with 2i points) is selected by \(\textsf {MSU}\). It is straightforward to see from the definition of \(\textsf {MSU}\) that \(G_i\) is the same for each matching of size i. Let \(m= |\mathcal{S}|/2\) be the size of a perfect matching on \(\mathcal{S}\) and \(\mathcal {M}_i\) be the number of perfect matchings on a set with 2i points. We now define \(p_i\) as follows,
Consider a particular matching \(m_i\) of size i on \(\mathcal{S}\). By definition, it is selected with probability \(G_i\) in \(\textsf {MSU}\). We will show that the probability it is selected by \(\mathcal {P}\) is also \(G_i\). In \(\mathcal {P}\) this matching is selected if we select any matching that contains \(m_i\) as a sub-matching and then flip the bits appropriately to just select the edges in \(m_i\). Thus in \(\mathcal {P}\) the probability that \(m_i\) is selected is the sum, over matching sizes from i to \(m\), of the number of matchings that contain \(m_i\) times the probability a matching of that size is selected times the probability we select the exact edges in \(m_i\), which is \((2^{-1})^{m}\). This gives us the following
We will prove by induction on i that \(G_i = {\Pr \left[ \,{m_i}\,\right] }\) for \(0\le i \le m\). For our base case let \(i=m\). Then we have,
Next we assume inductively that \(G_{i+1} = {\Pr \left[ \,{m_{i+1}}\,\right] }\) and then show that this holds for i as follows.
It remains to show that these choices of \(p_i\) form a probability distribution. To show this we need to show that for all \(0 \le i \le m\), \(p_i \ge 0\) and that \(\sum _{i=0}^{m} p_i= 1\). Given the above definition of the \(\{p_i\}\)’s, to show that for all \(0 \le i \le m\), \(p_i \ge 0\) it suffices to show that \(G_i - G_{i+1} > 0\) for all \(0\le i < m\). Recall that \(G_i\) is the probability that the \(\textsf {MSU}\) algorithm results in a particular matching on \(\mathcal{S}\) of size i. Additionally recall that the \(\textsf {MSU}\) process is equivalent to first generating a uniformly random perfect matching on \(\mathcal {X}\) and then removing all edges except those where both points are in \(\mathcal{S}\). Thus \(G_i\) is the number of matchings consistent with a particular matching of size i divided by the total number of matchings. If we fix a particular matching of size i on \(\mathcal{S}\) then there are \(2(m-i)\) remaining points in \(\mathcal{S}\) that are unmatched. In all consistent matchings these are matched with points in \(\mathcal {X}- \mathcal{S}\), of which there are \(|\mathcal {X}| - |\mathcal{S}|\). There are \(\left( {\begin{array}{c}|\mathcal {X}| - |\mathcal{S}|\\ |\mathcal{S}| - 2i\end{array}}\right) \) ways to choose these points and \((|\mathcal{S}| - 2i)!\) ways to match them with the remaining points in \(\mathcal{S}\). Finally there are \(\mathcal {M}_{|\mathcal {X}|/2 - |\mathcal{S}| +i}\) ways to match up the remaining points in \(\mathcal {X}- \mathcal{S}\). Combining these observations gives the following.
\[ G_i = \frac{\left( {\begin{array}{c}|\mathcal {X}| - |\mathcal{S}|\\ |\mathcal{S}| - 2i\end{array}}\right) \, (|\mathcal{S}| - 2i)! \; \mathcal {M}_{|\mathcal {X}|/2 - |\mathcal{S}| +i}}{\mathcal {M}_{|\mathcal {X}|/2}} \]
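The counting argument above can be checked by brute force on a small instance. The sketch below enumerates every perfect matching on a domain of 10 points, restricts each to a target set of 4 points, and compares the observed frequency of a fixed restricted matching of size i with the count described in the text (choices of partners outside the target set, ways to assign them, and matchings of the leftover points, divided by the total number of perfect matchings). All function names are hypothetical, and the instance sizes are chosen only so that enumeration is fast.

```python
from fractions import Fraction
from math import comb, factorial

def num_perfect_matchings(k):
    """M_k: number of perfect matchings on 2k points, (2k)! / (2^k k!)."""
    return factorial(2 * k) // (2 ** k * factorial(k))

def perfect_matchings(points):
    """Yield every perfect matching of the sorted list `points`."""
    if not points:
        yield []
        return
    first, rest = points[0], points[1:]
    for idx, partner in enumerate(rest):
        remaining = rest[:idx] + rest[idx + 1:]
        for tail in perfect_matchings(remaining):
            yield [(first, partner)] + tail

def g_formula(X, S, i):
    """Probability of a fixed size-i matching on the target set (sizes X, S)."""
    consistent = (comb(X - S, S - 2 * i) * factorial(S - 2 * i)
                  * num_perfect_matchings(X // 2 - S + i))
    return Fraction(consistent, num_perfect_matchings(X // 2))

def g_bruteforce(X, target, wanted):
    """Fraction of perfect matchings on range(X) whose restriction to
    `target` is exactly the set of pairs `wanted`."""
    hits = total = 0
    for matching in perfect_matchings(list(range(X))):
        total += 1
        restricted = {pr for pr in matching if pr[0] in target and pr[1] in target}
        hits += restricted == wanted
    return Fraction(hits, total)

X, target = 10, {0, 1, 2, 3}
examples = {0: set(), 1: {(0, 1)}, 2: {(0, 1), (2, 3)}}
values = [g_bruteforce(X, target, examples[i]) for i in range(3)]
assert values == [g_formula(X, len(target), i) for i in range(3)]
assert values[0] > values[1] > values[2]   # G_i decreases since |S| < |X|/2
```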
Since our goal is to show that \(G_i - G_{i+1} > 0\) for \(0\le i < m\), it suffices to show
This simplifies to the following which holds as long as \(|\mathcal{S}| < |\mathcal {X}| /2\),
We know that the distribution on matchings given by \(\textsf {MSU}\) is a valid probability distribution. Above we proved that the probability of any particular matching of size i is the same under both \(\textsf {MSU}\) and \(\mathcal {P}\). This implies that \(\sum _{i=0}^mG_i \times \mathcal {M}^m_i = 1\). Similarly, this implies that \(\sum _{i=0}^{m} p_i = \sum _{i=0}^mp_i (\mathcal {M}^m_i)^{-1}\cdot \mathcal {M}^m_i = \sum _{i=0}^mG_i \times \mathcal {M}^m_i =1\). Thus the \(\{p_i\}\)’s form a valid probability distribution as long as \(|\mathcal{S}| < |\mathcal {X}| /2\).
It remains to bound the two parameters \(p_1\) and \(p_2\) for the matching exchange process \(\mathcal {P}\). Recall from Definition 2 that \(p_1\) is a lower bound on the probability that, for any two points x and y, the pair (x, y) is included in the matching. Note that this is the probability in the matching exchange process before a bit is flipped for each pair in the matching. Recall that in \(\textsf {MSU}\) the probability that we select a pair (x, y) with \(x,y \in \mathcal{S}\) is \(1/(|\mathcal {X}| -1)\), since there are \(|\mathcal {X}| -1\) choices for a particular point to get mapped to and each is equally likely. Let \(p_1\) be the probability that a particular pair (x, y) is selected to be part of the matching in the corresponding matching exchange process \(\mathcal {P}\) that we analyzed above. This implies that \(p_1\cdot (1/2) = 1/(|\mathcal {X}| -1)\) and thus \(p_1 = 2/(|\mathcal {X}| -1) > 2/|\mathcal {X}|\).
Next, the parameter \(p_2\) is a lower bound on the probability that, for any four points x, y, z, and w in \(\mathcal{S}\), the pair (z, w) is part of the matching conditioned on the pair (x, y) being part of the matching. Again these are the probabilities for the underlying matching exchange process \(\mathcal {P}\). Let \(P_1\) be the event that the pair (x, y) is part of the original matching (before the bit flip) and \(P_2\) be the event that the pair (z, w) is part of the original matching. Similarly, let \(F_1\) be the event that the pair (x, y) is part of the final matching and \(F_2\) be the event that the pair (z, w) is part of the final matching. We are interested in \(p_2 = {\Pr }\left[ \, P_2\,\left| \right. \,P_1\,\right] \). Note that \({\Pr \left[ \,{P_1 \cap P_2}\,\right] } = 4{\Pr \left[ \,{F_1 \cap F_2}\,\right] }\). By the laws of conditional probability we have,
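Spelled out, the step can be written as the following chain of equalities. It uses \({\Pr \left[ \,{P_1}\,\right] } = 2{\Pr \left[ \,{F_1}\,\right] }\), the same factor of 1/2 per pair that relates \(p_1\) to \(1/(|\mathcal {X}| -1)\) above; that identity is stated here as an assumption of the sketch.

\[
p_2 \;=\; \Pr \left[ \, P_2 \mid P_1 \,\right]
\;=\; \frac{\Pr \left[ \, P_1 \cap P_2 \,\right] }{\Pr \left[ \, P_1 \,\right] }
\;=\; \frac{4\,\Pr \left[ \, F_1 \cap F_2 \,\right] }{2\,\Pr \left[ \, F_1 \,\right] }
\;=\; 2\,\Pr \left[ \, F_2 \mid F_1 \,\right] .
\]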
Recall that for \(\textsf {MSU}\), given that a pair (x, y) is already included in the matching, the probability that a second pair (z, w) is also included is \(1/(|\mathcal {X}| - 3)\), since there are \(|\mathcal {X}| -3\) remaining choices for z to get mapped to and each is equally likely. Thus \(p_2 = 2{\Pr }\left[ \, F_2\,\left| \right. \,F_1\,\right] = 2/(|\mathcal {X}| -3) > 2/|\mathcal {X}|\). \(\square \)
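The two probabilities \(1/(|\mathcal {X}| -1)\) and \(1/(|\mathcal {X}| -3)\) used in this argument can be confirmed by brute force on a small domain. The sketch below is illustrative only and assumes the domain is labelled 0, 1, ..., n-1 with n even. It enumerates every perfect matching, then counts those containing the pair (0, 1) and those containing both (0, 1) and (2, 3). The function names are hypothetical.

```python
from fractions import Fraction


def perfect_matchings(points):
    """Yield every perfect matching of a sorted tuple of points as a frozenset of pairs."""
    if not points:
        yield frozenset()
        return
    first, rest = points[0], points[1:]
    # Pair the smallest remaining point with each possible partner in turn.
    for idx, partner in enumerate(rest):
        remaining = rest[:idx] + rest[idx + 1:]
        for sub in perfect_matchings(remaining):
            yield sub | {(first, partner)}


def pair_probabilities(n):
    """Return (Pr[(0,1) matched], Pr[(2,3) matched | (0,1) matched]) under a uniform perfect matching."""
    all_matchings = list(perfect_matchings(tuple(range(n))))
    with_xy = [m for m in all_matchings if (0, 1) in m]
    with_both = [m for m in with_xy if (2, 3) in m]
    return (
        Fraction(len(with_xy), len(all_matchings)),
        Fraction(len(with_both), len(with_xy)),
    )
```

Enumeration grows as \((n-1)!!\), so this is only practical for small n (roughly n up to 14). By symmetry the choice of the specific pairs (0, 1) and (2, 3) does not affect the result.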
Directly substituting the parameters of the matching exchange process given by Theorem 6 into Theorem 4 gives the following corollary.
Corollary 3
Let \(T = \max \left( 40\ln (2|\mathcal{S}|^2), \frac{10\ln (|\mathcal{S}|/9)}{\ln (1 + (7/9|\mathcal {X}|^2)((7/9)|\mathcal{S}|^2 -|\mathcal{S}|))} \right) + \frac{36|\mathcal {X}|\ln (2|\mathcal{S}|^2)}{|\mathcal{S}|}\), then
where \(\nu _{{\textsf {MSU}}^r}\) is the distribution after r rounds of \(\textsf {MSU}\) without the round function G, \(|\mathcal{S}|\) is the size of the target set \(\mathcal{S}\), \(|\mathcal {X}|\) is the size of the larger domain set \(\mathcal {X}\), \(|\mathcal{S}|< |\mathcal {X}|/2\), and \(\mu _s\) is the uniform distribution on permutations of \(|\mathcal{S}|\) points.
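To get a feel for the size of T, the expression in Corollary 3 can be evaluated numerically. The sketch below is an illustration under a stated assumption: the factor written \((7/9|\mathcal {X}|^2)\) is read as \(7/(9|\mathcal {X}|^2)\), which is ambiguous in the typeset formula. The function name and argument names are hypothetical.

```python
import math


def corollary3_T(n_x, n_s):
    """Evaluate T from Corollary 3 for |X| = n_x and |S| = n_s.

    ASSUMPTION: the factor (7/9|X|^2) is parsed as 7 / (9 * |X|^2).
    """
    if not 0 < n_s < n_x / 2:
        raise ValueError("Corollary 3 requires |S| < |X| / 2")
    log_term = math.log(2 * n_s ** 2)
    growth = 1 + (7 / (9 * n_x ** 2)) * ((7 / 9) * n_s ** 2 - n_s)
    if growth <= 1:
        # The logarithm in the denominator must be positive.
        raise ValueError("need (7/9)|S|^2 > |S| for the second term to be defined")
    first = 40 * log_term
    second = 10 * math.log(n_s / 9) / math.log(growth)
    return max(first, second) + 36 * n_x * log_term / n_s
```

For example, `corollary3_T(1000, 100)` evaluates the expression for a domain of 1000 points and a target set of 100 points. If the other parse of the ambiguous factor is intended, only the `growth` line changes.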
B Game for Proof of Theorem 2
© 2019 Springer Nature Switzerland AG
Miracle, S., Yilek, S. (2019). Targeted Ciphers for Format-Preserving Encryption. In: Cid, C., Jacobson Jr., M. (eds) Selected Areas in Cryptography – SAC 2018. SAC 2018. Lecture Notes in Computer Science, vol 11349. Springer, Cham. https://doi.org/10.1007/978-3-030-10970-7_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-10969-1
Online ISBN: 978-3-030-10970-7