1 Introduction

In this era of “big data,” where organizations regularly harvest and store large amounts of customer data, the need to secure personal information in the face of data breaches has become essential. Encrypting sensitive personal and financial data like credit card numbers, social security numbers, and birth dates is an obvious way to defend against data breaches, but how to encrypt these diverse types of data is not always obvious. Practitioners are faced with the challenge of introducing encryption into large databases that interact with a potentially complex system of hardware and legacy software, while trying not to break anything. Given these challenges and constraints, it is easy to see the appeal of Format-Preserving Encryption (FPE) schemes, in which ciphertexts have the same format as plaintexts. For example, if one encrypts a 9 decimal digit US social security number with an FPE scheme, the resulting ciphertext would also be a 9 digit number. Such FPE schemes can often be “dropped in” to existing systems with little disruption.

Early attempts at constructing and analyzing FPE schemes were conducted by Brightwell and Smith [8] and later Spies [25]. The increasing practical interest in the problem, especially related to credit card encryption, has led to a recent surge in academic research on FPE and related problems [1, 2, 3, 6, 11, 14, 15, 16, 17, 19, 20, 22, 24]. There are even FPE standards, NIST SP 800-38G [12] and ANSI ASC X9.124, that include Feistel-based FPE schemes like FF1 [4] and FF3 [7].

There are a variety of techniques known for constructing a format-preserving encryption scheme to encipher points in domain \(\mathcal{S}\). Since block ciphers have traditionally been designed for bitstring domains, we cannot use an existing cipher (e.g., AES) without modification. Instead, there are generally three main strategies for constructing the desired encryption scheme. First, we could try to construct a cipher that is customized to work directly on domain \(\mathcal{S}\). This works best when \(\mathcal{S}\) has a relatively simple structure, like integers in the range \(\{0,\ldots ,N-1\}\), and many ciphers used in FPE are designed to work on this domain. (For example, such a cipher would work well for our social security number example; we would just need a cipher on \(\{0,\ldots ,N-1\}\) with \(N=10^9\)).

If the domain \(\mathcal{S}\) is more complicated, then a second option for building an FPE scheme is to try to find a way to rank the elements of the domain, then employ a cipher that works on \(\{0,\ldots ,N-1\}\) with \(N=|\mathcal{S}|\), and then unrank. Ranking the elements of \(\mathcal{S}\) means finding an efficient way to map (and unmap) each element \(m \in \mathcal{S}\) to a unique element \(x \in \{0,\ldots ,|\mathcal{S}|-1\}\). The FPE scheme just described is called rank-encipher-unrank [3]. Rank-encipher-unrank only works on domains for which efficient ranking and unranking algorithms are known. Thus, practitioners, when faced with the task of enciphering points in some domain \(\mathcal{S}\), must either invent a custom ranking and unranking procedure or, if \(\mathcal{S}\) can be specified with a DFA or regular expression, apply known algorithms to rank regular languages [3]. In this latter case, there are toolkits written to aid practitioners [16], though there can still be some subtle efficiency issues depending on whether one starts with a regular expression or a DFA.
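To make the rank-encipher-unrank pipeline concrete, the following is a minimal Python sketch for a hypothetical domain of strings made of one uppercase letter followed by two digits. The names `rank`, `unrank`, `toy_cipher_table`, `fpe_encrypt`, and `fpe_decrypt` are illustrative assumptions, and the key-seeded shuffle is only a stand-in for a real cipher on \(\{0,\ldots ,N-1\}\); it is not secure.

```python
import random
import string

LETTERS = string.ascii_uppercase   # 26 symbols
N = 26 * 100                       # size of the example domain: one letter + two digits

def rank(m):
    """Map a string such as 'C07' to a unique integer in {0, ..., N-1}."""
    return LETTERS.index(m[0]) * 100 + int(m[1:])

def unrank(x):
    """Inverse of rank: map an integer in {0, ..., N-1} back to a string."""
    return f"{LETTERS[x // 100]}{x % 100:02d}"

def toy_cipher_table(key):
    """Stand-in for a cipher on [N]: a key-seeded shuffle (NOT secure)."""
    table = list(range(N))
    random.Random(key).shuffle(table)
    return table

def fpe_encrypt(key, m):
    """Rank the plaintext, encipher the rank, and unrank the result."""
    return unrank(toy_cipher_table(key)[rank(m)])

def fpe_decrypt(key, c):
    """Rank the ciphertext, invert the toy cipher, and unrank the result."""
    return unrank(toy_cipher_table(key).index(rank(c)))
```

The ciphertext of a string such as `"C07"` is again one letter followed by two digits, which is the format-preserving property.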

Finally, a third option that only assumes the ability to test membership in \(\mathcal{S}\) is to find a larger domain \(\mathcal {X}\) for which an efficient cipher already exists, and then try to somehow use or modify this cipher to get a new cipher on the target domain \(\mathcal{S}\). For example, if we need a cipher on valid social security numbers (e.g., do not start with 000), we could try to take a cipher on \(\{0,\ldots ,10^9-1\}\) and somehow cleverly use it to get a cipher on our desired domain. Black and Rogaway [6] were the first to analyze a folklore technique for doing this, called Cycle Walking, in which the cipher on the larger set is applied repeatedly to a point \(m \in \mathcal{S}\) until the resulting ciphertext is also an element of \(\mathcal{S}\). If the size of \(\mathcal {X}\) is not too large relative to the size of \(\mathcal{S}\), then we can expect this procedure to terminate quickly, though the running time can vary across different inputs. Recent work [19, 20] by Miracle and Yilek has explored ways to make this task of transforming a cipher on \(\mathcal {X}\) into a cipher on \(\mathcal{S}\subseteq \mathcal {X}\), which they refer to as domain targeting, possible in constant time, meaning the running time does not depend on the input. Looking ahead, our results can be seen as bringing the very theoretical results of [19, 20] closer to practice.
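Cycle Walking as described above can be sketched in a few lines of Python. This is an illustration under stated assumptions: `toy_permutation` is a hypothetical, insecure stand-in for a cipher on \(\mathcal {X}\), and the example target set (three-digit numbers without a leading zero inside \(\{0,\ldots ,999\}\)) is chosen only for the demonstration.

```python
import random

SIZE_X = 1000                      # larger domain X = {0, ..., 999}

def in_target(x):
    """Membership test for the example target set S = {100, ..., 999}."""
    return x >= 100

def toy_permutation(key, size):
    """Stand-in for a cipher on X: a key-seeded shuffle (NOT secure)."""
    table = list(range(size))
    random.Random(key).shuffle(table)
    return table

def cycle_walk(perm, in_target, m):
    """Apply perm repeatedly to m (a point of S) until the result is in S."""
    if not in_target(m):
        raise ValueError("plaintext must lie in the target set")
    c = perm[m]
    while not in_target(c):        # the loop count depends on the input m
        c = perm[c]
    return c
```

The `while` loop makes the input-dependent running time visible: different plaintexts can take a different number of iterations, which is the timing variation that constant-time domain targeting avoids.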

We emphasize that while ranking/unranking and domain targeting might seem like two distinct ways to build FPE schemes on domains \(\mathcal{S}\), they can actually be complementary techniques. For example, a practitioner might have a very complicated domain \(\mathcal{S}\) for which they wish to do FPE. Perhaps the domain is specified by a complex regular expression, and so the general techniques for ranking/unranking are impractical. An alternative option may be to find a larger, simpler set \(\mathcal {X}\supseteq \mathcal{S}\) that is easier to rank. Then, the rank-encipher-unrank algorithm would need to apply something like domain targeting before unranking, since applying rank-encipher-unrank might yield an element in \(\mathcal {X}-\mathcal{S}\).

Constant-Time Domain Targeting. Our main goal in this paper is to make constant-time domain targeting more efficient. Before getting to our new results, we first give an overview of previous techniques.

The constructions provided by Miracle and Yilek for domain targeting from set \(\mathcal {X}\) to set \(\mathcal{S}\), called Reverse Cycle Walking in [19] and Cycle Slicer in [20], are both based on the same underlying idea: take a cipher on \(\mathcal {X}\) and use it to construct a random matching (i.e., permutation with only 2-cycles or transpositions) on \(\mathcal{S}\subseteq \mathcal {X}\); then swap some of the points that are paired together based on bit flips. Said another way, both the Reverse Cycle Walking (RCW) and Cycle Slicer (CS) constructions give a way to build matchings on the target set \(\mathcal{S}\) out of arbitrary permutations on the larger set \(\mathcal {X}\). Once a matching on \(\mathcal{S}\) is formed, pairs of points in the matchings are swapped based on additional bit flips. This procedure, called a matching exchange process, is repeated over many rounds, and Miracle and Yilek use a result of Czumaj and Kutolowski [9] to argue the resulting ciphers are secure. Further, the number of rounds needed for security does not depend on the specific inputs, so constant-time implementations that do not leak timing information are possible.

Unfortunately, RCW and CS are both rather inefficient, requiring many rounds for security. For example, the Cycle Slicer paper uses social security numbers as an example, with \(\mathcal {X}= \{0,1\}^{30}\) and \(\mathcal{S}= \{0,\ldots ,10^9-1\}\), and claims that about 12,000 rounds of Cycle Slicer are needed for security. If we plug an existing, provably-secure cipher like Swap-or-Not (SN) [15] into the construction, we would end up with hundreds of rounds of Swap-or-Not times 12,000 rounds of Cycle Slicer, meaning overall we need millions of Swap-or-Not rounds. If full security [13, 24] is desired, in which ciphers are required to be indistinguishable from random permutations even when adversaries can query all domain points, the situation is even worse. The key idea in this paper is that instead of applying a general transformation to convert any cipher into one that supports domain targeting, perhaps we can instead specifically design ciphers (or slightly modify existing ones) to directly support domain targeting.

Our Results. We take a step toward bringing constant-time domain targeting closer to practice. We propose using what we refer to as targeted ciphers for the task. The idea is to design new ciphers (or find existing ones) that already can support domain targeting with only small modification. Informally, a targeted cipher will proceed in rounds to encipher points in some domain \(\mathcal {X}\), yet can be slightly modified to have the property that after every round, every point \(x \in \mathcal{S}\subseteq \mathcal {X}\) is still mapped to another point in \(\mathcal{S}\). In other words, over the entire course of the algorithm, elements of the target set \(\mathcal{S}\) never “leave” the target set, and every additional round of the cipher further mixes up these elements.

With this informal idea in mind, we present two targeted ciphers and formally analyze their security. Our first targeted cipher, Targeted Swap-or-Not (TSN), is a modification of the Swap-or-Not cipher, proposed by Hoang, Morris, and Rogaway. The second, which achieves the stronger notion of full security, is a new cipher we design and analyze called Mix-Swap-Unmix (MSU). With both ciphers, we achieve a substantial increase in efficiency when compared to constructions that achieve a similar level of security by using a general transformation like Cycle Slicer, bringing domain targeting closer to practicality.

Techniques. Like previous work on domain targeting, both of our targeted ciphers are matching-based, or swap-based, meaning that every round pairs up points and then swaps some of them. To construct a cipher on \(\mathcal{S}\subseteq \mathcal {X}\), previous work, specifically Cycle Slicer, in each round builds a random matching on the larger set \(\mathcal {X}\) and then, for each pair of points \(x,x'\) paired together in the matching, only swaps x and \(x'\) if both points are in the target set \(\mathcal{S}\) and an additional bit flip is 1. The security analysis heavily relies on the fact that the matchings are random, which allows [20] to apply an existing result of Czumaj and Kutolowski [9].

Our first targeted cipher, Targeted Swap-or-Not, stems from the observation that the Swap-or-Not cipher is already matching-based: focusing on the version of SN for domain \(\mathcal {X}= [N]\), in round i of SN, point x is paired up with point \(x' = K_i-x \mod N\), where \(K_i\) is the random round key. The points x and \(x'\) are then swapped if a random function applied to them is 1. This operation clearly results in a matching on [N], so our targeted version adds the constraint that points should only be swapped if they are both in the target set \(\mathcal{S}\subseteq [N]\).

Since the high-level idea in TSN is the same as in Cycle Slicer, it might appear that the same security analysis should follow. But, there is a key difference: in Cycle Slicer, each round is a random matching, while in TSN we get a very non-random matching completely determined by the round key (which can be computed from any known pair \(x,x'\)). Thus, for TSN’s analysis we cannot rely on the matching exchange process results. Instead, we modify the original Swap-or-Not security proof of [15], using a recent refinement by Dai, Hoang, and Tessaro [10]. Our final security bounds show that TSN needs only a modest increase in rounds over SN to support targeting. As an example, our bounds show that if TSN is applied to domain [N] for \(N=2^{30}\) and targeted to a target set of size \(|\mathcal{S}|=10^9\), and if we allow a CCA adversary \(q=|\mathcal{S}|/2\) queries, then we need just under 600 rounds of Swap-or-Not to get advantage less than \(10^{-9}\). Using Cycle Slicer and Swap-or-Not for the same parameters would require hundreds of thousands of rounds.

For our second targeted cipher, Mix-Swap-Unmix (MSU), we aim to build a targeted cipher that can achieve full security. A fully secure cipher is one that is indistinguishable from a random permutation by an adversary who can query all N domain points. Only a few fully-secure ciphers are known, and they tend to be inefficient; for example, the Mix-and-Cut cipher of [24] uses about 10,000 rounds of Swap-or-Not to encipher 30-bit inputs. If one wishes to do domain targeting and still maintain full security, the efficiency problem gets even worse. Combining a fully-secure cipher like Mix-and-Cut with a general domain targeting transformation like Cycle Slicer can result in hundreds of millions of rounds of Swap-or-Not. Thus, we aim to build a new fully-secure cipher that directly supports domain targeting.

Like previous fully-secure ciphers [21, 24], our new cipher MSU is built from Swap-or-Not. At the same time, since we want to support targeting, we need each round of MSU to give a matching on the larger domain \(\mathcal {X}= [N]\), and then we can only swap elements that are both in the target set \(\mathcal{S}\). To build this matching, we use an idea from Naor and Reingold [23]. They used the fact that for permutations \(\pi \) and \(\sigma \), the cycle structure of \(\pi \circ \sigma \circ \pi ^{-1}\) is the same as the cycle structure of the inner permutation \(\sigma \), to build permutations with particular cycle structures. Since we want a matching, or a permutation made up of just 2-cycles, we let \(\pi \) (the outer permutation) be Swap-or-Not, and then \(\sigma \) (the inner permutation) simply be the permutation that swaps adjacent elements. This is one round of MSU. While this gives us a targeted cipher, we still need to argue full security. In Sect. 4, we show that this construction boosts the security of Swap-or-Not and gives us full security. The final construction is also much more efficient than using an existing fully-secure cipher with Cycle Slicer, requiring about 100 times fewer rounds of Swap-or-Not.

Extensions and Future Work. We mention a few other related results we have included in the paper. First, the MSU construction described above uses an additional bit flip for each pair of points in each round. This bit flip seems unnecessary and leads to an increase in the number of rounds in the case where \(\mathcal{S}\) is much smaller than \(\mathcal {X}\). In Appendix A, we show that in this setting, the bit flip can in fact be eliminated. The proof involves finding an equivalent underlying matching exchange process that mimics MSU without the bit flips, and the techniques may be of independent interest. We also show in Appendix A that if domain targeting is not needed and one simply wants to use the MSU cipher on domain [N], then we can prove that significantly fewer rounds are needed by applying a recent result of Bernstein [5]. In short, MSU without targeting results in something called an involution walk, and techniques from representation theory can be applied.

One last extension of our results is that our targeted ciphers can be used in a straightforward way to solve the domain completion problem, recently introduced in [14] and further studied in [20], in which we wish to construct a cipher that stays consistent with a table of existing input-output mappings that were manually chosen. Specifically, our constructions can take the place of Cycle Slicer in the CSDC algorithm of [20], resulting in efficiency gains in that setting.

Looking forward, an obvious question is whether other well-known cipher design techniques can be modified to directly support targeting. For example, Feistel-based ciphers are widely used and, in fact, the standardized FPE schemes are Feistel-based, so it would be convenient if they could be made to support targeting with simple modifications. Unfortunately, this seems unlikely. A card-shuffling view of Feistel is that the input points are cut into many piles, and then the bottom cards are dropped from the piles in different orderings depending on the internal random round function. Imagine some of the cards at the bottom of the cut piles are initially in positions in the target set \(\mathcal{S}\). These cards will end up near the bottom of the deck after one round of Feistel, but the positions near the bottom of the deck might not correspond to positions in \(\mathcal{S}\). Thus, we immediately lose our desired property of targeted ciphers that points in \(\mathcal{S}\) always stay in \(\mathcal{S}\) after each round.

Finally, though we used the MSU construction to build a (rather slow) fully-secure cipher by applying in each round Swap-or-Not, a swap, and then Swap-or-Not inverse, we believe Swap-or-Not could be replaced by something much faster (e.g., a few rounds of Feistel) in the MSU construction, and the resulting (targeted) cipher could provide strong security with a modest number of rounds.

2 Preliminaries

Notation. If x is a bitstring with length n, then we denote by \(x \oplus 1\) the bitwise exclusive-OR of the n bits of x with the bitstring \(0^{n-1}1\) (\(n-1\) zeroes followed by a single one). If S is a set, then \(x \xleftarrow{\$} S\) means we choose an element of S uniformly at random and assign it to x. If S is instead an algorithm, then the same notation represents running S with uniformly random coins and assigning the output to x. For permutations \(\pi ,\sigma : \mathcal{M}\rightarrow \mathcal{M}\) with \(\pi \) having inverse \(\pi ^{-1}\), we denote by \(\pi \circ \sigma \circ \pi ^{-1}\) the permutation that computes \(\pi ^{-1}(\sigma (\pi (x)))\) on \(x \in \mathcal{M}\). We let [N] denote the set \(\{0,\ldots ,N-1\}\). For \(X \in [N]\), we let \(X \oplus 1\) denote the result of taking the binary representation of X and applying a bitwise-XOR with the binary representation of 1; in other words, if X is even (resp. odd), then \(X \oplus 1\) will be the next (resp. previous) number. Let \(\mathsf {odd}(N)\) denote the odd elements of [N].

Block Ciphers. We say that \(E: \mathcal{K}\times \mathcal{M}\rightarrow \mathcal{M}\) for finite sets \(\mathcal{K}\) and \(\mathcal{M}\) (sometimes referred to as the key space and domain, respectively) is a block cipher if \(E_K(\cdot ) = E(K, \cdot )\) is a permutation on \(\mathcal{M}\) for every \(K \in \mathcal{K}\). Let \(E^{-1}\) be the inverse block cipher of \(E\).

The standard notion of security for block ciphers is security against adaptive chosen-ciphertext attack (CCA), sometimes called Strong PRP Security. To define this security notion, we describe the security games \(\mathsf {SPRP1}\) and \(\mathsf {SPRP0}\). In \(\mathsf {SPRP1}\), the game starts with a \(\mathbf{main }\) procedure that chooses a random key for the cipher and then runs the adversary with oracles for procedures \(\mathsf {Enc}\) and \(\mathsf {Dec}\), which answer queries using the cipher and the chosen key. The final output of the game is the bit the adversary outputs. The game \(\mathsf {SPRP0}\) works the same, but with \(\mathbf{main }\) choosing a random permutation from \(\textsf {Perm}(\mathcal{M})\), defined as the set of all permutations \(\pi : \mathcal{M}\rightarrow \mathcal{M}\), and using that to answer oracle queries to \(\mathsf {Enc}\) and \(\mathsf {Dec}\). We can then define the CCA advantage of an adversary A against \(E\) by \( \mathbf {Adv}^{\mathrm {cca}}_{E}(A) = |{\Pr \left[ \,{\mathsf {SPRP1}^A_E\Rightarrow 1}\,\right] } - {\Pr \left[ \,{\mathsf {SPRP0}^A_E\Rightarrow 1}\,\right] }| \) where the probabilities are over the random coins used in the security games. If the adversary A is non-adaptive (meaning it makes the same queries every run) and only makes queries to \(\mathsf {Enc}\) in the SPRP security games above, then we say it is a NCPA (short for non-adaptive chosen-plaintext attack) adversary and we refer to its advantage in the games against block cipher \(E\) as \(\mathbf {Adv}^{\mathrm {ncpa}}_{E}(A)\).

As has become standard, we overload notation and denote by \(\mathbf {Adv}^{\mathrm {cca}}_{E}(q)\) the maximum CCA advantage over all adversaries making at most q adaptive oracle queries. Similarly, the maximum advantage over all adversaries making at most q non-adaptive oracle queries to only the forward direction subroutine \(\mathsf {Enc}\) we denote by \(\mathbf {Adv}^{\mathrm {ncpa}}_{E}(q)\). We will be interested in full security or fully-secure ciphers, meaning \(\mathbf {Adv}^{\mathrm {cca}}_{E}(N)\) is low, where \(N = |\mathcal{M}|\). Said another way, a fully-secure block cipher will be one for which the CCA advantage is low despite the adversary being able to query every domain point. As explained in the introduction, such fully-secure ciphers have been the target of a number of recent papers [13, 21, 24].

Chernoff Bound. Later in the paper, we will need to upper bound the probability that among t independent coin flips there are more than (3 / 4)t heads.

Proposition 1

Let \(X_1,\ldots ,X_t\) be independent random variables such that each \(X_i = 1\) with prob. 1 / 2 and \(X_i = 0\) with prob. 1 / 2. Let \(X = \sum _{i=1}^t X_i\). Then, \( {\Pr \left[ \,{X \ge (3/4)t}\,\right] } \le e^{-t/20}. \)
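As a sanity check on Proposition 1, the short Python sketch below compares the exact binomial tail with the stated bound \(e^{-t/20}\). The helper names `upper_tail` and `chernoff_bound` are illustrative and not part of the paper.

```python
from math import comb, exp

def upper_tail(t):
    """Exact Pr[X >= (3/4)t] for X the number of heads in t fair coin flips."""
    start = (3 * t + 3) // 4       # smallest integer k with k >= 3t/4
    return sum(comb(t, k) for k in range(start, t + 1)) / 2 ** t

def chernoff_bound(t):
    """The upper bound e^{-t/20} from Proposition 1."""
    return exp(-t / 20)

def bound_holds(max_t):
    """Check the inequality of Proposition 1 for every t in 1..max_t."""
    return all(upper_tail(t) <= chernoff_bound(t) for t in range(1, max_t + 1))
```

Calling `bound_holds(200)` checks the inequality for every t up to 200; the exact tail is typically far below the bound, which is all the later analysis requires.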

Matchings. In this paper we use the term matching on \(\mathcal{M}\) to refer to a permutation \(\tau : \mathcal{M}\rightarrow \mathcal{M}\) made up of only transpositions, also called 2-cycles or swaps. A matching is an involution, so \(\tau (\tau (x))=x\) for all \(x \in \mathcal{M}\). Let \(\textsf {Match}(\mathcal{M}, k)\) be the set of all matchings on \(\mathcal{M}\) that are made up of exactly k transpositions. For a set \(\mathcal{M}\) with an even number N of elements, we use the term perfect matching to refer to a matching on \(\mathcal{M}\) with exactly N / 2 transpositions, meaning every point is swapped with another distinct point. Thus, \(\textsf {Match}(\mathcal{M}, |\mathcal{M}|/2)\) is the set of such perfect matchings when \(|\mathcal{M}|\) is even.

A matching exchange process on \(\mathcal{M}\) proceeds in rounds. In each round, k is sampled from some probability distribution on \(\{0,\ldots ,|\mathcal{M}|/2\}\), then \(\tau \) is chosen randomly from \(\textsf {Match}(\mathcal{M}, k)\). Finally, for each pair of points \(x, \tau (x) \in \mathcal{M}\) such that \(x \ne \tau (x)\), we flip a random bit \(b_{\{x,\tau (x)\}}\) and define a new matching \(\bar{\tau }\) by \(\bar{\tau }(x) = \tau (x)\) if \(b_{\{x,\tau (x)\}}=1\), and \(\bar{\tau }(x)=x\) otherwise. We then apply this new matching \(\bar{\tau }\) to each point in \(\mathcal{M}\). This process may repeat for many rounds (with independently chosen k and matchings). We will also consider a special case of a matching exchange process called an involution walk. Here the matching \(\tau \) generated at each step is always a perfect matching (i.e. k is always \(|\mathcal{M}|/2 \)). In Appendix A we bound the number of rounds needed for \(\textsf {MSU}\) in part by relying on previous bounds for matching exchange processes and involution walks.
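One round of a matching exchange process can be simulated as follows. This is a sketch under assumptions: the function name `matching_exchange_round` is hypothetical, the current arrangement is stored as a Python list, and the caller supplies k (the paper leaves its distribution unspecified at this point). Taking k equal to half the number of points gives one step of an involution walk.

```python
import random

def matching_exchange_round(perm, k, rng):
    """Apply one round in place and return perm.

    A uniformly random matching with k transpositions is obtained by drawing
    2k distinct positions in random order and pairing the first k with the
    last k. Each matched pair is then swapped only if its coin flip is 1.
    """
    points = rng.sample(range(len(perm)), 2 * k)
    for a, b in zip(points[:k], points[k:]):
        if rng.random() < 0.5:     # the bit b_{x, tau(x)}
            perm[a], perm[b] = perm[b], perm[a]
    return perm

def run_process(n, rounds, rng):
    """Run several rounds on n points, drawing k uniformly from {0, ..., n//2}."""
    perm = list(range(n))
    for _ in range(rounds):
        matching_exchange_round(perm, rng.randint(0, n // 2), rng)
    return perm
```

The uniform choice of k in `run_process` is only one possible distribution, picked here for illustration.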

Total Variation Distance. In order to determine how many rounds of \(\textsf {MSU}\) are needed, we will bound the total variation distance for the underlying matching exchange process. Let \(x,y \in \varOmega \), \(P^r(x,y)\) be the probability of going from x to y in r steps and \(\mu \) be another distribution on \(\varOmega \). In our case \(\varOmega \) will be the set of all permutations of a given size, \(P^r(x,y)\) will be the probability of going between particular permutations x and y with r rounds of \(\textsf {MSU}\) and \(\mu \) will be the uniform distribution on permutations. Specifically, for a permutation y, \(\mu (y) = 1/|\varOmega |\). Then the total variation distance is defined as

$$||P^r(x,y)-\mu || = \max _{x\in \varOmega } \frac{1}{2}\sum _{y\in \varOmega }|P^r(x,y) - \mu (y)|.$$
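For a fixed starting state x, the inner sum in this definition is straightforward to compute. The sketch below represents distributions as Python dictionaries mapping outcomes to probabilities; the names `total_variation` and `uniform` are illustrative assumptions.

```python
def total_variation(p, mu):
    """Half the L1 distance between two distributions given as dicts."""
    support = set(p) | set(mu)
    return 0.5 * sum(abs(p.get(y, 0.0) - mu.get(y, 0.0)) for y in support)

def uniform(outcomes):
    """Uniform distribution on a finite collection of outcomes."""
    outcomes = list(outcomes)
    return {y: 1.0 / len(outcomes) for y in outcomes}
```

For instance, a point mass on one of four outcomes is at distance 3/4 from the uniform distribution on those outcomes, and the distance falls to 0 only when the two distributions coincide.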

Composition and Cycle Structure. We will use the following well-known fact from group theory, which was used in the cryptographic realm by Naor and Reingold [23].

Proposition 2

For any permutations \(\pi \) and \(\sigma \), the cycle structures of permutations \(\sigma \) and \(\pi \circ \sigma \circ \pi ^{-1}\) are the same. Thus, if \(\tau \) is a matching, then \(\pi \circ \tau \circ \pi ^{-1}\) is also a matching with the same number of transpositions.
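Proposition 2 is easy to check experimentally on small examples. The sketch below uses the composition convention from the Notation paragraph, so the conjugate maps x to \(\pi ^{-1}(\sigma (\pi (x)))\); the helper names `random_permutation`, `conjugate`, and `cycle_type` are hypothetical.

```python
import random

def random_permutation(n, seed):
    """A seeded random permutation of {0, ..., n-1}, stored as a list."""
    perm = list(range(n))
    random.Random(seed).shuffle(perm)
    return perm

def conjugate(pi, sigma):
    """Return the permutation x -> pi^{-1}(sigma(pi(x))) as a list."""
    inv = [0] * len(pi)
    for x, y in enumerate(pi):
        inv[y] = x
    return [inv[sigma[pi[x]]] for x in range(len(pi))]

def cycle_type(perm):
    """Sorted list of the cycle lengths of a permutation."""
    seen, lengths = set(), []
    for start in range(len(perm)):
        if start in seen:
            continue
        length, x = 0, start
        while x not in seen:
            seen.add(x)
            x = perm[x]
            length += 1
        lengths.append(length)
    return sorted(lengths)
```

Taking \(\sigma \) to be the map \(x \mapsto x \oplus 1\) on an even-sized domain, `conjugate` returns a perfect matching no matter which \(\pi \) is used.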

3 Targeted Swap-or-Not

We begin by describing the Swap-or-Not cipher introduced by Hoang, Morris, and Rogaway [15] and then present our new Targeted Swap-or-Not cipher (\(\textsf {TSN}\)).

Swap-or-Not. Hoang, Morris, and Rogaway [15] showed that the Swap-or-Not (SN) cipher provides CCA security against adversaries who only make \(q=(1-\epsilon )N\) queries, where N is the size of the domain. In words, for domain \(\mathcal{M}= [N]\), the r-round SN cipher has key \(\mathrm {KF}\) specifying round keys \(K_i \in \mathcal{M}\) and round functions \(F_i: \mathcal{M}\rightarrow \{0,1\}\). In round i, point X is paired with a “buddy” point \(K_i - X \mod N\) (which could be the same point, i.e., \(K_i - X \mod N = X\)), and the result of \(F_i\) determines if X should swap positions with its buddy point or not.

Hoang, Morris, and Rogaway analyzed the security of Swap-or-Not and provided bounds on both the NCPA and CCA advantages of adversaries attacking the scheme. Recently, Dai, Hoang and Tessaro [10] improved these bounds using a technique they named the chi-squared method. We will need their bound

$$\begin{aligned} \mathbf {Adv}^{\mathrm {cca}}_{\textsf {SN}}(A) \le \frac{2N}{\sqrt{r/2+1}} \left( \frac{N+q}{2N} \right) ^{(r/2+1)/2} , \end{aligned} \tag{1}$$

where again N is the size of the domain, r is the number of SN rounds, and q is the number of adversarial queries.
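To get a feel for how the Swap-or-Not CCA bound above behaves, the following Python sketch evaluates its right-hand side and searches for the smallest round count that pushes it below a chosen threshold. The function names `sn_cca_bound` and `min_rounds` are illustrative, and the sketch covers only plain SN, not the targeted cipher analyzed later.

```python
from math import sqrt

def sn_cca_bound(n, q, r):
    """Right-hand side of the SN CCA bound for domain size n, q queries, r rounds."""
    m = r / 2 + 1
    return (2 * n / sqrt(m)) * ((n + q) / (2 * n)) ** (m / 2)

def min_rounds(n, q, target):
    """Smallest r (found by linear search) for which the bound is below target."""
    r = 1
    while sn_cca_bound(n, q, r) >= target:
        r += 1
    return r
```

Because the base \((N+q)/(2N)\) is below 1 whenever \(q < N\), the bound decays exponentially in r, and `min_rounds` terminates for any positive target.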

Our Algorithm. In each round i of \(\textsf {TSN}\), point X is again paired with a “buddy” point \(K_i - X \mod N\). However, regardless of the result of the round function \(F_i\), if either X or its buddy point is not in the target set \(\mathcal{S}\), then the points do not swap positions. If both points are in \(\mathcal{S}\), then whether they swap (or not) is again determined by the round function \(F_i\). The detailed description of how to encipher a single point using \(\textsf {TSN}\) can be found in Fig. 1. Note that if we let \(\mathcal{S}= [N]\) then \(\textsf {TSN}\) becomes the original Swap-or-Not cipher for the domain \(\{0,\ldots ,N-1\}\).

Fig. 1. The Targeted Swap-or-Not Cipher for target set \(\mathcal{S}\). The addition of the boxed code is the only change from the original Swap-or-Not algorithm.
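The round structure described in the text can be sketched in Python as follows. Several details are assumptions made for illustration only: the round function \(F_i\) is instantiated with one bit of HMAC-SHA256, it is evaluated on the larger of the two paired points (as in the original Swap-or-Not description) so that both points of a pair reach the same decision, and the names `round_bit`, `tsn_encrypt`, and `tsn_decrypt` are hypothetical. The branching here is not constant time; a hardened implementation would replace the conditionals with branch-free selection.

```python
import hashlib
import hmac

def round_bit(fkey, i, x):
    """Hypothetical round function F_i: one pseudorandom bit from HMAC-SHA256."""
    msg = i.to_bytes(4, "big") + x.to_bytes(8, "big")
    return hmac.new(fkey, msg, hashlib.sha256).digest()[0] & 1

def tsn_encrypt(round_keys, fkey, n, in_target, x):
    """Encipher x in [n]; points of the target set S never leave S."""
    for i, k in enumerate(round_keys):
        buddy = (k - x) % n
        # Targeting condition: swap only when both points are in S.
        if in_target(x) and in_target(buddy):
            if round_bit(fkey, i, max(x, buddy)) == 1:
                x = buddy
    return x

def tsn_decrypt(round_keys, fkey, n, in_target, y):
    """Each round is an involution, so decryption replays the rounds in reverse."""
    for i, k in reversed(list(enumerate(round_keys))):
        buddy = (k - y) % n
        if in_target(y) and in_target(buddy):
            if round_bit(fkey, i, max(y, buddy)) == 1:
                y = buddy
    return y
```

With `in_target` returning `True` for every point, the sketch reduces to plain Swap-or-Not on \(\{0,\ldots ,N-1\}\), matching the remark that \(\mathcal{S}= [N]\) recovers the original cipher.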

Security Analysis. Our analysis of \(\textsf {TSN}\) relies heavily on the original analysis done by Hoang, Morris and Rogaway to bound the NCPA security of the Swap-or-Not algorithm [15], later improved by Dai, Hoang and Tessaro [10] using the \(\chi ^2\) method. Our main contribution here lies in the application of this algorithm to the targeting setting; the analysis, while quite technical, is a generalization of the ideas and techniques used in the previous analysis.

Our goal is to bound the CCA security of \(\textsf {TSN}\) but, as in [10], we will begin by bounding the weaker NCPA security using the \(\chi ^2\) method and then use a result of Maurer, Pietrzak, and Renner [18] to derive a bound on the CCA security. Specifically we adapt Lemma 3 from [10] to the \(\textsf {TSN}\) algorithm. Combining this lemma with the techniques from the proof of Lemma 5 from [10] and applying to \(\textsf {TSN}\) immediately gives the following lemma which shows that in order to bound the NCPA security of \(\textsf {TSN}\) it suffices to bound the \(\chi ^2\)-divergence.

Lemma 1

(adapted from Dai, Hoang, Tessaro [10]). Let \(\textsf {TSN}\) represent the permutation generated by r rounds of Targeted Swap-Or-Not and \(\textsf {UN}\) represent a random permutation. Additionally, let \(p_{\textsf {TSN},r}(\cdot |Q_i)\) be the distribution on the \(i+1\) query by an NCPA adversary A to \(\textsf {TSN}\) with r rounds conditioned on the output of the previous i queries represented by \(Q_i = \{q_1,q_2, \ldots q_i\}\) and similarly \(p_{\textsf {UN}}(\cdot |Q_{i})\) is the distribution on the \(i+1\) query to the uniformly random permutation (i.e. the uniform distribution on the remaining \(|\mathcal{S}| - i\) elements). Given this, the NCPA advantage of an NCPA adversary A making at most q non-adaptive queries is

$$\mathbf {Adv}^{\mathrm {ncpa}}_{\textsf {TSN}}(A)\le ||p_{\textsf {TSN},r}(\cdot ) - p_{\textsf {UN}}(\cdot )|| \le \sqrt{\frac{1}{2} \sum _{i=0}^{q-1} {{\mathbf {E}}\left[ \,{\chi ^2(p_{\textsf {TSN},r}(\cdot |Q_{i}), p_{\textsf {UN}}(\cdot |Q_{i}))}\,\right] }}.$$

where the expectation is taken over a vector \(Q_i = \{q_1,q_2, \ldots q_i\}\) sampled according to the interaction with \(\textsf {TSN}\) and the \(\chi ^2\) divergence between \(p_{\textsf {TSN},r}(\cdot |Q_{i})\) and \(p_{\textsf {UN}}(\cdot |Q_{i})\) is defined as

$$\sum _{q_{i+1} \in \mathcal{S}\backslash Q_i}\frac{(p_{\textsf {TSN},r}(q_{i+1}|Q_{i}) - p_{\textsf {UN}}(q_{i+1}|Q_{i}))^2}{p_{\textsf {UN}}(q_{i+1}|Q_{i})}.$$

In order to bound \({{\mathbf {E}}\left[ \,{\chi ^2(Q_{i})}\,\right] }\), we prove the following lemma which generalizes Eq. 5 from [15].

Lemma 2

Let \(|\mathcal{S}|\) be the number of elements in the target set \(\mathcal{S}\) and \(|\mathcal {X}|\) be the number of elements in the larger domain set \(\mathcal {X}\). Then we have,

$${{\mathbf {E}}\left[ \,{\sum _{q_{i+1}\in \mathcal{S}\backslash \{q_1, \ldots , q_{i} \}}(p_{\textsf {TSN}, r}(q_{i+1}|Q_{i}) - p_{\textsf {UN}}(q_{i+1}|Q_{i}))^2 }\,\right] }\le \left( \frac{2|\mathcal {X}|- |\mathcal{S}|+ i}{2|\mathcal {X}|}\right) ^r,$$

where the expectation is taken over a vector \(Q_i = \{q_1,q_2, \ldots q_i\}\) sampled according to the interaction with Targeted Swap-Or-Not.

Proof

Again we point out that the following proof uses the same techniques and is a relatively straightforward generalization of the proof of Eq. 5 from [15]. Our proof proceeds by induction on r. We let \(r=0\) be our base case (the proof here follows directly from [15]). When \(r=0\) the elements are in their initial deterministic location and

$$\begin{aligned} {{\mathbf {E}}\left[ \,{\sum _{q_{i+1}\in \mathcal{S}}(p_{\textsf {TSN}, 0}(q_{i+1}) - p_{\textsf {UN}}(q_{i+1}|Q_{i}))^2}\,\right] }= & {} {{\mathbf {E}}\left[ \,{\sum _{q_{i+1}\in \mathcal{S}}(p_{\textsf {TSN}, 0}(q_{i+1}) - 1/|\mathcal{S}|)^2}\,\right] } \\= & {} (1-1/|\mathcal{S}|) ^2+ (|\mathcal{S}|- 1)(-1/|\mathcal{S}|)^2 \\= & {} 1 - 1/|\mathcal{S}|< \left( \frac{2|\mathcal {X}|- |\mathcal{S}|+ i}{2|\mathcal {X}|}\right) ^0. \end{aligned}$$

Next we assume inductively that the lemma holds for r and prove that it holds for \(r+1\). In order to analyze this case we will need to use some additional terminology. For clarity we will use the same terminology as in [15] and [10] and redefine it here for readability. Let \(K_1,\ldots , K_{r+1}\) be the random keys for the first \(r+1\) rounds. Let \(S_r = \mathcal{S}-Q_{i,r}\) be the set of available positions for the \(i+1\) query, where \(Q_{i,r}\) is the set of positions for the first i queries given r rounds of \(\textsf {TSN}\). We will abbreviate \(p_r(x)\) to mean \(p_{\textsf {TSN},r}(x|Q_{i})\) (i.e. the probability the \(i+1\) query is x given r rounds of \(\textsf {TSN}\)) and define \(s_r = \sum _{x \in S_r }(p_r(x) -1/(|\mathcal{S}|- i))^2\).

Definition 1

(Hoang, Morris, Rogaway [15]). Let f be a bijection from \(S_r\) to \(S_{r+1}\) given by

$$\begin{aligned} f(x) = {\left\{ \begin{array}{ll} x &{} x \in S_{r+1},\\ K_{r+1} - x &{} \text {otherwise}. \end{array}\right. } \end{aligned}$$

Given this, Hoang, Morris and Rogaway [15] point out the following.

$$\begin{aligned} p_{r+1}(f(x)) = {\left\{ \begin{array}{ll} p_r(x) &{} \text {if } K_{r+1}-x \notin S_{r},\\ \frac{1}{2}p_r(x) + \frac{1}{2}p_r(K_{r+1} - x) &{} \text {otherwise}. \end{array}\right. } \end{aligned}$$

The size of \(S_r\) is \(|\mathcal{S}|- i\) and thus in our targeted setting, the probability that \(K_{r+1}-x \notin S_r\) is \((|\mathcal {X}|- (|\mathcal{S}|- i))/|\mathcal {X}|\). Combining these and letting \(\mathcal {Q} = {{\mathbf {E}}\left[ \,{(p_{r+1}(f(x)) - (1/(|\mathcal{S}|-i)))^2| s_r}\,\right] }\) gives the following.

$$\begin{aligned} \mathcal {Q}= & {} \frac{|\mathcal {X}|-|\mathcal{S}|+ i}{|\mathcal {X}|}\left( p_r(x) -\frac{1}{|\mathcal{S}|-i}\right) ^2 + \frac{1}{|\mathcal {X}|}\sum _{y \in S_r}\left[ \frac{p_r(x) + p_r(y)}{2} - \frac{1}{|\mathcal{S}|-i}\right] ^2\\= & {} \frac{|\mathcal {X}|- |\mathcal{S}|+ i}{|\mathcal {X}|}\left( p_r(x) -\frac{1}{|\mathcal{S}|-i}\right) ^2 + \frac{1}{|\mathcal {X}|}\left( \frac{s_r}{4} + \frac{|\mathcal{S}|-i}{4}\left( p_r(x) - \frac{1}{|\mathcal{S}|-i}\right) ^2\right) \\= & {} \frac{s_r}{4|\mathcal {X}|} + \frac{3i + 4|\mathcal {X}|- 3|\mathcal{S}|}{4|\mathcal {X}|}\left( p_r(x) - \frac{1}{|\mathcal{S}|-i}\right) ^2 \end{aligned}$$

Note that the expansion of the first sum uses the definition of \(s_r\) and the fact that \(\sum _{y \in S_r}(p_r(y) - \frac{1}{|\mathcal{S}|-i}) = 0\). Details can be found in [10]. Using the fact that f gives a bijection from \(S_r\) to \(S_{r+1}\) and the equation above, we have the following,

$$\begin{aligned} {{\mathbf {E}}\left[ \,{s_{r+1} | s_r}\,\right] }= & {} \sum _{x \in S_{r+1}} {{\mathbf {E}}\left[ \,{[p_{r+1}(x) - 1/(|\mathcal{S}|-i)]^2|s_r}\,\right] }\\= & {} \sum _{y \in S_{r}} {{\mathbf {E}}\left[ \,{[p_{r+1}(f(y)) - 1/(|\mathcal{S}|-i)]^2|s_r}\,\right] }\\= & {} \sum _{y \in S_{r}}\left( \frac{s_r}{4|\mathcal {X}|} + \frac{3i + 4|\mathcal {X}|- 3|\mathcal{S}|}{4|\mathcal {X}|}\left( p_r(y) - \frac{1}{(|\mathcal{S}|-i)}\right) ^2\right) \\= & {} \frac{s_r (|\mathcal{S}|-i)}{4|\mathcal {X}|} + \frac{3i + 4|\mathcal {X}|- 3|\mathcal{S}|}{4|\mathcal {X}|} \sum _{y \in S_{r}}\left( p_r(y) - \frac{1}{(|\mathcal{S}|-i)}\right) ^2\\= & {} s_r \left( \frac{2|\mathcal {X}|-|\mathcal{S}|+ i}{2|\mathcal {X}|} \right) . \end{aligned}$$
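The one-round contraction above can be sanity-checked numerically. The following is a minimal sketch, not part of the original analysis: it assumes \(\mathcal {X}\) is modeled as the integers modulo n with a uniformly random round key, and the names `s_value` and `expected_next_s` are hypothetical helpers introduced only for illustration. It averages \(s_{r+1}\) over all round keys using the case analysis for \(p_{r+1}(f(x))\) and compares the result with \(s_r (2|\mathcal {X}|-|\mathcal{S}|+i)/(2|\mathcal {X}|)\) in exact rational arithmetic.

```python
from fractions import Fraction


def s_value(p):
    """s_r: squared distance of p from the uniform distribution on its support S_r."""
    c = Fraction(1, len(p))
    return sum((px - c) ** 2 for px in p.values())


def expected_next_s(p, n):
    """Average s_{r+1} over all n round keys, following the two cases for p_{r+1}(f(x))."""
    c = Fraction(1, len(p))
    total = Fraction(0)
    for key in range(n):              # uniform K_{r+1} in Z_n (an assumption of this sketch)
        for x, px in p.items():
            partner = (key - x) % n   # K_{r+1} - x
            if partner in p:          # partner is also available: average the two masses
                p_next = (px + p[partner]) / 2
            else:                     # partner unavailable: probability is unchanged
                p_next = px
            total += (p_next - c) ** 2
    return total / n


# Toy instance: |X| = 12 and seven available positions, so |S| - i = 7.
n = 12
weights = {0: 5, 1: 1, 2: 3, 4: 2, 7: 7, 9: 1, 10: 2}
p = {x: Fraction(w, sum(weights.values())) for x, w in weights.items()}
factor = Fraction(2 * n - len(p), 2 * n)   # (2|X| - |S| + i) / (2|X|)
print(expected_next_s(p, n) == s_value(p) * factor)
```

Exact fractions are used so that the comparison is an identity rather than a floating-point approximation; any distribution on the available positions can be substituted for `weights`.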

Using the law of iterated expectations and our inductive hypothesis we have,

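$$\begin{aligned} {{\mathbf {E}}\left[ \,{s_{r+1}}\,\right] }= & {} {{\mathbf {E}}\left[ \,{\sum _{q_{i+1}\in \mathcal{S}\backslash \{q_1,\ldots , q_{i} \}}\left( p_{\textsf {TSN}, r+1}(q_{i+1}|Q_{i}) - \frac{1}{|\mathcal{S}|- i}\right) ^2}\,\right] }\\= & {} {{\mathbf {E}}\left[ \,{{{\mathbf {E}}\left[ \,{s_{r+1} | s_r}\,\right] }}\,\right] } = \left( \frac{2|\mathcal {X}|- |\mathcal{S}|+ i}{2|\mathcal {X}|}\right) {{\mathbf {E}}\left[ \,{s_{r}}\,\right] }\\\le & {} \left( \frac{2|\mathcal {X}|- |\mathcal{S}|+ i}{2|\mathcal {X}|}\right) ^{r+1}. \end{aligned}$$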

   \(\square \)

Next, we use Lemma 2 to bound the \(\chi ^2\) divergence and subsequently to bound the NCPA security of our Targeted Swap-or-Not Cipher.

Theorem 1

Let \(\textsf {TSN}_{r}\) represent the permutation generated by r rounds of Targeted Swap-Or-Not. The NCPA advantage of an NCPA adversary A making at most q non-adaptive queries is

$$\mathbf {Adv}^{\mathrm {ncpa}}_{\textsf {TSN}}(A) \le ||p_{\textsf {TSN},r}(\cdot ) - p_{\textsf {UN}}(\cdot )|| \le \left( \frac{|\mathcal{S}|\cdot |\mathcal {X}|}{r+1}\right) ^{\frac{1}{2}}\left( \frac{2|\mathcal {X}|- |\mathcal{S}|+q + 1}{2|\mathcal {X}|}\right) ^{\frac{r+1}{2}}.$$

Proof

Using the definition of the \(\chi ^2\) divergence together with Lemmas 1 and 2, we have the following.

$$\begin{aligned} \chi ^2(p_{\textsf {TSN},r}(\cdot |Q_{i}), p_{\textsf {UN}}(\cdot |Q_{i}))= & {} \sum _{q_{i+1} \in \mathcal{S}\backslash Q_i}\frac{(p_{\textsf {TSN},r}(q_{i+1}|Q_{i}) - p_{\textsf {UN}}(q_{i+1}|Q_{i}))^2}{p_{\textsf {UN}}(q_{i+1}|Q_{i})}\\= & {} (|\mathcal{S}|-i)\sum _{q_{i+1} \in \mathcal{S}\backslash Q_i}(p_{\textsf {TSN},r}(q_{i+1}|Q_{i}) - 1/(|\mathcal{S}|-i))^2\\\le & {} (|\mathcal{S}|-i)\left( \frac{2|\mathcal {X}|- |\mathcal{S}|+ i}{2|\mathcal {X}|}\right) ^r \end{aligned}$$

Next we substitute this result into Lemma 1 and bound the subsequent summation with an integral (similar to what was done in [10]) to get the following, which implies our theorem.

$$\begin{aligned} (||p_{\textsf {TSN},r}(\cdot ) - p_{\textsf {UN}}(\cdot )||)^2\le & {} \frac{1}{2} \sum _{i=1}^q {{\mathbf {E}}\left[ \,{\chi ^2(Q_i)}\,\right] } \le \frac{1}{2} \sum _{i=1}^q (|\mathcal{S}|-i)\left( \frac{2|\mathcal {X}|- |\mathcal{S}|+ i}{2|\mathcal {X}|}\right) ^r\\\le & {} \frac{|\mathcal{S}|}{2} \int _{0}^{q+1} \left( \frac{2|\mathcal {X}|- |\mathcal{S}|+ i}{2|\mathcal {X}|}\right) ^rdi\\\le & {} \frac{ |\mathcal{S}|\cdot |\mathcal {X}|}{r+1}\left( \frac{2|\mathcal {X}|- |\mathcal{S}|+ q + 1}{2|\mathcal {X}|}\right) ^{r+1}. \end{aligned}$$
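The last inequality follows by evaluating the integral in closed form and dropping a non-negative term. As a short worked version, using only the symbols already defined and assuming \(|\mathcal{S}| \le |\mathcal {X}|\) as in the targeted setting:

$$\begin{aligned} \int _{0}^{q+1} \left( \frac{2|\mathcal {X}|- |\mathcal{S}|+ i}{2|\mathcal {X}|}\right) ^r di= & {} \frac{2|\mathcal {X}|}{r+1}\left[ \left( \frac{2|\mathcal {X}|- |\mathcal{S}|+ q + 1}{2|\mathcal {X}|}\right) ^{r+1} - \left( \frac{2|\mathcal {X}|- |\mathcal{S}|}{2|\mathcal {X}|}\right) ^{r+1}\right] \\\le & {} \frac{2|\mathcal {X}|}{r+1}\left( \frac{2|\mathcal {X}|- |\mathcal{S}|+ q + 1}{2|\mathcal {X}|}\right) ^{r+1}. \end{aligned}$$

Multiplying by \(|\mathcal{S}|/2\) gives the final line of the display above, and taking square roots gives the bound stated in Theorem 1.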

Finally, to bound the CCA security of \(\textsf {TSN}\) we will use a well-known result of Maurer, Pietrzak, and Renner [18]. As in the analysis by Dai, Hoang and Tessaro [10] we note that the inverse of r rounds of \(\textsf {TSN}\) is also r rounds of \(\textsf {TSN}\) and thus applying [18] allows us to amplify our NCPA security bound to CCA security and gives the following corollary.

Corollary 1

Let \(\textsf {TSN}\) represent the permutation generated by r rounds of Targeted Swap-Or-Not. The CCA advantage of a CCA adversary A making at most q queries is

$$\mathbf {Adv}^{\mathrm {cca}}_{\textsf {TSN}}(A) \le 2\left( \frac{|\mathcal{S}|\cdot |\mathcal {X}|}{r/2+1}\right) ^{1/2}\left( \frac{2|\mathcal {X}|- |\mathcal{S}|+q + 1}{2|\mathcal {X}|}\right) ^{(r/2+1)/2}.$$
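To get a feel for how many rounds these bounds require, it can help to evaluate them numerically. The sketch below is only an illustration: the function names and the example parameters are our own choices, and the computation is done in log space so that large domain sizes do not cause overflow. The bounds are informative only when \(q + 1 < |\mathcal{S}|\), so that the base of the exponent is below 1.

```python
import math


def tsn_ncpa_bound(s_size, x_size, q, r):
    """Right-hand side of the Theorem 1 bound for |S| = s_size, |X| = x_size."""
    base = (2 * x_size - s_size + q + 1) / (2 * x_size)
    log_bound = (0.5 * (math.log(s_size * x_size) - math.log(r + 1))
                 + 0.5 * (r + 1) * math.log(base))
    return math.exp(log_bound)


def tsn_cca_bound(s_size, x_size, q, r):
    """Right-hand side of the Corollary 1 bound (r/2 + 1 replaces r + 1, times 2)."""
    half = r / 2 + 1
    base = (2 * x_size - s_size + q + 1) / (2 * x_size)
    log_bound = (0.5 * (math.log(s_size * x_size) - math.log(half))
                 + 0.5 * half * math.log(base))
    return 2 * math.exp(log_bound)


# Example parameters (assumed for illustration): 9-digit numbers inside 30-bit strings,
# with the adversary querying half of the target set.
s_size, x_size = 10**9, 2**30
for rounds in (200, 400, 800):
    print(rounds, tsn_cca_bound(s_size, x_size, s_size // 2, rounds))
```

Note that the CCA bound with r rounds is exactly twice the NCPA bound with r/2 rounds, which reflects the amplification step described above.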

4 Mix-Swap-Unmix

Motivation. In the previous section, we saw a way to modify Swap-or-Not to get a targeted cipher, and the resulting cipher is indistinguishable from a random permutation when the adversary queries at most a constant fraction of the points. Recent papers [21, 24] have introduced small-domain ciphers that provide full security, meaning the ciphers are indistinguishable from random permutations even to an adversary allowed to query all domain points. This leaves the question of whether we can build fully-secure small-domain ciphers that support targeting without too much loss in efficiency.

Fig. 2. Mix-Swap-Unmix Cipher. The boxed code is for domain targeting, and can be excluded if a cipher on [N] is desired.

We could certainly take an existing fully-secure cipher and apply a general transformation like Reverse Cycle Walking or Cycle Slicer to get a matching, and then only swap points that both lie in the target set. Unfortunately, fully-secure ciphers are already significantly less efficient than partially-secure counterparts like Swap-or-Not (which itself is far less efficient than the Feistel-based, standardized schemes), so using many rounds of a general transformation like Cycle Slicer is simply too slow to ever be practical. To be more concrete, if we start with the fully-secure cipher Mix-and-Cut [24] on domain \(\mathcal {X}=\{0,1\}^{30}\), that cipher internally needs 10,000 rounds of Swap-or-Not to achieve full security. If we then apply Cycle Slicer to target the set of bitstrings that represent 9-digit numbers, then [20] states we need 12,000 rounds of Cycle Slicer, with each of those 12,000 rounds applying the 10,000 rounds of Swap-or-Not inside of Mix-and-Cut. Thus, to get a targeted, fully-secure cipher with this method, we would need \(10000 \times 12000 = \) 120 million rounds of Swap-or-Not!

Clearly, there is a lot of efficiency loss in using a general transformation like Cycle Slicer on an existing fully-secure cipher. Thus, we instead turn to a different approach: directly constructing a fully-secure cipher that is matching-based and thus supports domain targeting. Like the existing fully-secure ciphers Mix-and-Cut and Sometimes-Recurse, we build our new fully-secure cipher from Swap-or-Not. We call our new algorithm Mix-Swap-Unmix (MSU). MSU, by default, enciphers points in the general domain \([N] = \{0,\ldots ,N-1\}\) for even N, and can support targeting to any domain \(\mathcal{S}\subseteq [N]\).

The Algorithm. Let \(\textsf {SN}_\mathrm {KF}\) denote the Swap-or-Not cipher with domain [N], with key \(\mathrm {KF}\) consisting of round keys \(K_1,\ldots ,K_r\) and round functions \(F_i:[N]\rightarrow \{0,1\}\). Our new cipher \(\textsf {MSU}\) will have domain \(\mathcal{S}\subseteq [N]\) and keys \((\mathbf {KF},\mathbf {G})\) consisting of m Swap-or-Not keys \(\mathbf {KF}=\{\mathrm {KF}_1,\ldots ,\mathrm {KF}_m\}\) and m round functions \(\mathbf {G}=\{G_1,\ldots ,G_m\}\) with each \(G_j: [N] \rightarrow \{0,1\}\). The code is shown in Fig. 2. The boxed statements are for domain targeting; if one’s desired domain is simply [N], the boxed portion can be excluded.

In words, to encipher a point \(X \in \mathcal{S}\subseteq [N]\) with \(\textsf {MSU}\), we first apply r rounds of the Swap-or-Not cipher to get a new point Z. If Z is even, it is swapped with \(Z+1\); otherwise it is swapped with \(Z-1\). We then apply the inverse of the Swap-or-Not cipher applied earlier in the round to get a new point \(X'\). If X and \(X'\) are both in \(\mathcal{S}\) and an additional bit flip is 1, then the swap of X and \(X'\) becomes official; otherwise, if either the bit flip is 0 or one or both of the points are not in \(\mathcal{S}\), then X and \(X'\) are simply mapped to themselves for this round of MSU. Thus, in one round of MSU, a point X is either mapped to \(\textsf {SN}^{-1}_{\mathrm {KF}_j}(\textsf {SN}_{\mathrm {KF}_j}(X) \oplus 1)\) or it is simply mapped back to itself.

Each round of \(\textsf {MSU}\) gives a matching, that is, a permutation on \(\mathcal {X}\) made up of only transpositions. This follows from Proposition 2, which states that if \(\pi \) and \(\sigma \) are permutations, then the permutation \(\pi \circ \sigma \circ \pi ^{-1}\) has the same cycle structure as \(\sigma \). Since in \(\textsf {MSU}\) the “inner” permutation \(\sigma \) simply consists of swaps of points Z with \(Z \oplus 1\), the overall cycle structure of each round of \(\textsf {MSU}\) will also be made up of just swaps/transpositions.
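The round structure described above can be sketched in a few lines of code. This is a toy illustration under stated assumptions, not the cipher of Fig. 2: a seeded random shuffle stands in for the r-round Swap-or-Not permutation, the round function \(G_j\) is modeled as a table of random bits indexed by the larger of X and \(X'\), and all function names are hypothetical.

```python
import random


def make_round_keys(n, num_rounds, seed=0):
    """Per round: a toy permutation of range(n), its inverse, and a table of bits for G_j."""
    if n % 2 != 0:
        raise ValueError("n must be even so that z ^ 1 stays inside range(n)")
    rng = random.Random(seed)
    keys = []
    for _ in range(num_rounds):
        forward = list(range(n))
        rng.shuffle(forward)              # stand-in for Swap-or-Not, NOT the real cipher
        inverse = [0] * n
        for x, z in enumerate(forward):
            inverse[z] = x
        bits = [rng.randrange(2) for _ in range(n)]
        keys.append((forward, inverse, bits))
    return keys


def msu_round(x, forward, inverse, bits, target):
    """Mix, swap Z with Z xor 1, unmix, then keep the swap only if it is allowed."""
    z = forward[x]                        # mix
    x_prime = inverse[z ^ 1]              # swap with the neighbouring point, then unmix
    if x in target and x_prime in target and bits[max(x, x_prime)] == 1:
        return x_prime
    return x


def msu_encipher(x, keys, target):
    """Apply every round in sequence to the point x."""
    for forward, inverse, bits in keys:
        x = msu_round(x, forward, inverse, bits, target)
    return x


# Toy usage: N = 16, target set S = {0, ..., 9}, five rounds.
keys = make_round_keys(16, 5, seed=1)
target = set(range(10))
print([msu_encipher(x, keys, target) for x in sorted(target)])
```

Because the bit is looked up at \(\max (X, X')\), both endpoints of a candidate swap see the same decision, so each round is an involution and the composition of rounds is a permutation of the target set.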

Security. We now formally show this construction gives a fully-secure cipher on \(\mathcal{S}\), meaning it is indistinguishable from a random permutation even to an adversary that can see all \(|\mathcal{S}|\) input-output mappings.

Theorem 2

Let \(\textsf {MSU}\) be described as above, with m rounds, each of which uses r rounds of Swap-or-Not. Then,

$$ \mathbf {Adv}^{\mathrm {cca}}_{\textsf {MSU}}(A) \le m \cdot \varDelta _1 + \varDelta _2 $$

where \(\varDelta _1 = \frac{2N}{\sqrt{r/2+1}} \left( \frac{7}{8} \right) ^{(r/2+1)/2} + e^{\frac{-N}{40}}\) and \( \varDelta _2 = |\mathcal{S}|^{1-(2m/T)}, \) where

$$ T = \max \left( 40\ln (2|\mathcal{S}|^2),\frac{10\ln (|\mathcal{S}|/9)}{\ln (1 + (7/36N^2)((7/9)|\mathcal{S}|^2 -|\mathcal{S}|))} \right) +\frac{72N\ln (2|\mathcal{S}|^2)}{|\mathcal{S}|}. $$

Before proving the theorem, we note that the presence of the \(e^{\frac{-N}{40}}\) term means that MSU does not provide good security for very small domains. Yet, this term is not problematic for domains like those discussed in the Introduction where N is, say, \(2^{30}\).
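The terms of Theorem 2 are easy to evaluate numerically, which helps when choosing m and r. The sketch below transcribes the formulas directly; the function names are our own, and we read the factor \(7/36N^2\) in T as \(7/(36N^2)\), which is an interpretation of the typeset formula rather than something stated explicitly.

```python
import math


def msu_T(n, s):
    """The quantity T from Theorem 2 for N = n and |S| = s."""
    log_term = math.log(2 * s * s)
    growth = (7 / (36 * n * n)) * ((7 / 9) * s * s - s)
    mixing = 10 * math.log(s / 9) / math.log1p(growth)
    return max(40 * log_term, mixing) + 72 * n * log_term / s


def msu_delta1(n, r):
    """Per-round hybrid loss: Swap-or-Not term plus the bad-flag term e^(-N/40)."""
    half = r / 2 + 1
    return 2 * n / math.sqrt(half) * (7 / 8) ** (half / 2) + math.exp(-n / 40)


def msu_delta2(n, s, m):
    """Matching-exchange term |S|^(1 - 2m/T)."""
    return s ** (1 - 2 * m / msu_T(n, s))


def msu_cca_bound(n, s, r, m):
    """m * Delta_1 + Delta_2."""
    return m * msu_delta1(n, r) + msu_delta2(n, s, m)


# Example parameters (assumed): N = 2^30 and 9-digit targets.
n, s = 2**30, 10**9
print(msu_T(n, s), msu_delta2(n, s, 5000))
```

For these example parameters T comes out just under 5000, so \(\varDelta _2\) only drops below 1 once m exceeds roughly half of that, and it falls off quickly as m grows further.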

Proof

Let \(\mathcal{S}\subseteq [N]\) and let \(\textsf {MSU}: \mathcal{K}\times \mathcal{S}\rightarrow \mathcal{S}\) be the m-round Mix-Swap-Unmix algorithm as defined in Sect. 4 with randomly chosen round keys \(\mathbf {KF}\), randomly chosen round functions \(\mathbf {G}\), and using the r-round Swap-or-Not cipher on domain [N]. Let A be a CCA adversary against MSU that queries every point in \(\mathcal{S}\). We wish to bound the following advantage

$$ \mathbf {Adv}^{\mathrm {cca}}_{\textsf {MSU}}(A) = {\Pr \left[ \,{\mathsf {SPRP1}^A_\textsf {MSU}\Rightarrow 1}\,\right] } - {\Pr \left[ \,{\mathsf {SPRP0}^A_\textsf {MSU}\Rightarrow 1}\,\right] } . $$

To do so, we will use a sequence of game transitions, starting with \(\mathsf {Gm}_0 = \mathsf {SPRP1}\) and making small changes to the games until we have \(\mathsf {SPRP0}\).

For the rest of the proof, we will write \({\Pr \left[ \,{\mathsf {Gm}}\,\right] }\) instead of \({\Pr \left[ \,{\mathsf {Gm}^A\Rightarrow 1}\,\right] }\) for brevity. For our first game transition, we will modify the \(\mathsf {Enc}\) procedure to apply the round functions \(\mathbf {G}\) to the maximum of Z and \(Z'\), instead of to the max of X and \(X'\). Let the resulting game be \(\mathsf {Gm}_1\). Regardless of this change, the round function still just associates a random bit flip with the pairs of points that are matched by this round, so \({\Pr \left[ \,{\mathsf {Gm}_0}\,\right] } = {\Pr \left[ \,{\mathsf {Gm}_1}\,\right] }\).

Our next game, \(\mathsf {Gm}_2\), is the same as \(\mathsf {Gm}_1\) but with the random round functions \(\mathbf {G}\) replaced by bit flips that take place in the \(\mathbf{main }\) function and are associated to every possible odd Z value in [N]; there are separate sets of bit flips for each round of MSU (placed into a table \(\mathtt {B}\)), just as there are separate round functions \(G_j\) for each round. The \(\mathsf {Enc}\) procedure then uses the table \(\mathtt {B}\) with these bit flips in place of \(G_j\) in each round to determine if swaps take place. Detailed code for game \(\mathsf {Gm}_2\) is given in Appendix B. Since random round functions with 0/1 outputs have just been replaced by random bit flips, \({\Pr \left[ \,{\mathsf {Gm}_2}\,\right] } = {\Pr \left[ \,{\mathsf {Gm}_1}\,\right] }\).

Notice that if, for any round j, too many bit flips are 1, then a bad flag \(\mathsf {bad}_j\) is set. This will be needed later in the proof, but we point out here that the bad events only depend on the sum of independent bit flips in \(\mathbf{main }\), so we will be able to easily bound the probability of these events with a Chernoff bound.
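To illustrate the Chernoff step, suppose (as an assumption of this sketch; the exact threshold is fixed by the code in Appendix B) that a round has N/2 independent fair bit flips and that the bad flag is set when more than 3N/8 of them are 1. With mean \(\mu = N/4\) and \(\delta = 1/2\), the multiplicative Chernoff bound \(\exp (-\delta ^2\mu /(2+\delta ))\) evaluates to \(e^{-N/40}\). The hypothetical helper below compares the exact binomial tail with that bound for small N.

```python
import math


def bad_flag_probability(n):
    """Exact Pr[more than 3n/8 of n/2 fair coin flips are 1] (threshold is an assumption)."""
    flips = n // 2
    favourable = sum(math.comb(flips, k) for k in range(flips + 1) if 8 * k > 3 * n)
    return favourable / 2 ** flips


for n in (16, 40, 80, 160):
    exact = bad_flag_probability(n)
    chernoff = math.exp(-n / 40)
    print(n, exact, chernoff, exact <= chernoff)
```

The exact tail is far smaller than the Chernoff estimate for these sizes, which suggests the \(e^{-N/40}\) term is a convenient rather than tight bound.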

Our next sequence of game transitions will replace Swap-or-Not in each round of MSU with a randomly chosen permutation on [N]. But, care must be taken, since our adversary A against MSU may query all domain points, yet Swap-or-Not is only proven secure against adversaries that query a constant fraction of the domain points. Intuitively, we will be able to overcome this “gap” by only making queries to Swap-or-Not when the round bits in the table \(\mathtt {B}\) are 1.

More formally, we define a sequence of hybrid games \(\mathsf {H}_0,\ldots ,\mathsf {H}_m\). The first hybrid game, \(\mathsf {H}_0\), is identical to \(\mathsf {Gm}_2\), meaning it uses the bit table \(\mathtt {B}\) in place of random round functions. In game \(\mathsf {H}_\ell \), the first \(\ell \) rounds of MSU use a completely random permutation, while the remaining rounds use Swap-or-Not. This means that the last hybrid game, \(\mathsf {H}_m\), is identical to \(\mathsf {Gm}_2\) but with every round of MSU using a random permutation on [N] in place of Swap-or-Not. We now claim that for every \(i \in \{1,\ldots ,m\}\), \({\Pr \left[ \,{\mathsf {H}_{i-1}}\,\right] } - {\Pr \left[ \,{\mathsf {H}_{i}}\,\right] } \le \mathbf {Adv}^{\mathrm {cca}}_{\textsf {SN}}(3N/4) + e^{\frac{-N}{40}}\).

To prove this, we provide a CCA adversary B against Swap-or-Not that makes at most \(3N/4\) oracle queries. These queries will all be to the decryption oracle on the elements of \(\mathsf {odd}(N)\) for which a bit flip is 1. Adversary B will run adversary A, answering its queries using its own oracles. If adversary B has an SN oracle, then it will end up simulating \(\mathsf {H}_{i-1}\) for A, while if it has a random permutation oracle, it will end up simulating \(\mathsf {H}_i\).

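Before we get to the exact details of this adversary B, we expand the difference in the above claim to take into account the event that \(\mathsf {bad}_i\) is set to true. In the following equations, let \(\mathsf {bad}_i\) denote the event that the flag \(\mathsf {bad}_i\) (which, recall, means that too many bit flips are 1 in the part of the table \(\mathtt {B}\) used in round i of MSU) is set to \(\mathsf {true}\) during the execution of the game. Note that the probability of \(\mathsf {bad}_i\) being set to \(\mathsf {true}\) is the same in any hybrid game, since they all have identical \(\mathbf{main }\) procedures. Now, by the law of total probability,

$$\begin{aligned} {\Pr \left[ \,{\mathsf {H}_{i-1}}\,\right] } - {\Pr \left[ \,{\mathsf {H}_{i}}\,\right] }= & {} {\Pr \left[ \,{\mathsf {bad}_i}\,\right] }\left( {\Pr }\left[ \, \mathsf {H}_{i-1}\,\left| \right. \,\mathsf {bad}_i\,\right] - {\Pr }\left[ \, \mathsf {H}_{i}\,\left| \right. \,\mathsf {bad}_i\,\right] \right) \\&{} + {\Pr \left[ \,{\overline{\mathsf {bad}}_i}\,\right] }\left( {\Pr }\left[ \, \mathsf {H}_{i-1}\,\left| \right. \,\overline{\mathsf {bad}}_i\,\right] - {\Pr }\left[ \, \mathsf {H}_{i}\,\left| \right. \,\overline{\mathsf {bad}}_i\,\right] \right) . \end{aligned}$$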

We are now ready to specify our adversary B against Swap-or-Not. Adversary B is given an oracle that is either Swap-or-Not or a random permutation, and will run adversary A and try to simulate its environment to match the hybrid games \(\mathsf {H}_{i-1}\) and \(\mathsf {H}_{i}\). If B has a real Swap-or-Not oracle, then it will end up simulating \(\mathsf {H}_{i-1}\), while if it has a random permutation oracle it will end up simulating \(\mathsf {H}_{i}\). To simulate round i of the MSU algorithm, B first flips coins just like in the \(\mathbf{main }\) procedure of the hybrid games to populate the \(\mathtt {B}\) table. If the flag \(\mathsf {bad}_i\) (the bad flag for round i) gets set to \(\mathsf {true}\), meaning too many coin flips ended up as 1 for that round of MSU, then adversary B stops and simply outputs a random 0/1 guess. If \(\mathsf {bad}_i\) is not set, then B proceeds by querying its own SN oracle with all z and \(z \oplus 1\) for which \(\mathtt {B}[i][z]=1\). B now runs A and can properly complete round i of MSU for A on any query, since the only way round i can affect a point X is if the corresponding bit in \(\mathtt {B}\) is 1. Because B queried every such point, it will know what to do with any given X or \(X'\). Thus, as long as the \(\mathsf {bad}_i\) flag is not set, B will perfectly simulate the hybrid game for A.

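In the equations below, let \(\mathsf {S1}\) be short for \(\mathsf {SPRP1}^B \Rightarrow 1\) and \(\mathsf {S0}\) be short for \(\mathsf {SPRP0}^B \Rightarrow 1\). We can now write adversary B’s advantage as follows, where the \(\mathsf {bad}_i\) term vanishes because in that case B outputs a random guess regardless of its oracle:

$$\begin{aligned} \mathbf {Adv}^{\mathrm {cca}}_{\textsf {SN}}(B)= & {} {\Pr \left[ \,{\mathsf {S1}}\,\right] } - {\Pr \left[ \,{\mathsf {S0}}\,\right] }\\= & {} {\Pr \left[ \,{\mathsf {bad}_i}\,\right] }\left( {\Pr }\left[ \, \mathsf {S1}\,\left| \right. \,\mathsf {bad}_i\,\right] - {\Pr }\left[ \, \mathsf {S0}\,\left| \right. \,\mathsf {bad}_i\,\right] \right) + {\Pr \left[ \,{\overline{\mathsf {bad}}_i}\,\right] }\left( {\Pr }\left[ \, \mathsf {S1}\,\left| \right. \,\overline{\mathsf {bad}}_i\,\right] - {\Pr }\left[ \, \mathsf {S0}\,\left| \right. \,\overline{\mathsf {bad}}_i\,\right] \right) \\= & {} {\Pr \left[ \,{\overline{\mathsf {bad}}_i}\,\right] }\left( {\Pr }\left[ \, \mathsf {S1}\,\left| \right. \,\overline{\mathsf {bad}}_i\,\right] - {\Pr }\left[ \, \mathsf {S0}\,\left| \right. \,\overline{\mathsf {bad}}_i\,\right] \right) . \end{aligned}$$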

When the \(\mathsf {bad}_i\) flag is not set, adversary B running in the \(\mathsf {SPRP1}\) game is perfectly simulating the hybrid game \(\mathsf {H}_{i-1}\) and B running in \(\mathsf {SPRP0}\) is perfectly simulating the hybrid game \(\mathsf {H}_{i}\). Thus, combining the equations above gives

$$\begin{aligned} {\Pr \left[ \,{\mathsf {H}_{i-1}}\,\right] } - {\Pr \left[ \,{\mathsf {H}_{i}}\,\right] }\le & {} {\Pr \left[ \,{\mathsf {bad}_i}\,\right] } + {\Pr \left[ \,{\overline{\mathsf {bad}}_i}\,\right] }\left( {\Pr }\left[ \, \mathsf {H}_{i-1}\,\left| \right. \,\overline{\mathsf {bad}}_i\,\right] - {\Pr }\left[ \, \mathsf {H}_{i}\,\left| \right. \,\overline{\mathsf {bad}}_i\,\right] \right) \\\le & {} {\Pr \left[ \,{\mathsf {bad}_i}\,\right] } + \mathbf {Adv}^{\mathrm {cca}}_{\textsf {SN}}(B) \end{aligned}$$

where adversary B makes at most \(q = (3/4)N\) queries to its oracle. Applying the bound from Eq. (1) in Sect. 3 and Proposition 1 to our hybrid argument over m rounds gives us the \(\varDelta _1\) bound in our theorem statement.
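Written out, the hybrid argument is a telescoping sum over the m rounds. The following display is a sketch that assumes, as the statement of Theorem 2 suggests, that Eq. (1) together with Proposition 1 bounds \(\mathbf {Adv}^{\mathrm {cca}}_{\textsf {SN}}(B)\) by the first term of \(\varDelta _1\) and that the Chernoff bound gives \({\Pr \left[ \,{\mathsf {bad}_i}\,\right] } \le e^{\frac{-N}{40}}\):

$$\begin{aligned} {\Pr \left[ \,{\mathsf {H}_{0}}\,\right] } - {\Pr \left[ \,{\mathsf {H}_{m}}\,\right] }= & {} \sum _{i=1}^{m}\left( {\Pr \left[ \,{\mathsf {H}_{i-1}}\,\right] } - {\Pr \left[ \,{\mathsf {H}_{i}}\,\right] }\right) \\\le & {} \sum _{i=1}^{m}\left( {\Pr \left[ \,{\mathsf {bad}_i}\,\right] } + \mathbf {Adv}^{\mathrm {cca}}_{\textsf {SN}}(B)\right) \le m \cdot \varDelta _1. \end{aligned}$$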

Now, continuing with our game transitions, let \(\mathsf {Gm}_3\) be the same as \(\mathsf {H}_m\), but with the bit flips moved into the \(\mathsf {Enc}\) procedure and taking place at the time they are needed (in the if). This syntactic change has no effect on the output of the game. Next, we will transition from \(\mathsf {Gm}_3\) to a game \(\mathsf {Gm}_4\) in which each round of MSU now applies a randomly chosen perfect matching to X to get \(X'\) instead of computing \(X' \leftarrow \pi ^{-1}(\pi (X)\oplus 1)\).

We now claim that the new version of MSU in \(\mathsf {Gm}_4\) is actually a matching exchange process. This specific matching exchange process, where a perfect matching on [N] is then restricted to a subset \(\mathcal{S}\) (i.e., matchings that do not pair up points in \(\mathcal{S}\) are thrown out), is analyzed in Appendix A. We can apply Theorem 5 in that appendix to show that \( {\Pr \left[ \,{\mathsf {Gm}_4^A \Rightarrow 1}\,\right] } - {\Pr \left[ \,{\mathsf {SPRP0}^A_\textsf {MSU}\Rightarrow 1}\,\right] } \le \varDelta _2 \) where \(\varDelta _2\) is the bound from Theorem 5. Combining all of our bounds on the above game transitions completes the proof of Theorem 2.   \(\square \)

Discussion and Extensions. Using the \(\varDelta _1\) and \(\varDelta _2\) bounds above, we can see that we need a few hundred rounds of Swap-or-Not within each of about 5000 rounds of MSU to get low adversarial advantage. While this is still a lot of rounds, it is substantially less than the hundreds of millions of rounds needed in previous work.

Additionally, we mention two extensions of this result. First, we have presented MSU as a cipher on [N] that can be targeted to a domain \(\mathcal{S}\subseteq [N]\). If we are only interested in a cipher on [N] and do not need targeting, then we can improve the full security bound in Theorem 2 by applying a recent result of Bernstein [5] on the mixing time of involution walks, which are essentially one type of matching exchange process. More details can be found in Appendix A, but our \(\varDelta _2\) term in the above theorem will become the value in Corollary 2. Then, for the case where \(N=10^9\), we will only need about 220 rounds of MSU to get the \(\varDelta _2\) term less than \(10^{-9}\).

Second, our MSU algorithm as described and analyzed above works best when \(|\mathcal{S}| \ge |\mathcal {X}|/2\). If the target set is smaller than that, we can show the round function \(\mathbf {G}\) (which essentially does bit flips that determine if a swap should take place) can be removed, which speeds up mixing. Showing this is non-trivial, since the resulting algorithm no longer appears to be a matching exchange process. We analyze the resulting process in more detail in Appendix A, Corollary 3.