Keywords

1 Introduction

A new distinguishing property against block ciphers, called the division property, was recently introduced by Todo [25]. This property, that can be seen as a generalization of integral [9, 16] and higher-order differential [15, 17] distinguishers, was used to present new generic distinguishers against both the SPN and the Feistel constructions. Later, this attack was used by the same author to present the first cryptanalysis of the full block cipher Misty [24].

If \(u = (u_1, \dots , u_n)\) is a vector of \(\mathbf {F}_2^n\), we denote by \(x^u\) the coordinate product \(x = (x_1, \dots , x_n) \mapsto \prod _{i=1}^n x_i^{u_i}\). The division property, as introduced by Todo [25], is interested in the sum of this quantity taken over all vectors of X. More precisely, we say that a set \(X \subseteq \mathbf {F}_2^n\) has the division property \(\mathcal {D}_k^n\), for some \(1 \le k \le n\), if the sum over all vectors x in X of the product \(x^u\) equals 0, for all vectors u that have a Hamming weight strictly less than k, i.e.

$$\bigoplus _{x \in X}x^u = 0 \text { for all } u \in \mathbf {F}_2^n \text { such that } wt(u) <k.$$

The division property then generalizes integral attacks in the sense that \(\mathcal {D}^n_2\) means that the set \(X\) is balanced, while \(\mathcal {D}^n_n\) means that it is saturated. But the novelty is that it introduces intermediate properties, \(\mathcal {D}^n_k\) for \(3 \le k \le n-1\), which do not appear in classical integral attacks. Even if these intermediate properties do not have a simple interpretation like \(\mathcal {D}_2^n\) and \(\mathcal {D}_n^n\), they allow to easily propagate the property through the successive rounds of a cipher by capturing some information resulting from the algebraic degree of the round function. In a nutshell, the distinguishers described by Todo in [24, 25] are classical higher-order differential distinguishers, but they are exhibited by exploiting the classical properties used in integral attacks together with some algebraic properties related to the degree of several iterations of a nonlinear function like in [6, 8].

Our Contribution. This work aims at providing new insights into the division property, presenting a new approach to it. This new approach enables us to provide a simpler formulation and interpretation of the division property of any order. It also improves the strength of the distinguishers that exploit this type of properties. For this, we introduce a new notion, that we call the parity set. The parity set of a set \(X \subseteq \mathbf {F}_2^n\) is nothing more than the set of all exponents \(u \in \mathbf {F}_2^n\) such that \(\bigoplus _{x \in X}x^u = 1\). The main advantage of this new notion is that it completely characterises a set X in the sense that there is a one-to-one correspondence between sets and their parity sets. It also provides a very simple formulation of the division property. In particular, we show that the division property of any order can be expressed in an elegant way by using the theory of Reed-Muller codes. One of the first questions we investigate in this work is what does it mean for a set X to have the division property \(\mathcal {D}_k^n\) for some special values of k. As previously explained, this question was treated for \(k \in \{1,2, n\}\) in previous works [14, 22, 25]. However, our approach, and especially the link with the Reed-Muller codes, permits us to recover in a much simpler way these previous results and to characterize the property for some other values of \(k\).

We investigate next the question of how to build distinguishers for keyed permutations by means of parity sets. For this, we start by analyzing the distinguishers built by Todo in [25] and formulate them in terms of parity sets. These distinguishers were of a generic nature as they only exploited the classical integral properties and the propagation of the algebraic degree through the successive non-linear layers. It is thus natural to believe that these distinguishers can be improved if additional information besides the degree is taken into account, in the same spirit as in cube distinguishers [1, 12]. We investigate this issue here and provide a way to exploit these more precise properties. We then further show how to find distinguishers on iterated block ciphers, especially on substitution-permutation networks, by propagating some information on the parity set of the output set through the successive rounds of the cipher. For this, we provide a detailed analysis of the evolution of the parity set through the basic operations of the round function of an SPN cipher.

We illustrate the above technique by constructing low-data distinguishers on the Present block cipher. With this example we aim at particularly showing how the generic distinguishers can be improved when the properties of the linear and the Sbox layer are taken into account. We manage to provide a distinguisher on 6 rounds of Present with data complexity \(2^{12}\), while the generic distinguisher from [25], reaches only 3 rounds for the same quantity of data. Finally, we analyze the resistance of Sboxes against this type of attack, show a link with the algebraic normal forms of the Sbox and of its inverse, and we give a criterion that an Sbox must satisfy in order to resist this kind of attacks.

Organization of the Paper. The rest of the paper is organized as follows. Section 2 introduces the notion of the parity set of a set and shows how it is related to Reed-Muller codes. Section 3 presents the link between this new notion and the division property, and it characterizes the division property of any order. It also focuses on the division property of low and high orders. Section 4 explains how to build distinguishers by means of parity sets and Sect. 5 analyzes the special case of SPN ciphers. In Sect. 6, low-data distinguishers for the block cipher Present are presented. Finally, Sect. 7 discusses the properties that an Sbox must exhibit in order to resist the above attacks.

2 Parity Set of a Set

2.1 Preliminaries

A Boolean function of \(n\) variables can be alternatively represented as a multivariate polynomial, named the Algebraic Normal Form (aka ANF) of the function, or as a \(2^n\)-bit vector (named the value vector) corresponding to all \(f(x), x\in \mathbf {F}_2^n\).

Polynomial Representation. We use the following notation for the monomials of \(n\) variables where \(u\) is an element of \(\mathbf {F}_2^n\):

$$\begin{aligned} x^u = \prod _{i=1}^n x_i^{u_i}. \end{aligned}$$

The following well-known lemma will be extensively used for evaluating a monomial at a given point.

Lemma 1

Let \(x\) and \(u\) be two \(n\)-bit words. Then \(x^u=1\) if and only if \(u \preceq x\), i.e., \(u_i \le x_i\) for all \(1 \le i \le n\).

The previous relation between \(n\)-bit words is a partial order. It equivalently means that the support of \(u\) is included in the support of \(x\). In the whole paper, we will use the following notation for the set of all words less than (resp. greater than) a given word with respect to this partial order.

Notation 1

Let \(u \in \mathbf {F}_2^n\). Then, we define

$$\begin{aligned} \mathsf {Prec}(u)= & {} \{x \in \mathbf {F}_2^n~: x \preceq u\}\\ \mathsf {Succ}(u)= & {} \{x \in \mathbf {F}_2^n~: u \preceq x\}. \end{aligned}$$

It is worth noticing that \(\mathsf {Prec}(u)\) is a linear subspace of dimension \(wt(u)\), while \(\mathsf {Succ}(u)\) is an affine subspace of dimension \((n-wt(u))\).

Value Vector. When a Boolean function is represented by its value vector, it is often convenient to use the terminology coming from coding theory since Boolean functions have been widely studied for error-correction. In this context, the value vector of a function is seen as a codeword from a Reed-Muller code [20, 21].

Definition 1

(Reed-Muller Codes). Let \(n\) be a positive integer and \(r\) an integer such \(0 \le r \le n\). The \(r\) -th order binary Reed-Muller code of length \(2^n\), denoted by \(\mathcal {R}(r,n)\), is the set formed by the value vectors of all Boolean functions of \(n\) variables with degree at most \(r\):

$$\begin{aligned} \mathcal {R}(r,n) = \{(f(x), x \in \mathbf {F}_2^n),\; f\!: \mathbf {F}_2^n \rightarrow \mathbf {F}_2 \text{ with } \deg f \le r\}. \end{aligned}$$

2.2 Parity Set of a Set

We now define a new notion named parity set. We will show that any set is characterized by its parity set. Any property of a set can then also be expressed in terms of its parity set, and we will show that the division property has a very simple expression by means of parity sets.

Definition 2

Let \(X\) be a set of elements in \(\mathbf {F}_2^n\). The parity set of \(X\), denoted by \(\mathcal {U}(X)\), is the subset of \(\mathbf {F}_2^n\) defined by

$$\begin{aligned} \mathcal {U}(X) = \{u \in \mathbf {F}_2^n : \bigoplus _{x \in X} x^u = 1\}. \end{aligned}$$

The parity set provides a complete characterization of a set, as shown by the following results.

Lemma 2

Let \(G\) be the \(2^n \times 2^n\) binary matrix whose entries are indexed by \(n\)-bit vectors and defined by

$$\begin{aligned} G_{u,a} = a^u, \; a,u \in \mathbf {F}_2^n. \end{aligned}$$

For any subset \(X\) of \(\mathbf {F}_2^n\), the incidence vector of \(\mathcal {U}(X)\) is equal to the product of \(G\) by the incidence vector of \(X\).

Proof

The incidence vector of a set \(X\), \(v_X\), is the \(2^n\)-bit vector having a one at position \(x \in \mathbf {F}_2^n\) if and only if \(x \in X\). Then, \(Gv_X\) is equal to the sum of all columns of \(G\) indexed by the elements in the support of \(v_X\), i.e., indexed by the elements in \(X\):

$$\begin{aligned} (Gv_X)_u = \bigoplus _{x \in X} x^u. \end{aligned}$$

By definition, the support of \(v_{\mathcal {U}(X)}\) is then the set of all positions \(u\) such that \((Gv_X)_u=1\).    \(\square \)

We can now deduce that there is a one-to-one correspondence between sets and their parity sets.

Theorem 1

Let \(G\) be the \(2^n \times 2^n\) binary matrix defined by

$$\begin{aligned} G_{u,a} = a^u, \; a,u \in \mathbf {F}_2^n. \end{aligned}$$

Then, \(G\) is non-singular and \(G^{-1}=G\). Therefore, for any subset \(U\) of \(\mathbf {F}_2^n\), there exists a unique set \(X \subset \mathbf {F}_2^n\) such that \(\mathcal {U}(X)=U\).

Proof

The fact that \(G\) is non-singular can be deduced by using that it is a generator matrix of the Reed-Muller code of length \(2^n\) and order \(n\). This code has dimension \(2^n\) [19, p. 376], i.e., \(G\) is invertible. Its inverse is equal to \(G\) itself. Indeed, for any \(u,w \in \mathbf {F}_2^n\), we have

$$\begin{aligned} (G \times G)_{u,w}= & {} \bigoplus _{v \in \mathbf {F}_2^n} G_{u,v} G_{v,w} = \bigoplus _{v \in \mathbf {F}_2^n} v^u w^v\\= & {} |\{v \in \mathbf {F}_2^n~: u \preceq v \text{ and } v \preceq w\}| \bmod {2}\\= & {} \left\{ \begin{array}{l} 2^{wt(w) - wt(u)} \bmod {2} \text{ if } u \preceq w \\ 0 \text{ otherwise. }\end{array}\right. \end{aligned}$$

We then deduce that \((G \times G)_{u,w}=1\) if and only if \(u=w\), i.e., \(G \times G = \mathsf {Id}\). As a direct consequence, we get that the mapping \(v_X \mapsto v_{\mathcal {U}(X)}\) is an isomorphism of the set of \(2^n\)-bit vectors.    \(\square \)

The fact that \(G\) is involutive provides a simple way to find the set \(X\) corresponding to a given parity set \(U\). Indeed, \(X\) corresponds to the parity set of \(U\). Some useful examples are described in the following corollary.

Corollary 1

Let \(X\) be a subset of \(\mathbf {F}_2^n\). Then,

  • \(\mathcal {U}(X)\) is empty if and only if \(X\) is empty.

  • \(\mathcal {U}(X) =\mathsf {Prec}(x)\) if and only if \(X=\{x\}\).

  • \(\mathcal {U}(X) =\{u\}\) if and only if \(X=\mathsf {Prec}(u)\).

  • \(\mathcal {U}(X) =\{\underline{1}\}\) if and only if \(X=\mathbf {F}_2^n\),

where \(\underline{1}\) denotes the all-one vector in \(\mathbf {F}_2^n\).

3 New Insights into the Division Property

3.1 The Division Property by Means of Parity Sets

The division property introduced by Todo in [25] is a distinguishing property of the set \(E_k(X)\) for a given choice of the input set \(X\), where \(E_k\) is (typically) a keyed permutation. This property must be independent from the choice of the secret key. We now reformulate the division property of order \(k\), \(\mathcal {D}^n_k\), on a set \(X\) by a simple property of \(\mathcal {U}(X)\). Indeed, \(\mathcal {D}^n_k\) corresponds to a lower bound on the weights of all elements in \(\mathcal {U}(X)\).

Definition 3

A set \(X\) of elements in \(\mathbf {F}_2^n\) is said to fulfill the division property of order \(k\), \(\mathcal {D}^n_k\), if all elements in \(\mathcal {U}(X)\) have weight at least \(k\), i.e.,

$$\begin{aligned} \mathcal {U}(X) \subseteq \{u \in \mathbf {F}_2^n : wt(u) \ge k\}. \end{aligned}$$

It is worth noticing that in [25], the division property is defined for a multiset, i.e., the elements in \(X\) may appear with some multiplicity. However, the original division property for a multiset \(X\) equivalently corresponds to the division property for the set composed of all elements in \(X\) having an odd multiplicity. Therefore, we will only focus on sets, instead of multisets.

As a direct consequence of the matrix relationship exhibited in Lemma 2, we deduce the following two characterizations of the incidence vectors of the sets satisfying the division property of order \(k\).

Proposition 1

Let \(X\) be a set of elements in \(\mathbf {F}_2^n\) and \(k\) be an integer \(1 \le k \le n\). Then, the following assertions are equivalent:

  • (i) \(X\) fulfills the division property of order \(k\), \(\mathcal {D}^n_k\).

  • (ii) The incidence vector of \(X\) belongs to the Reed-Muller code of length \(2^n\) and order \((n-k)\).

  • (iii) The incidence vector of \(X\) belongs to the dual of the Reed-Muller code of length \(2^n\) and order \((k-1)\).

Proof

Assertion (ii) equivalently means that the incidence vector of \(\mathcal {U}(X)\) vanishes at all positions \(u\) with \(wt(u) \le k-1\). This means that, if \(G'\) denotes the restriction of \(G\) to the rows of index \(u\) with \(wt(u) < k\), then \(G' v_{X}\) is the all-zero vector. But \(G'\) is a generator matrix of the Reed-Muller code of length \(2^n\) and order \((k-1)\). The set of all \(v_{X}\) such that \(G'v_{X}=0\) is therefore the dual (i.e., the orthogonal) of \(\mathcal {R}(k-1,n)\). It is well-known (see e.g. [19, p. 375]) that, for any \(r\), the dual of \(\mathcal {R}(r,n)\) is the Reed-Muller code \(\mathcal {R}(n-r-1,n)\). We deduce that \(G' v_{X}=0\) if and only if \(v_{X} \in \mathcal {R}(n-k,n)\).    \(\square \)

The first one of the previous characterization, (ii), has been independently exhibited by Khovratovich [14], while the equivalent formulation (iii) is new.

Using the minimum distance of the Reed-Muller codes, we recover very easily a result from [22] on the minimal size of a set satisfying \(\mathcal {D}^n_k\). More importantly, we are able to characterize the sets of minimal size satisfying \(\mathcal {D}^n_k\).

Proposition 2

Let \(X\) be a non-empty set of elements in \(\mathbf {F}_2^n\) satisfying \(\mathcal {D}^n_k\). Then

$$\begin{aligned} |X| \ge 2^k. \end{aligned}$$

Moreover, a set \(X\) of size \(2^k\) satisfies \(\mathcal {D}^n_k\) if and only if \(X\) is an affine subspaceFootnote 1 of dimension \(k\).

Proof

We here use that \(X\) satisfies \(\mathcal {D}^n_k\) if and only if \(v_{X}\) belongs to \(\mathcal {R}(n-k,n)\). It is well-known that the minimum distance of \(\mathcal {R}(n-k,n)\) is \(2^k\) [19, p. 375]. Using that \(|X|=wt(v_X)\), we deduce that \(|X| \ge 2^k\).

Moreover, it is known that the minimum-weight codewords in \(\mathcal {R}(n-k,n)\) are the incidence vectors of the affine subspaces of dimension \(k\) [19, p. 380]. It follows that a set of size \(2^k\) satisfies \(\mathcal {D}^n_k\) if and only if it is an affine subspace of dimension \(k\).    \(\square \)

3.2 Division Property of Low Order

Since the codewords of Reed-Muller codes of low order have a very simple form, a simple characterization of the division properties of low order directly follows. The Reed-Muller code \(\mathcal {R}(0,n)\) consists of the all-zero and all-one words, \(\mathcal {R}(0,n)=\{\underline{0}, \underline{1}\}\), and \(\mathcal {R}(1,n)\setminus \mathcal {R}(0,n)\) is composed of all incidence vectors of affine hyperplanes. Then, we easily recover the characterization of the division properties of order \(1\) and \(2\) exhibited in [25]:

  • \(X\) fulfills \(\mathcal {D}^n_1\) if and only if its cardinality is even.

  • \(X\) fulfills \(\mathcal {D}^n_2\) if and only if its cardinality is even and it has the Balance property [16], i.e., \(\oplus _{x \in X} x =0\).

Then, the division property of order \(k > 2\) generalizes the balance property used in integral attacks [16] in the following sense.

Proposition 3

Let \(X\) be a set of elements in \(\mathbf {F}_2^n\). Then, the following assertions are equivalent:

  • (i) \(X\) fulfills the division property of order \(k\), \(\mathcal {D}^n_k\).

  • (ii) For any set of coordinates \(\{i_1, \ldots , i_t\} \subseteq \{1, \ldots ,n \}\) of size \(t < k\) and any constant \(\alpha \in \mathbf {F}_2^t\), the number of elements in \(X\) such that \(x_{i_j}=\alpha _j\) for all \(1 \le j \le t\) is even.

  • (iii) For any set of coordinates \(\{i_1, \ldots , i_t\} \subseteq \{1, \ldots ,n \}\) of size \(t < k\), the number of elements in \(X\) such that \(x_{i_j}=0\) for all \(1 \le j \le t\) is even.

Proof

  • (i) \(\Rightarrow \) (ii) Let \(I=\{i_1, \ldots , i_t\}\) be a set of coordinates of size \(t < k\), and \(u\) be the vector in \(\mathbf {F}_2^n\) having support \(I\). Then,

    $$ \{x \in X : x_{i_j} = \alpha _j , 1 \le j \le t\} = \{x \in X : (x \oplus \beta ) \succeq u\} $$

    where \(\beta \) is the \(n\)-bit vector such that \(\beta _{i_j} = \alpha _j \oplus 1\) for \(1 \le j \le t\), and \(\beta _i = 0\) if \(i \not \in I\). It follows that

    $$\begin{aligned} |\{x \in X : x_{i_j} = \alpha _j , 1 \le j \le t\}| \bmod {2}= & {} \bigoplus _{x \in X} (x \oplus \beta )^u= \bigoplus _{x \in X} \bigoplus _{v \preceq u} x^v \beta ^{u \oplus v} \\= & {} \bigoplus _{v \preceq u} \beta ^{u \oplus v} \left( \bigoplus _{x \in X} x^v\right) =0 \end{aligned}$$

    since the division property of order \(k\) implies that all \(\bigoplus _{x \in X} x^v\) vanish when \(wt(v) \le wt(u) < k\).

  • (ii) \(\Rightarrow \) (iii) Trivial.

  • (iii) \(\Rightarrow \) (i) Let \(u \in \mathbf {F}_2^n\) with \(wt(u)<k\). We have

    $$\begin{aligned} \bigoplus _{x \in X} x^u= & {} \bigoplus _{x \in X} ((x\oplus u) \oplus u)^u = \bigoplus _{v \preceq u} \bigoplus _{x \in X} (x \oplus u)^v u^{u \oplus v} \\= & {} \bigoplus _{v \preceq u} \bigoplus _{x \in X} (x \oplus u)^v = \bigoplus _{v \preceq u} |\{x \in X : x_i=0, \forall i \in \mathsf {Supp}(v)\}| \bmod {2}. \end{aligned}$$

    From (iii), all sets involved in the previous sum have even size because \(wt(v) < k\). We then deduce that \(\bigoplus _{x \in X} x^u =0\), i.e. \(X\) fulfills the division property of order \(k\).

   \(\square \)

It is worth noticing that, more generally, the division property \(\mathcal {D}^n_k\) implies that the number of elements in \(X \cap A\) is even for any affine subspace \(A\) of dimension \(t > n-k\) (see e.g. [7, Proposition III.1]).

As a direct corollary, we can for instance characterize the division property of order \(3\).

Corollary 2

Let \(X\) be a set of elements in \(\mathbf {F}_2^n\). Then, \(X\) fulfills the division property of order \(3\), \(\mathcal {D}^n_3\), if and only if \(X\) and all the \(n\) subsets

$$\begin{aligned} \{x \in X \text{ with } x_i=0\}, \; 1 \le i \le n, \end{aligned}$$

satisfy the balance property.

Example 1

The following set \(X \in \mathbf {F}_2^5\) composed of \(12\) elements satisfy the division property of order \(3\):

\(x_1\)

0

0

0

0

1

1

1

1

1

1

1

1

\(x_2\)

0

1

0

1

0

1

0

1

1

0

1

0

\(x_3\)

1

0

0

1

1

0

0

1

1

0

1

0

\(x_4\)

0

1

0

1

0

1

0

0

0

1

1

1

\(x_5\)

1

1

1

1

1

1

0

1

0

0

0

1

3.3 Division Property of High Order

The division property of maximal order, i.e., \(\mathcal {D}^n_n\), obviously corresponds to the fact that \(X\) is either empty, or equal to the whole set \(\mathbf {F}_2^n\) (see the last item in Corollary 1). But, we are also able to characterize all sets satisfying \(\mathcal {D}^n_{n-1}\).

Proposition 4

Let \(X\) be a set of elements in \(\mathbf {F}_2^n\). Then \(X\) fulfills \(\mathcal {D}^n_{n-1}\) and not \(\mathcal {D}^n_{n}\) if and only if \(X\) is an (affine) hyperplane of \(\mathbf {F}_2^n\).

Proof

Let \(v\) denote the incidence vector of \(X\). From the previous proposition, we have \(X\) satisfies \(\mathcal {D}^n_{n-1}\) and not \(\mathcal {D}^n_{n}\) if and only if \(v \in \mathcal {R}(1,n)\setminus \mathcal {R}(0,n)\). This set consists of the incidence vectors of all (affine) hyperplanes of \(\mathbf {F}_2\). Then, this equivalently means that \(X\) is an (affine) hyperplane.    \(\square \)

For instance, it can be easily checked that the multiset of elements of \(\mathbf {F}_2^4\) defined in [25, p. 293]

$$\begin{aligned} \{\mathtt{0x0, 0x3, 0x3, 0x3, 0x5, 0x6, 0x8, 0xb, 0xd, 0xe}\}, \end{aligned}$$

satisfies \(\mathcal {D}^4_3\) because the corresponding set composed of all elements with an odd multiplicity

$$\begin{aligned} \{\mathtt{0x0, 0x3, 0x5, 0x6, 0x8, 0xb, 0xd, 0xe}\} \end{aligned}$$

is a linear subspace of dimension \(3\) spanned by \(\{\mathtt{0x3, 0x5, 0x8}\}\).

4 Distinguishers Based on Parity Sets

We now investigate how we can build some distinguishers for a given keyed permutation \(E_K\) by means of parity sets. The basic idea consists in choosing an input set \(X\) such that the parity set of the corresponding output set \(E_K(X)\) has some specific property for any choice of the key.

Since the size of \(X\) determines the data complexity of the distinguisher, it has to be as small as possible. For this reason, \(X\) is always chosen to be an affine subspace since subspaces are the smallest possible sets satisfying the division property of a given order (see Proposition 2).

4.1 Todo’s Distinguishers

The strategy proposed by Todo to build a distinguisher is to exhibit an affine subspace \(a+V\) such that the corresponding output set \(E_K(a+V)\) satisfies the division property of order \(2\), i.e., such that \(E_K(a+V)\) is balanced. This property can be easily interpreted in terms of higher-order derivatives in the sense of the following definition.

Definition 4

[17] Let \(F\) be a function from \(\mathbf {F}_2^n\) into \(\mathbf {F}_2^m\). Let \(a \in \mathbf {F}_2^n\). The derivative of \(F\) with respect to \(a\) is the function from \(\mathbf {F}_2^n\) into \(\mathbf {F}_2^m\) defined by

$$\begin{aligned} D_a F(x) = F(x\oplus a) \oplus F(x). \end{aligned}$$

For any \(k\)-dimensional subspace \(V\) of \(\mathbf {F}_2^n\) and for any basis of \(V\), \(\{a_1, \ldots , a_k\}\), the \(k\)-th order derivative of \(F\) with respect to \(V\) is the function defined by

$$\begin{aligned} D_V F(x) = D_{a_1}D_{a_2}\ldots D_{a_k} F(x) = \bigoplus _{v \in V} F(x + v)\; \end{aligned}$$

We introduce now the following notation.

Notation 2

Let P be a permutation of \(\mathbf {F}_2^n\) and \(P_1, P_2, \dots , P_n\) be the n coordinates of P. If \(x = (x_1,\dots , x_n)\) and \(u = (u_1,\dots ,u_n)\) are vectors of \(\mathbf {F}_2^n\), we denote by \(P^u(x)\) the coordinate product \(\prod _{i=1}^nP_i(x)^{u_i}\).

Proposition 5

Let \(P\) be a permutation of \(\mathbf {F}_2^n\). Let \(V\) be the linear subspace of \(\mathbf {F}_2^n\) and \(a \in \mathbf {F}_2^n\). Then, an element \(u\) belongs to \(\mathcal {U}(P(a+V))\) if and only if the derivative of \(P^u\) with respect to \(V\) satisfies

$$\begin{aligned} D_V P^u(a) = 1. \end{aligned}$$

In the particular case where \(V = \mathsf {Prec}(v)\) for some \(v \in \mathbf {F}_2^n\), the following formulations are equivalent:

  • (i) For all \(a \in \mathbf {F}_2^n\), \(u \not \in \mathcal {U}(P(a+V))\)

  • (ii) The algebraic normal form of the Boolean function \(x \mapsto P^u(x)\) contains no monomial multiple of \(x^v\)

  • (iii) The superpoly of \(v\) in \(x \mapsto P^u(x)\) vanishes.

Proof

The fact that \(D_V P^u(a)=1\) if and only if \(u \in \mathcal {U}(P(a+V))\) is directly deduced from the definition of the derivative with respect to \(V\) using that

$$\begin{aligned} D_V P^u(a) = \bigoplus _{x \in a+V} P^u(x). \end{aligned}$$

The superpoly of \(v\) in \(P^u\) is defined [12] as the Boolean function \(p_v\) such that

$$\begin{aligned} P^u(x) = x^v p_v(x) + q(x) \end{aligned}$$

where \(q\) does not contain any monomial multiple of \(x^v\). Moreover, it has been proved in [12, Theorem 1] that

$$\begin{aligned} p_v(a) = \bigoplus _{x \preceq v} P^u(a \oplus x) = D_V P^u(a). \end{aligned}$$

It then follows that, for \(V = \mathsf {Prec}(v)\), \(u \not \in \mathcal {U}(P(a+V))\) for all \(a\) if and only if the superpoly of \(v\) in \(P^u\) vanishes, which equivalently means that \(P^u\) does not contain any monomial multiple of \(x^v\).    \(\square \)

4.2 Improving Todo’s Distinguishers

The distinguishers presented in [25] correspond to the existence of a word \(v\) such that \(E_K(a+\mathsf {Prec}(v))\) satisfies the division property of order \(2\) for all \(a\), which equivalently means that the monomial \(x^v\) does not appear in any coordinate of \(E_K\). Since these distinguishers are constructed by propagating some information on the smallest Hamming weight of the elements in the parity set, they are based on the fact that the weight of \(v\) exceeds the degree of \(E_K\), where the degree of the coordinates of the cipher after several iterations can be upper bounded by exploiting the techniques introduced in [5, 6]. However, it clearly appears that this type of distinguishers can be improved in the following two directions:

  • it may happen that a given monomial \(x^u\) does not appear in the coordinates of \(E_K\) even if \(wt(u) \le \deg P\). This type of property, derived from the sparsity of some coordinates of the cipher, has been extensively used in cube attacks, e.g. [1, 11, 13].

  • it may happen that a given monomial \(x^u\) appears in one coordinate of \(E_K\) but not in all functions \(x \mapsto E_K^v(x)\). Then, \(E_K(a + \mathsf {Prec}(u))\) does not fulfill the division property of order \(2\); instead we obtain a weaker distinguisher based on the fact that a given \(v\) does not belong to the parity set of \(E_K(a + \mathsf {Prec}(u))\).

These two ways of exploiting some additional information besides the degree of the function are illustrated in the following toy example. In the rest of the paper, binary words are represented in hexadecimal notation where the least significant bit corresponds to the rightmost bit in the binary word.

Example 2

Let us consider the \(4\)-bit Sbox \(S\) used in Present [2]. This Sbox has degree \(3\) which is the maximal degree for a permutation of \(\mathbf {F}_2^4\). The rows of Table 1 describes all sets

$$\begin{aligned} V_s(u) = \{v \in \mathbf {F}_2^n : S^v(x) \text{ contains } x^u\}. \end{aligned}$$

In others words, the entry at Row \(u\) and Column \(v\) in this table is an x if and only if \(x^u\) appears in the ANF of \(x \mapsto S^v(x)\). Equivalently, the column of index \(v\) in this table corresponds the list of all monomials in the ANF of \(x \mapsto S^v(x)\).

Table 1. Sets \(V_S(u)\) for all \(u \in \mathbf {F}_2^4\) for the Present Sbox. All \(4\)-bit words are represented in hexadecimal notation, and the rightmost bit of the word corresponds to the least significant bit.

Clearly, we cannot exhibit any \(u\) such that \(S(a+\mathsf {Prec}(u))\) fulfills the division property of order \(2\) for all \(a\) (i.e. such that no coordinate of \(S\) contains \(x^u\)) if we exploit the degree of the Sbox only. Indeed, using that \(\deg (S)=3\), we deduce that the all-one vector \(u = \mathtt{0xf}\) is the only value satisfying this property. This does not provide any distinguisher since this holds for all permutations.

However, distinguishers can be found by using that not all \(u \in \mathbf {F}_2^4\) of weight less than or equal to \(3\) appear in the ANF of the coordinates of \(S\). Indeed \(u=\mathtt{0xe}\) does not appear in any of the coordinates of \(S\), i.e., for \(u=\mathtt{0xe}\), \(S(a+\mathsf {Prec}(u))\) fulfills the division property of order \(2\) for all \(a\). It is worth noticing that this is the only value of \(u\) which provides such a distinguisher.

Instead of searching for \(u\) such that the parity set of \(S(a+\mathsf {Prec}(u))\) does not contain any word of weight \(1\), we can exhibit a few values \(v\) which belong to none of the sets \(S(a+\mathsf {Prec}(u))\), \(a \in \mathbf {F}_2^4\). For instance, we can observe that neither \(S^{\mathtt{1}}\) nor \(S^{\mathtt{e}}\) contains a multiple of \(x^{\mathtt{3}} = x_1x_2\). This means that \(v=\mathtt{0x1}\) and \(v=\mathtt{0xe}\) do not belong to a set \(S(a+\mathsf {Prec}(\mathtt{0x3}))\). Similarly, the sets \(S(a+\mathsf {Prec}(\mathtt{0x9}))\) do not contain \(v=\mathtt{0x1}\) and \(v=\mathtt{0xb}\). The sets \(S(a+\mathsf {Prec}(\mathtt{0xc}))\) do not contain \(v=\mathtt{0x1}\) and \(v=\mathtt{0x6}\).

Clearly, these two ideas allow us to decrease the dimension of the input subspace \(X=a+\mathsf {Prec}(u)\) involved in the distinguisher. But the distinguisher based on the division property of order \(2\) is obviously stronger than the second one. A precise evaluation of the advantages of these distinguishers is provided in Appendix.

5 Exhibiting Distinguishers on SPN by Means of Parity Sets

We now show how to find some distinguishers on iterated block ciphers, especially on substitution-permutation networks, by propagating some information on the parity set of the output set through the successive rounds of the cipher. As previously explained, we choose as input set an affine subspace.

5.1 Propagation Through Key Addition

One of the difficulties for finding a distinguisher for a block cipher is that the distinguishing property must hold for any value of the secret key. For this reason, we need to exploit a property which can be easily propagated through the operation inserting the round key, which is usually an XOR. This is the case of differential properties, or of the algebraic degree. We can show that the parity set can also be easily propagated.

Proposition 6

Let \(X\) be a subset of \(\mathbf {F}_2^n\) with parity set \(\mathcal {U}(X)\). Then, for any \(k \in \mathbf {F}_2^n\), the parity set of \((k+X)\) satisfies

$$\begin{aligned} \mathcal {U}(k+X) \subseteq \bigcup _{u \in \mathcal {U}(X)} \mathsf {Succ}(u). \end{aligned}$$

Proof

We use that

$$\begin{aligned} (x \oplus k)^v = \bigoplus _{u \preceq v} x^{v} k^{v \oplus u}. \end{aligned}$$

It follows that

$$\begin{aligned} \bigoplus _{x \in X} (x \oplus k)^v = \bigoplus _{x \in X}\bigoplus _{u \preceq v} x^{u} k^{v \oplus u}= \bigoplus _{u \preceq v}k^{v \oplus u} \left( \bigoplus _{x \in X}x^{u}\right) . \end{aligned}$$

Then, this sum equals zero for all \(k \in \mathbf {F}_2^n\) if \(\bigoplus _{x \in X}x^{u}=0\) for all \(u\) such that \(u \preceq v\). In other words, if the sum equals one, then there exists \(u\) in \(\mathcal {U}(X)\) such that \(u \preceq v\), i.e., \(v\) satisfies \(v \succeq u\) for at least one \(u \in \mathcal {U}(X)\).    \(\square \)

It is worth noticing that there is no general improvement of the previous result which holds without any further assumption on \(k\) or on \(X\). Indeed, it is easy to check that, when \(X = \{u\}\), we have that any \(v \in \mathsf {Succ}(u)\) belongs to \(\mathcal {U}(k+X)\) for some \(k\) (e.g. \(k=(u+v)\) satisfies this property). Thus, \(\mathsf {Succ}(u)\) is the smallest set which contains the parity sets of all cosets \((k+X)\) in this case.

5.2 Propagation Through an Sbox

We now investigate how a parity set propagates through a permutation, for instance through an Sbox or through a linear permutation.

Proposition 7

Let \(S\) be a permutation of \(\mathbf {F}_2^n\). For any \(v \in \mathbf {F}_2^n\), we define

$$\begin{aligned} V_S(u).= & {} \{v \in \mathbf {F}_2^n : S^v(x) \text{ contains } x^u\}\\ \end{aligned}$$

Then, for any set \(X\) of elements of \(\mathbf {F}_2^n\),

$$\begin{aligned} \mathcal {U}(S(X)) \subseteq \bigcup _{u \in \mathcal {U}(X)} V_S(u). \end{aligned}$$

Proof

By definition, the vectors \(v\) which may be in \(\mathcal {U}(S(X))\) are those such that \(S(x)^v\) contains a monomial \(x^u\) with \(u \in \mathcal {U}(X)\). Otherwise, we have that \(\bigoplus _{x \in X} S(x)^v = 0\). The result then directly follows.    \(\square \)

We will discuss in more details in Sect. 7 the properties of an Sbox which make it resistant or not to this attack. It is worth noticing that the previous proposition applies to any permutation \(S\), including the case where \(S\) corresponds to the linear layer of the cipher.

Another case of interest is the case where the Sbox can be seen as the concatenation of several independent Sboxes, like in a typical Sbox layer.

Proposition 8

Let \(X\) be a set of elements in \(\mathbf {F}_2^{mt}\) and let \(S\) be an Sbox over \(\mathbf {F}_2^{mt}\) which consists of the parallel application of \(t\) Sboxes \(S_1, \ldots , S_t\) over \(\mathbf {F}_2^m\): \(S(x_1, \ldots , x_t) = (S_1(x_1), \ldots , S_t(x_t))\). Then,

$$\begin{aligned} \mathcal {U}(S(X)) \subseteq \bigcup _{(u_1, \ldots , u_t) \in \mathcal {U}(X)} V_{S_1}(u_1) \times \ldots \times V_{S_t}(u_t) \end{aligned}$$

where \(V_{S_i}(u) = \{v \in \mathbf {F}_2^m : S_i^v(x) \text{ contains } x^u\}\).

Proof

From Proposition 7, we know that

$$\begin{aligned} \mathcal {U}(S(X)) \subseteq \bigcup _{(u_1, \ldots , u_t) \in \mathcal {U}(X)} V_S(u). \end{aligned}$$

We then have to determine all \(v =(v_1, \ldots , v_t)\in (\mathbf {F}_2^m)^t\) such that \(S^v(x)\) contains \(u=(u_1, \ldots , u_t)\). We use that

$$\begin{aligned} S^v(x) = S_1^{v_1}(x_1) S_2^{v_2}(x_2) \ldots S_t^{v_t}(x_t). \end{aligned}$$

Since only \(S_i^{v_i}(x_i)\) may contain \(x_i^{u_i}\), we deduce that \(v \in V_S(u)\) if and only if \(v_i \in V_{S_i}(u_i)\) for each \(1 \le i \le n\). Therefore, \(V_S(u)\) is the Cartesian product of all \(V_{S_i}(u_i)\).    \(\square \)

5.3 Propagation Through One Round

We now consider an SPN where the round key is inserted by addition at the end of the round. This implies that each Sbox layer comes after a round-key addition. Thus, if \(\mathcal {U}(X)\) denotes the parity set of the input set \(X\) before the key addition, then the parity set after the key addition is included in a union of sets of the form \(\mathsf {Succ}(u)\), for some \(u \in \mathbf {F}_2^n\). It follows that the parity set after the Sbox layer satisfies

$$\begin{aligned} \mathcal {U}(S(X+k)) \subseteq \bigcup _{u \in \mathcal {U}(X)} \left( \bigcup _{v \in \mathsf {Succ}(u)} V_S(v)\right) . \end{aligned}$$

Therefore, propagating the information from \(\mathcal {U}(X)\) to \(\mathcal {U}(S(X+k))\) involves the sets

$$\begin{aligned} \mathcal {V}_S(u) = \bigcup _{v \in \mathsf {Succ}(u)} V_S(v) \end{aligned}$$

which depend on the Sbox only.

These sets \(\mathcal {V}_S(u)\) are then the relevant quantities involved in the propagation through the Sbox, instead of the sets \(V_S(u), u \in \mathbf {F}_2^n\). For instance, Table 2 provides all sets \(\mathcal {V}_S(u)\) for the Present Sbox.

Table 2. Sets \(\mathcal {V}_S(u)\) for all \(u \in \mathbf {F}_2^4\) for the Present Sbox. All \(4\)-bit words are represented in hexadecimal notation, and the rightmost bit of the word corresponds to the least significant bit.

The table representing all \(\mathcal {V}_S(u)\) has a few generic properties which hold for any bijective Sbox. The first obvious remark is that the all-zero vector does not belong to any \(\mathcal {V}_S(u)\) except when \(u = \underline{0}\). Indeed \(S^0\) is the all-one function and then does not contain any monomial except \(x^0\). The following property is much more interesting.

Proposition 9

Let \(S\) be any permutation of \(\mathbf {F}_2^n\). Then,

$$\begin{aligned} \mathcal {V}_S(\underline{1}) = \{\underline{1}\}, \end{aligned}$$

where \(\underline{1}\) denotes the all-one vector in \(\mathbf {F}_2^n\).

Proof

Since \(\mathsf {Succ}(\underline{1})= \underline{1}\), we have

$$\begin{aligned} \mathcal {V}_S(\underline{1}) = V_S(\underline{1}) = \{v \in \mathbf {F}_2^n : S^v(x) \text{ contains } x^{\underline{1}}\} \end{aligned}$$

or equivalently \(\mathcal {V}_S(\underline{1})\) is the set of all \(v\) such that \(x \mapsto S^v(x)\) has degree \(n\). It is known [6, Proposition 1] that \(\deg (S^v)=n\) if and only if \(v = \underline{1}\).    \(\square \)

Some further properties of Table 2, specific to the Present Sbox will be studied in Sect. 7.

6 Low-Data Distinguishers on a Few Rounds of Present

6.1 Distinguisher on 3 Rounds

In [25] Todo presents generic distinguishers for ciphers based on the SPN construction. In particular, it is shown in Table 4 of this same paper that 3 rounds of an SPN construction whose nonlinear layer is composed of 16 Sboxes over \(\mathbf {F}_2^4\) of degree 3 can be distinguished from a random permutation with data complexity \(2^{12}\). These results are therefore valid for Present. This distinguisher improves upon the distinguishers exploiting the algebraic degree since the best upper bound on the degree of three rounds of such ciphers is \(3^3 = 27\), leading to a distinguisher with data complexity \(2^{28}\).

Todo’s distinguisher can be easily explained in the following way (see also [14]). Suppose that the input space X is composed of vectors taking all possible \(2^4\) values on the first three nibbles and where the last 13 nibbles are fixed to a same constant value for all vectors. In this case \(|X| = 2^{12}\) and X can be seen now as a coset of V, i.e. \(X = a + V\), where

$$V = \mathsf {Prec}(\mathtt{0x0000000000000fff}).$$

After the application of the round-key addition and the Sbox layer to X, the output space Y satisfies the same integral property as X, i.e. \(Y = b + V\). Denote now by \(F = \mathcal {R}^2 \circ P\), where \(\mathcal {R}\) stands for the round function and where P denotes the linear layer. One can easily see that as F contains two non-linear layers, of degree \(3\) each, \(\deg (F) \le 9\). Therefore, as \(\dim Y = 12 > 9\), we have that

$$\bigoplus _{x\in a + V} E_K(X) = \bigoplus _{y\in b + V}F (Y) = \bigoplus _{y \in V} F(b \oplus y) = D_VF(b) = 0.$$

Equivalently, this generic distinguisher on 3 rounds uses the feature that none of the coordinates of \(E_K\) contains a multiple of \(x^u\) for \(u = \mathtt{0x0000000000000fff}\) (see Proposition 5). This distinguisher can therefore be very easily explained in terms of parity sets. Since \(X = a + V\), where \(V = \mathsf {Prec}(\mathtt{0x0000000000000fff})\), we have that \(\mathcal {U}(V) = \{\mathtt{0x0000000000000fff}\}\) (from Corollary 1), implying that

$$\mathcal {U}(X) \subseteq \mathsf {Succ}(\mathtt{0x0000000000000fff}).$$

Since the Sbox S is a permutation, for each Sbox \(\mathcal {V}_S(\mathtt{0xf}) = \{ \mathtt{0xf}\}\) (Proposition 9) meaning that after the first Sbox layer the parity set U of the resulting set is again included in \(\mathsf {Succ}(\mathtt{0x0000000000000fff})\). By defining the function F as before, we have that

$$ \mathcal {U}(E_K(X)) \subseteq \bigcup _{u \in U} V_F(u).$$

But, \(V_F (u) = \{v : F^v(x) \text { contains } x^u\}\) contains no vector v with \(wt(v) \le 1\) when \(wt(u) \ge 12\) since \(\deg (F) \le 9\). Therefore,

$$ \mathcal {U}(E_K(X)) \subseteq \{v : wt(v) \ge 2 \},$$

meaning that the output of the cipher restricted to 3 rounds has the balanced property (i.e., satisfies the division property of order \(2\)).

6.2 Distinguisher on 4 Rounds

As explained in [25] in the case of AES-like ciphers, such generic distinguishers can be improved by exploiting the structure of the linear layer. In the particular case of Present, the linear layer (see Fig. 1) is a bit permutation. Moreover, because of its structure, two rounds of Present (without the last permutation layer) can be seen as the concatenation of four independent Superboxes operating on \(\mathbf {F}_2^{16}\). With this structure, it is clear that any coordinate of the output at round \((r + 1)\) of the cipher only contains monomials involving inputs from the same Superbox at round \(r\).

Fig. 1.
figure 1

Round function of the block cipher Present

Therefore, by exploiting the linear layer of Present one can extend the previous distinguisher to one more round as follows. Suppose now that the input set X has the form \(X = a + V\), with

$$V = \mathsf {Prec}(\mathtt{0x000000000000fff0}).$$

In this case \(\mathcal {U}(X) \subseteq \mathsf {Succ}(\mathtt{0x000000000000fff0 }).\) The parity set remains unchanged after the application of the first nonlinear layer. By applying now the permutation layer we see that the parity set of the resulting set is included in

$$\mathsf {Succ}(\mathtt{0x000e000e000e000e}), $$

leading to four active Superboxes. As previously explained, after the application of the Superboxes, so after the non-linear layer of the third round, any output coordinate only contains monomials coming from the same Superbox. Therefore, the resulting parity set is included in

$$\{u : wt(u) \ge 4 \}.$$

By applying now the linear layer we have that the parity set is included in

$$\{u \text { with } \ge 2 \text { active nibbles}\} \cup \{\mathtt{0x00\dots 0f},\dots ,\mathtt{0x f0\dots 0}\}.$$

This parity set is invariant under the application of the fourth nonlinear layer, meaning that

$$\mathcal {U}(E_K(X)) \subseteq \{v : wt(v) \ge 2\}.$$

We see therefore that the output set has the balanced property after 4 rounds.

However, it can be shown, that by only exploiting the properties of the linear layer of Present, this distinguisher cannot always be extended to five rounds. The following table shows a possible propagation of values in the parity sets, where some output coordinate of the 12th Sbox after 5 rounds may contain the monomial \(x^u\), with \(u = \mathtt{0x0000000000000fff0}\). By looking at rows 2 and 3 of this table, we can see that this propagation can be realised, among others, if the Sbox makes the propagations \(\mathtt{0xe} \rightarrow \mathtt{0x2}\) and \(\mathtt{0xe} \rightarrow \mathtt{0x1}\) possible. All the elements of this table should be interpreted as hexadecimal values.

input

0

0

0

0

0

0

0

0

0

0

0

0

f

f

f

0

output S-layer 1st round

0

0

0

0

0

0

0

0

0

0

0

0

f

f

f

0

output P-layer 1st round

0

0

0

e

0

0

0

e

0

0

0

e

0

0

0

e

output S-layer 2nd round

0

0

0

2

0

0

0

1

0

0

0

1

0

0

0

1

output P-layer 2nd round

0

0

0

0

0

0

0

0

1

0

0

0

0

1

1

1

output S-layer 3rd round

0

0

0

0

0

0

0

0

1

0

0

0

0

1

1

1

output P-layer 3rd round

0

0

0

0

0

0

0

0

0

0

0

0

0

0

8

7

output S-layer 4th round

0

0

0

0

0

0

0

0

0

0

0

0

0

0

2

8

output P-layer 4th round

0

0

0

3

0

0

0

0

0

0

0

0

0

0

0

0

output S-layer 5th round

0

0

0

1

0

0

0

0

0

0

0

0

0

0

0

0

However, by looking closer at Table 1, one can see that these two propagations are not possible for the Present Sbox, as 0xe can only be propagated to values of Hamming weight of at least 2.

6.3 Distinguishers on 5 and 6 Rounds

The previous 4-round distinguisher on Present exploited the algebraic degree of the Sbox together with the structure of the linear layer. We show here, that by further taking into account the particular form of the Present Sbox, the previous distinguisher can be extended to one more round. For doing so, we consider the same input set X as for the 4-round distinguisher, i.e. \(X = a + V\), with \(V = \mathsf {Prec}(\mathtt{0x000000000000fff0})\). As previously shown (see also the second row of the table above), after the application of the first round transformations, the parity set is included in \(\mathsf {Succ}(\mathtt{0x000e000e000e000e})\). However, the value 0xe presents, in the case of Present, a particular interest. Indeed, notice that the row indexed by 0xe in Table 2 contains only a single element of weight 2 and no element of weight 1. This means that compared to other values u, \(\mathcal {V}_S(u)\) for \(u = \mathtt{0xe}\) contains exceptionally few elements, making more than half of the transitions impossible. In particular, the transitions, \(\mathtt{0xe} \rightarrow \mathtt{0x2}\) and \(\mathtt{0xe} \rightarrow \mathtt{0x1}\) in the above 5-round path are not possible.

We checked by computer programming that in this setting, there is no u of weight 1 in \(\mathcal {U}(E_k(X))\) if \(E_k\) is 5-round Present. Therefore, the output set after 5 rounds has the balanced property and can then be distinguished from a random permutation.

However, it is not possible to extend the distinguisher in its actual form to 6 rounds. Indeed, we checked that after 6 rounds, many elements of weight 1 can be found in \(\mathcal {U}(E_k(X))\). Nevertheless, it is still possible to exhibit a weaker distinguisher for 6 rounds of Present, by exploiting the fact that the column corresponding to the element 0x1 in Table 2 is very sparse, meaning that more than half of the transitions u \(\rightarrow \) 0x1 are not possible. We were able to check that in fact, among all the elements of weight 1 present in \(\mathcal {U}(E_k(X))\), only the nibble values 0x2, 0x4 and 0x8 were possible. Therefore, we have exhibited \(16\) values which do not belong to \(\mathcal {U}(E_k(X))\) after six rounds of Present, leading to a distinguisher with data complexity \(2^{12}\) and advantage \((1 - 2^{-16})\) (see Appendix for the evaluation of the advantage).

6.4 Distinguishers Using More Data

We provide in Table 3 a summary of all the distinguishers obtained for reduced-round versions of Present by using different sizes of the input space. These distinguishers were obtained by implementing the propagation of the parity set of the input subspace in a compact way. We mention here only strong distinguishers in the sense that we give only the results where the output set has the balanced property.

Table 3. Input sets leading to the division property of order \(2\) for reduced-round Present.

6.5 Changing the Sbox

We discuss in this section the strength of the above distinguishers in the case where the Sbox of Present is replaced by some other permutation of degree 3. For this, we consider exactly the same cipher but we change the nonlinear permutation. For instance, we consider the Sbox used in the block cipher Prince [3] and we study the propagation of the parity set of the resulting cipher. Table 4 provides the sets \(\mathcal {V}_S(u)\) for all \(u \in \mathbf {F}_2^4\) for this Sbox. However, it has to be noted that we do not propose in any case to replace the Sbox of Present for obtaining a more robust design in general. This change is done here only for being able to run the same experiments by keeping exactly the same other parameters. Applying for example the same attack to the Prince cipher and compare the results would not make any sense as the results could be different because of the different linear layers for example, making the impact of the Sbox unclear. This paragraph aims only at demonstrating that Present with some other Sbox can better resist against this type of distinguishers, but we do not argue that this would necessarily be the case for other type of attacks.

The following table is the equivalent of Table 2 for Present. As one can see, this table, is much less sparse than Table 2. In particular, even the rows the more sparse (rows corresponding to 0xb and 0xd), make at least half of the transitions possible. One can further notice that all rows and columns contain elements of weight 1 and 2. All the above indicate that whatever the input u of a particular Sbox, the set \(\mathcal {V}_S(u)\) contains a high number of values, making only very few transitions impossible.

Table 4. Sets \(\mathcal {V}_S(u)\) for all \(u \in \mathbf {F}_2^4\) for the Prince Sbox. All \(4\)-bit words are represented in hexadecimal notation.

We were able to verify, that if we start with the same input space as before, the output set after five rounds does not satisfy the division property of order \(2\), while this was the case for the Present Sbox. Indeed, Table 5 provides a path that is satisfied when this Sbox is used. As one can see, we end up with a vector of weight 1 in \(\mathcal {U}(E_k)\). This was not the case with the original Present Sbox, proving that the Prince Sbox is more resistant against this kind of property. A more detailed study of why this is so is provided in the following section.

Table 5. Example of a 5-round path that is satisfied when the Prince Sbox is plugged into the Present block cipher. All entries should be interpreted as hexadecimal values.

7 Related Security Criterion for the Sbox

As illustrated by the previous attack, one of the main ingredients of the distinguisher is the particular form of the sets \(\mathcal {V}_S(u)\) for the Present Sbox, i.e., the particular form of Table 2. Two properties of this table are exploited:

  1. 1.

    Column of index 0x1 is very sparse and in particular it does not contain any element of weight \(3\) and it contains a single element of weight \(2\). This property is exploited at the end of the attack on \(6\) rounds, where we use that 0x1 only belongs to a very few sets \(\mathcal {V}_S(u)\), i.e., only a few transitions \(u \rightarrow \mathtt{0x1}\) are possible.

  2. 2.

    Row of index \(\mathtt{0xe}\) is very sparse and in particular it does not contain any element of weight \(1\) and it contains a single element of weight \(2\). This property is exploited when we use that \(\mathcal {V}_S(\mathtt{0xe})\) contains a few elements only, implying that many transitions \(\mathtt{0xe} \rightarrow v\) are impossible.

We now show where these unsuitable properties of the Sbox come from, and how they can be avoided. The first property has an obvious algebraic interpretation since each column in the table is derived from the ANF of a Boolean function \(x \mapsto S^v(x)\). In particular, the columns having an index of weight \(1\) are derived from the ANF of the coordinates of the Sbox. The ANF of the Present Sbox is

$$\begin{aligned} S_1(x_1, x_2, x_3, x_4)= & {} x_1 + x_3 + x_4+ x_2x_3 \\ S_2(x_1, x_2, x_3, x_4)= & {} x_2 + x_4 + x_2x_4 + x_3x_4 + x_1x_2x_3+ x_1x_2x_4 + x_1x_3x_4 \\ S_3(x_1,x_2, x_3, x_4)= & {} 1 + x_3 + x_4 + x_1x_2 + x_1x_4 + x_2x_4 + x_1x_2x_4 + x_1x_3x_4 \\ S_4(x_1, x_2, x_3, x_4)= & {} 1 + x_1 + x_2 + x_4 + x_2x_3 + x_1x_2x_3 + x_1x_2x_4 + x_1x_3x_4. \end{aligned}$$

The first weakness of the Present Sbox then comes from the fact that its first coordinate has degree \(2\) only and contains a single quadratic term.

The second weakness is related to the fact that the monomial \(x^\mathtt{e}=x_2x_3x_4\) appears neither in the coordinates of \(S\), nor in most functions \(S_iS_j\). This second property can be deduced in a much simpler way from the ANF of the inverse Sbox, as shown by the following proposition.

Lemma 3

Let S be a permutation of \(\mathbf {F}_2^n\). Then, for any \(u,v \in \mathbf {F}_2^n\), the ANF of \(x \mapsto S(x)^v\) contains \(x^u\) if and only if the ANF of \(x \mapsto S^{*}(x)^{\overline{u}}\) contains \(x^{\overline{v}}\), where \(\overline{u}\) denotes the vector \(u \oplus \underline{1}\) and \(S^*\) is the permutation \(x \mapsto \overline{S^{-1}(\overline{x})}\).

Proof

Let \(a_{u}\) denote the coefficient of \(x^u\) in the ANF of \(x \mapsto S(x)^v\). Then

$$\begin{aligned} a_u= & {} \bigoplus _{x \preceq u} S(x)^v \\= & {} |\{x \in \mathbf {F}_2^n : x_i=0, i \in \mathsf {Supp}(\overline{u}) \text{ and } S(x)_j=1, j \in \mathsf {Supp}(v)\}| \bmod {2} \\= & {} |\{y \in \mathbf {F}_2^n : S^{-1}(y)_i=0, i \in \mathsf {Supp}(\overline{u}) \text{ and } y_j=1, j \in \mathsf {Supp}(v)\}| \bmod {2} \end{aligned}$$

where the last equality comes from the fact that \(S\) is a permutation, implying that there is a one-to-one correspondence between \(x\) and \(y=S(x)\). We now replace \(y\) by \(z=\overline{y}\) and use that \(S^{-1}(y) = \overline{S^*(\overline{y})}\). Then,

$$\begin{aligned} a_u= & {} |\{z \in \mathbf {F}_2^n : S^{-1}(\overline{z})_i=0, i \in \mathsf {Supp}(\overline{u}) \text{ and } z_j=0, j \in \mathsf {Supp}(v)\}| \bmod {2} \\= & {} |\{z \in \mathbf {F}_2^n : S^{*}(z)_i=1, i \in \mathsf {Supp}(\overline{u}) \text{ and } z_j=0, j \in \mathsf {Supp}(v)\}| \bmod {2} \\= & {} \bigoplus _{z \preceq \overline{v}} \left( S^{*}(z)\right) ^{\overline{u}} \end{aligned}$$

which means that \(a_u\) is the coefficient of \(x^{\overline{v}}\) in the ANF of \(x \mapsto S^{*}(x)^{\overline{u}}\).    \(\square \)

The function \(S^*\) corresponding to the Present Sbox has the following ANF:

$$\begin{aligned} S_1^*(x_1, x_2, x_3, x_4)= & {} 1 + x_1 + x_2 + x_3 + x_4 + x_2x_4\\ S_2^*(x_1, x_2, x_3, x_4)= & {} x_1 + x_4 + x_1x_3 + x_2x_3 + x_1x_2x_3 + x_1x_2x_4 + x_1x_3x_4\\ S_3^*(x_1, x_2, x_3, x_4)= & {} 1 + x_2 + x_4 + x_1x_2 + x_1x_3 + x_1x_4 + x_3x_4 + x_1x_2x_3 \\&\quad + x_1x_2x_4 + x_1x_3x_4\\ S_4^*(x_1, x_2, x_3, x_4)= & {} x_2 + x_3 + x_2x_3 + x_1x_4 + x_3x_4 + x_1x_2x_3 + x_1x_3x_4 \end{aligned}$$

These ANF correspond to the rows with an index of weight \(3\) in Table 2. In particular, the form of the row of index \(\mathtt{0xe}\) (defining \(\mathcal {V}_S(\mathtt{0xe})\)) comes from the fact that the first coordinate of \(S^*\) has degree \(2\) only, and that its ANF contains a single monomial of degree \(2\).

Conversely, we can easily guarantee that, when an \(n\)-bit bijective Sbox is considered, all sets \(\mathcal {V}_S(u)\) with \(wt(u)<n\) contain an element of weight \(1\).

Proposition 10

Let \(S\) be a permutation of \(\mathbf {F}_2^n\). Then, all coordinates of \(S^{-1}\) have degree \((n-1)\) if and only if all sets \(\mathcal {V}_S(u)\), with \(u \ne \underline{1}\), contain at least one element of weight \(1\).

Proof

Since the sets \(\mathcal {V}_S(u)\) include all sets \(\mathcal {V}_S(u')\) for \(u' \succeq u\), we only have to prove the result for the sets \(\mathcal {V}_S(u)\) with \(wt(u)=n-1\). When \(wt(u)=n-1\), we have

$$\begin{aligned} \mathcal {V}_S(u) = V_S(u) \cup V_S(\underline{1}) = V_S(u) \cup \{\underline{1}\} \end{aligned}$$

from Proposition 9. Therefore, all sets \(\mathcal {V}_S(u)\) with \(wt(u)<n\) contain an element of weight \(1\) if and only if all sets \(V_S(u)\) with \(wt(u)=n-1\) contain an element of weight \(1\). This equivalently means that all monomials of degree \((n-1)\) appear in the ANF of the coordinates of \(S\), or from Lemma 3 that all coordinates of \(S^*\) have degree \((n-1)\). But, this last condition equivalently means that all coordinates of \(S^{-1}\) have degree \((n-1)\). Indeed, the monomials of highest degree in the ANF of a Boolean function \(f\) and in the ANF of \(f^*: x \mapsto f(\overline{x})+1\) are the same. Then the \(i\)-th coordinates of \(S^*\) and \(S^{-1}\) have the same degree.    \(\square \)

It is worth noticing that the fact that all coordinates of \(S^{-1}\) have maximal degree is not equivalent to the fact that the same property holds for \(S\), as shown in the following example.

Example 3

We consider the permutation of \(\mathbf {F}_2^4\) corresponding to the inverse of the function \(G_{10}\) defined in the classification in [18, Table 6]. This Sbox has optimal cryptographic properties in the sense that it has the smallest possible differential uniformity and highest nonlinearity. Its coordinates are given by

$$\begin{aligned} S_1(x_1, x_2, x_3, x_4)= & {} x_1 + x_1x_2 + x_2x_3 + x_3x_4 + x_1x_3x_4 + x_2x_3x_4\\ S_2(x_1, x_2, x_3, x_4)= & {} x_2 + x_1x_3 + x_2x_4 + x_3x_4 + x_1x_2x_4 + x_1x_3x_4 + x_2x_3x_4\\ S_3(x_1,x_2, x_3, x_4)= & {} x_3 + x_1x_2 + x_1x_3 + x_1x_4 + x_2x_4 + x_1x_2x_4 + x_1x_3x_4\\ S_4(x_1, x_2, x_3, x_4)= & {} x_4 + x_1x_2 + x_1x_3 + x_1x_2x_4. \end{aligned}$$

It can then be checked that all coordinates of \(S\) have degree \(3\) while the monomial \(x_1x_2x_3\) appears in none of the coordinates, implying that \(\mathcal {V}_S(\mathtt{0x7})\) contains no element of weight \(1\). Indeed,

$$\begin{aligned} \mathcal {V}_S(\mathtt{0x7}) =\{\mathtt{0x9}, \mathtt{0xc}, \mathtt{0x7}, \mathtt{0xb}, \mathtt{0xe}, \mathtt{0xf}\}. \end{aligned}$$

Actually, \(S\) does not satisfy the hypotheses of Proposition 10 since the first coordinate of \(S^{-1}\) has degree \(2\) only.

It is also worth noticing that the condition on the degree of the coordinates of \(S^{-1}\) is not invariant under composition (to the right or left) by an affine permutation.

A simple way to guarantee that all coordinates of the inverse Sbox have maximal degree consists in choosing for \(S\) an Sbox such that any linear combination of its coordinates (i.e. any of its components) has maximal degree. In this case, we obtain the following stronger result on the number of elements of weight \(1\) in \(\mathcal {V}_S(u)\).

Corollary 3

Let \(S\) be a permutation of \(\mathbf {F}_2^n\) such that the Boolean functions \(x \mapsto \lambda \cdot S(x)\) have degree \((n-1)\) for all nonzero \(\lambda \in \mathbf {F}_2^n\). Then, for any \(u \in \mathbf {F}_2^n\), \(\mathcal {V}_S(u)\) contains at least \((n-wt(u))\) elements of weight \(1\).

Proof

We will first prove the result for all \(u\) with \(wt(u)=n-1\) by showing that all coordinates of \(S^{-1}\) have degree \((n-1)\). Let \(A\) denote the \(n \times n\) binary matrix such that \(a_{i,j}\) is the coefficient of the monomial of degree \((n-1)\) \(x^{\overline{e_j}}\) in the ANF of the \(i\)-th coordinate of \(S\) where \(e_j\) denotes the \(n\)-bit word having a single one at position \(j\). A component of \(S\), \(x \mapsto \lambda \cdot S(x)\), \(\lambda \ne 0\), has degree less than \((n-1)\) if and only if the corresponding linear combination of the rows of \(A\) vanishes, i.e., \(\lambda A = 0\). It follows that the number of non-trivial components of \(S\) with degree \((n-1)\) is equal to \(2^{\mathrm {rk}(A)}-1\). From Lemma 3, the coefficients of the monomials of degree \((n-1)\) in the coordinates of \(S^{-1}\) are defined by the transpose of \(A\). Then, the number of non-trivial components of \(S^{-1}\) with degree \((n-1)\) is equal to \(2^{\mathrm {rk}(A^T)}-1\). We deduce from Proposition 10 that, if all components of \(S\) have degree \((n-1)\) (i.e., if \(\mathrm {rk}(A)=\mathrm {rk}(A^T)=n\)), all \(\mathcal {V}_S(u)\) for \(u \ne \underline{1}\) contain at least one element of weight \(1\).

Let us now consider any \(u \in \mathbf {F}_2^n\). By definition, \(\mathcal {V}_S(u)\) contains all sets \(\mathcal {V}_S(\overline{e_i})\) with \(i \not \in \mathsf {Supp}(u)\) since the \(k=(n-wt(u))\) words \(\overline{e_i}\) of weight \((n-1)\) belong to \(\mathsf {Succ}(u)\). As Matrix \(A\) has full rank, the \(k\) columns of \(A\) corresponding to the monomials \(x^{\overline{e_i}}\) with \(i \not \in \mathsf {Supp}(u)\) have rank \(k\), implying that this \(k \times n\)-submatrix has at least \(k\) nonzero rows. These rows correspond to the words of weight \(1\) which belong to \(\mathcal {V}_S(u)\), implying that this set contains at least \(k\) vectors of weight \(1\).    \(\square \)

Example 4

The \(4\)-bit Sbox used in Prince [3] (as well as all Sboxes with similar properties recommended for any cipher within the Prince-family [4, Appendix B]) has been chosen in such a way that all its nontrivial components have degree \(3\). Then, we can guarantee that any set \(\mathcal {V}_S(u)\) contains at least \((4-wt(u))\) elements of weight \(1\). This can be checked on Table 4.

It is worth noticing that the condition exhibited in Corollary 3 offers a similar guarantee for the inverse Sbox. Indeed, making the decryption function also immune to this type of attacks may be relevant, even if mounting the attack on the decryption function is much more difficult in practice because it requires the knowledge of plaintext-ciphertext pairs corresponding to chosen ciphertexts.

Application to the Misty Sboxes. So far, the most important application of the division property is the cryptanalysis of the full Misty1 [24]. It is then relevant to study the Misty Sboxes in the light of the previous criteria. Misty1 is an unbalanced Feistel network. It then uses two different Sboxes, which are both linearly equivalent to a power permutation. More precisely, \(S_7\) is a permutation of degree \(3\) of \(\mathbf {F}_2^7\) and \(S_9\) is a permutation of degree \(2\) of \(\mathbf {F}_2^9\).

We have then computed all sets \(\mathcal {V}_S(u)\) for each of these Sboxes. Most notably, Tables 6 and 7 give the values of the minimal Hamming weight of an element in \(\mathcal {V}_S(u)\) depending on the Hamming weight of \(u\) for the \(7\)-bit Sbox \(S_7\) and for the \(9\)-bit Sbox \(S_9\) respectively. These tables then recover the results on the propagation of the division property described in [24, p. 420]. The fact that, for both Sboxes, many sets \(\mathcal {V}_{S}(u)\) have a large minimal weight show that the two Misty Sboxes are weak regarding the division property. An interesting new observation is that, for \(S_7\), for some vectors \(u\) of weight \(3\), for instance \(u=\mathtt{0x0b}\), \(\mathcal {V}_{S_7}(u)\) does not contain any vector of weight \(1\). This equivalently means that some monomials of degree \(3\) do not appear in any of the coordinates of \(S_7\). This property of \(S_7\), which is more precise that the propagation of the division property studied in [24], may then be exploited in an attack.

Table 6. Minimum Hamming weight of \(\mathcal {V}_{S_7}(u)\) depending on the Hamming weight of \(u\) for the \(7\)-bit Sbox in Misty1.
Table 7. Minimum Hamming weight of \(\mathcal {V}_{S_9}(u)\) depending on the Hamming weight of \(u\) for the \(9\)-bit Sbox in Misty1.

8 Conclusions

In some contexts, the notion of parity set provides a powerful tool for representing subsets. We have shown for instance that the division property has a simple formulation in terms of parity sets, which allows to easily deduce some properties of the sets satisfying the division property of a given order. Also, focusing on the parity set, and not only on the minimal weight of its elements, enables the attacker to capture some algebraic properties of the nonlinear functions used in the cipher, besides the algebraic degree. This general view also brings to light the properties of the Sbox which avoid this type of attacks. The counterpart is that computing the whole parity set after many rounds of a cipher is obviously more expensive than considering its minimal weight only, as this is done in the division property. However, a promising technique consists in combining both approaches where the first and last rounds are analysed with the whole parity set, while the propagation through the middle rounds only exploits the degree of the function. Another direction could be to use parity sets for identifying some sets of weak keys. We have focused on distinguishers which hold for all keys. But, the addition of an unknown key increases the size of the parity set, since all words greater than or equal to the words in the input parity set must be considered. A different approach then consists in considering only round keys of a particular form. Then, the resulting parity set after key addition may simplify a lot, and these particular round keys may then be easily detected.