1 Introduction

Side channel attacks are well recognized as a serious threat to the security of computer systems. Building a system that is resilient to side channel attacks is a challenge, particularly because there are many kinds of side channels (such as power, timing, and electromagnetic radiation) and attacks on them. In an effort to establish a principled solution to the problem, researchers have proposed formal definitions of resilience against side channel attacks, called leakage resilience [2, 11, 12, 13, 15, 16]. The benefit of such formal models of side-channel-attack resilience is that a program proved secure according to a model is guaranteed to be secure against all attacks that are permitted within the model.

Previous research has proposed various notions of leakage resilience. In this paper, we focus on the n-threshold-probing model proposed in the seminal work by Ishai et al. [16]. Informally, the model says that, given a program represented as a Boolean circuit, the adversary learns nothing about the secret by executing the program and observing the values of at most n nodes in the circuit (cf. Sect. 2 for the formal definition). The attractive features of the model include its relative simplicity and its relation to masking, a popular countermeasure technique used in security practice. More precisely, security under the n-threshold-probing model is equivalent to security under \(n^{th}\)-order masking [22], and the literature often uses the two terminologies interchangeably [2,3,4, 6, 9, 10]. Further, as recently shown by Duc et al. [8], security under this model also implies security under the noisy leakage model [21], in which the adversary obtains information from every node with a probabilistically distributed noise.

In a recent work, Eldib and Wang [9] have proposed a synthesis method that, given a program represented as a circuit, returns a functionally equivalent circuit that is leakage resilient according to the n-threshold-probing model, for the case \(n=1\) (i.e., the adversary observes only one node). The method is a constraint-based algorithm whereby the constraints expressing the necessary conditions are solved in a CEGAR (counterexample-guided abstraction refinement) style. In this work, we extend the synthesis to the general case where n can be greater than 1. Unfortunately, naively extending (the monolithic version of) their algorithm to the case \(n > 1\) results in a method whose complexity is double exponential in n, leading to an immediate roadblock. As we show empirically in Sect. 5, the cost is substantial, and the naive monolithic approach fails even for the case \(n = 2\) on reasonably simple examples.

Our solution to the problem is to exploit a certain compositionality property admitted by the leakage resilience model. We state and prove the property formally in Theorems 2 and 3. Roughly, the compositionality theorems say that composing n-leakage-resilient circuits results in an n-leakage-resilient circuit, under the condition that the randoms in the components are disjoint. The composition property is quite general and is particularly convenient for synthesis. It allows a compositional synthesis method which divides the given circuit into smaller sub-circuits, synthesizes n-leakage-resilient versions of them, and combines the results to obtain an n-leakage-resilient version of the whole. The correctness is ensured by using disjoint randoms in the synthesized sub-circuits. Our approach is an interesting contrast to the approach that aims to achieve compositionality without requiring the disjointness of the component’s randoms, but instead at the cost of additional randoms at the site of the composition [3, 6].

We remark that the compositionality is not at all obvious and is quite unexpected. Indeed, at first glance, n-leakage resilience of each individual component seems to say nothing about the security of the composed circuit against an adversary who can observe the nodes of multiple different components in the composition. To further substantiate the non-triviality, we remark that the compositionality property is quite sensitive: for example, it fails to hold if the bounds are relaxed even slightly, so that the adversary makes at most n observations within each individual component but the total number of observations is allowed to be just one more than n (cf. Example 2).

To synthesize n-leakage-resilient sub-circuits, we extend the monolithic algorithm from [9] to the case where n can be greater than 1. We make several improvements to the baseline algorithm so that it scales better for the case \(n > 1\) (cf. Sect. 4.1). We have implemented a prototype of our compositional synthesis algorithm, and experimented with the implementation on benchmarks taken from the literature. We summarize our contributions below.

  • A proof that the n-threshold-probing model of leakage resilience is preserved under certain circuit compositions (Sect. 3).

  • A compositional synthesis algorithm for the leakage-resilience model that utilizes the compositionality property (Sect. 4).

  • Experiments with a prototype implementation of the synthesis algorithm (Sect. 5).

The rest of the paper is organized as follows. Section 2 introduces preliminary definitions and notations, including the formal definition of the n-threshold-probing model of leakage resilience. Section 3 states and proves the compositionality property. Section 4 describes the compositional synthesis algorithm. We report on the experience with a prototype implementation of the algorithm in Sect. 5, and discuss related work in Sect. 6. We conclude the paper in Sect. 7. The extended report [5] contains the omitted proofs.

2 Preliminaries

We use boldface font for finite sequences. For example, \(\varvec{b} = b_1,b_2,\dots ,b_n\). We adopt the standard convention of the literature [3, 4, 9, 10] and assume that a program is represented as an acyclic Boolean circuit. We assume the usual Boolean operators, such as XOR gates \(\oplus \), AND gates \(\wedge \), OR gates \(\vee \), and NOT gates \(\lnot \).

A program has three kinds of inputs, secret inputs (often called keys) ranged over by \(k\), public inputs ranged over by \(p\), and random inputs ranged over by \(r\). Informally, secret inputs contain the secret bits to be protected from the adversary, public inputs are those that are visible and possibly given by the adversary, and random inputs contain bits that are generated uniformly at random (hence, it may be more intuitive to view randoms as not actual “inputs”).

Consider a program P with secret inputs \(k_1,\dots ,k_x\). In the n-threshold-probing model of leakage resilience, we prepare n+1-split shares of each \(k_i\) (for \(i \in \{{1,\dots ,x}\}\)):

$$ r_{i,1}, \ \dots , \ r_{i,n}, \ k_i \oplus (\bigoplus _{j=1}^n r_{i,j}) $$

where each \(r_{i,j}\) is fresh. Note that the split shares sum to \(k_i\), and (assuming that the \(r_{i,j}\)'s are uniformly and independently distributed) observing up to n many shares reveals no information about \(k_i\). Adopting the standard terminology of the literature [16], we call the circuit that outputs such split shares the input encoder of P. The leakage resilience model also requires an output decoder, which sums the split shares at the output. More precisely, suppose P has y many \(n+1\)-split outputs \(\varvec{o_1},\dots ,\varvec{o_y}\) (i.e., \(|\varvec{o_i}| = n+1\) for each \(i \in \{{1,\dots ,y}\}\)). Then, the output decoder for P is, for each \(\varvec{o_i} = o_{i,1},\dots ,o_{i,n+1}\), the circuit \(\bigoplus _{j=1}^{n+1} o_{i,j}\). For example, Fig. 1 shows a \(2+1\)-split circuit with the secret inputs \(k_1,k_2\), public inputs \(p_1,p_2\), random inputs \(r_1,r_2,r_3,r_4\), and two outputs. Note that the input encoder (the region Input Encoder) introduces the randoms, and the output decoder (the region Output Decoder) sums the output split shares.
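The split-share construction can be checked exhaustively for small n. The following Python sketch (our own illustration, not code from the paper) builds the shares \(r_{1},\dots ,r_{n},\ k \oplus (\bigoplus _j r_j)\) of a single secret bit and confirms that any n of the \(n+1\) shares are distributed independently of the secret, while all \(n+1\) together determine it:

```python
import itertools
from collections import Counter
from functools import reduce

def split_shares(k, rs):
    """The n+1 shares of a secret bit k from n random bits rs:
    r_1, ..., r_n, k ^ r_1 ^ ... ^ r_n (the shares XOR back to k)."""
    return list(rs) + [k ^ reduce(lambda a, b: a ^ b, rs, 0)]

def observed_dist(k, n, positions):
    """Distribution of the shares at `positions` over all 2^n random draws."""
    return Counter(
        tuple(split_shares(k, rs)[i] for i in positions)
        for rs in itertools.product([0, 1], repeat=n)
    )

n = 2
# Any n of the n+1 shares reveal nothing: same distribution for k=0 and k=1.
for positions in itertools.combinations(range(n + 1), n):
    assert observed_dist(0, n, positions) == observed_dist(1, n, positions)
# Observing all n+1 shares, however, determines k: the distributions differ.
assert observed_dist(0, n, (0, 1, 2)) != observed_dist(1, n, (0, 1, 2))
```

The check over all size-n subsets is exactly the informal claim above: up to n shares look uniform, so only the full tuple carries the secret.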

Fig. 1. Example of a 2-leakage-resilient circuit

We associate a unique label (ranged over by \(\alpha \)) to every gate of the circuit. We define the nodes of P, \({ nodes}({P})\), to be the set of labels in P excluding the gates internal to the input encoder and the output decoder part of P (but including the outputs of the input encoder and the inputs to the output decoder). Intuitively, \({ nodes}({P})\) are the nodes that can be observed by the adversary. For example, in Fig. 1, the observable nodes are the ones in the region Observable Nodes, labeled \(\alpha _1,\dots ,\alpha _{15}\).

Let \(\nu \) be a mapping from the inputs of P to \(\{{0,1}\}\). Let \(\varvec{\alpha } = \alpha _1,\dots ,\alpha _n\) be a vector of nodes in \({ nodes}({P})\). We define the evaluation, \({\nu }_{P}({\varvec{\alpha }})\), to be the vector \(b_1,\dots ,b_n \in \{{0,1}\}^n\), such that each \(b_i\) is the valuation of the node \(\alpha _i\) when evaluating P under \(\nu \). For example, let \(P'\) be the circuit from Fig. 1. Let \(\nu \) map \(p_1, r_1,k_2\) to 0, and the others to 1. Then, \({\nu }_{P'}({\alpha _{11},\alpha _1}) = 0,1\).

Let us write \(\nu [\varvec{v} \mapsto \varvec{b}]\) for the store \(\nu \) except that each \(v_i\) is mapped to \(b_i\) (for \(i \in \{{1,\dots m}\}\)) where \(\varvec{v} = v_1,\dots ,v_m\) and \(\varvec{b} = b_1,\dots ,b_m\). Let P be a circuit with secret inputs \(\varvec{k}\), public inputs \(\varvec{p}\), and random inputs \(\varvec{r}\). For \(\varvec{b_p} \in \{{0,1}\}^{|p|}\), \(\varvec{b_k} \in \{{0,1}\}^{|k|}\), \(\varvec{\alpha } \in { nodes}({P})^*\), and \(\varvec{b_\alpha } \in \{{0,1}\}^{|\varvec{\alpha }|}\), let \( \#_{P}({\varvec{b_p}},{\varvec{b_k}},{\varvec{\alpha }},{\varvec{b_\alpha }}) = |\{{\varvec{b} \in \{{0,1}\}^{|\varvec{r}|} \mid {\nu [\varvec{r}\mapsto \varvec{b}]}_{P}({\varvec{\alpha }}) = \varvec{b_\alpha }}\}| \) where \(\nu = \{{\varvec{p}\mapsto \varvec{b_p},\varvec{k}\mapsto \varvec{b_k}}\}\). We define \(\mu _{P}({\varvec{b_p}},{\varvec{b_k}},{\varvec{\alpha }})\) to be the finite map from each \(\varvec{b_\alpha }\in \{{0,1}\}^{|\varvec{\alpha }|}\) to \(\#_{P}({\varvec{b_p}},{\varvec{b_k}},{\varvec{\alpha }},{\varvec{b_\alpha }})\). We remark that \(\mu _{P}({\varvec{b_p}},{\varvec{b_k}},{\varvec{\alpha }})\), when normalized by the scaling factor \(2^{-|\varvec{r}|}\), is the joint distribution of the values of the nodes \(\varvec{\alpha }\) under the public inputs \(\varvec{b_p}\) and the secret inputs \(\varvec{b_k}\).
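The counting definition of \(\mu _{P}\) can be computed directly by enumerating the random inputs. Below is a small Python sketch (our own illustration; the circuit `toy` and its node names are hypothetical, not the paper's Fig. 1):

```python
import itertools
from collections import Counter

def mu(circuit, b_p, b_k, alphas, num_r):
    """mu_P(b_p, b_k, alphas): maps each observed value vector b_alpha to
    #_P(b_p, b_k, alphas, b_alpha), i.e. the number of random draws under
    which the nodes alphas evaluate to b_alpha."""
    counts = Counter()
    for b_r in itertools.product([0, 1], repeat=num_r):
        nodes = circuit(b_p, b_k, b_r)  # valuation of every observable node
        counts[tuple(nodes[a] for a in alphas)] += 1
    return counts

# A toy 1+1-split masking of the output k ^ p, with shares r and k ^ r:
def toy(b_p, b_k, b_r):
    p, k, r = b_p[0], b_k[0], b_r[0]
    return {"a1": r, "a2": k ^ r, "a3": r ^ p}

# The share a2 = k ^ r takes each value once over the two random draws,
# so its distribution is independent of the secret:
assert mu(toy, (1,), (0,), ["a2"], 1) == mu(toy, (1,), (1,), ["a2"], 1)
```

Normalizing the returned counts by \(2^{-|\varvec{r}|}\) gives the joint distribution mentioned in the remark.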

Roughly, the n-threshold-probing model of leakage resilience says that, for any selection of n nodes, the joint distribution of the nodes’ values is independent of the secret. Formally, the leakage-resilience model is defined as follows.

Definition 1

(Leakage Resilience). Let P be an \(n+1\)-split circuit with secret inputs \(\varvec{k}\), public inputs \(\varvec{p}\), and random inputs \(\varvec{r}\). Then, P is said to be leakage-resilient under the n-threshold-probing model (or, simply, n-leakage-resilient) if for any \(\varvec{b_p} \in \{{0,1}\}^{|\varvec{p}|}\), \(\varvec{b_k} \in \{{0,1}\}^{|\varvec{k}|}\), \(\varvec{b_k}' \in \{{0,1}\}^{|\varvec{k}|}\), and \(\varvec{\alpha } \in { nodes}({P})^n\), we have \(\mu _{P}({\varvec{b_p}},{\varvec{b_k}},{\varvec{\alpha }}) = \mu _{P}({\varvec{b_p}},{\varvec{b_k}'},{\varvec{\alpha }})\).

We remark that, above, \(\varvec{r}\) includes all randoms introduced by the input encoder as well as any additional ones that are not from the input encoder, if any. For instance, in the case of the circuit from Fig. 1, the randoms are \(r_1,r_2,r_3,r_4\) and they are all from the input encoder.
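Definition 1 can be checked by brute force on small circuits: enumerate every public input, every secret, and every selection of n observable nodes, and compare the resulting distributions. A Python sketch follows (our own illustration; the toy circuit and its node names are hypothetical, not the paper's Fig. 1):

```python
import itertools
from collections import Counter

def is_n_leakage_resilient(circuit, n, num_p, num_k, num_r, node_names):
    """Brute-force check of Definition 1.  (The definition ranges over all
    vectors in nodes(P)^n; repeated or reordered nodes add no information,
    so checking all size-n subsets suffices.)"""
    def mu(b_p, b_k, alphas):
        return Counter(
            tuple(circuit(b_p, b_k, b_r)[a] for a in alphas)
            for b_r in itertools.product([0, 1], repeat=num_r))
    for b_p in itertools.product([0, 1], repeat=num_p):
        for alphas in itertools.combinations(node_names, n):
            dists = [mu(b_p, b_k, alphas)
                     for b_k in itertools.product([0, 1], repeat=num_k)]
            if any(d != dists[0] for d in dists):
                return False
    return True

# A 1+1-split masking of k ^ p with shares s1 = r, s2 = k ^ r and split
# outputs o1 = r ^ p, o2 = k ^ r (hypothetical wiring):
def toy(b_p, b_k, b_r):
    p, k, r = b_p[0], b_k[0], b_r[0]
    return {"s1": r, "s2": k ^ r, "o1": r ^ p, "o2": k ^ r}

assert is_n_leakage_resilient(toy, 1, 1, 1, 1, ["s1", "s2", "o1", "o2"])

# Exposing the raw secret as an observable node breaks 1-leakage resilience:
def leaky(b_p, b_k, b_r):
    d = toy(b_p, b_k, b_r); d["k"] = b_k[0]; return d

assert not is_n_leakage_resilient(leaky, 1, 1, 1, 1, ["s1", "s2", "o1", "o2", "k"])
```

The exponential enumeration over randoms and node selections is of course only feasible for tiny circuits, which is precisely why the paper resorts to constraint solving in Sect. 4.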

Fig. 2. A circuit computing \((p_1\oplus k_1\oplus k_2, k_2\wedge p_2)\)

Informally, the n-threshold-probing model of leakage resilience says that the attacker learns nothing about the secret by executing the circuit and observing the values of up to n many internal gates and wires, excluding those that are internal to the input encoder and the output decoder.

We say that a circuit is random-free if it has no randoms. Let P be a random-free circuit with public inputs \(\varvec{p}\) and secret inputs \(\varvec{k}\), and \(P'\) be a circuit with public inputs \(\varvec{p}\), secret inputs \(\varvec{k}\), and randoms \(\varvec{r}\). We say that \(P'\) is IO-equivalent to P if for any \(\varvec{b_k} \in \{{0,1}\}^{|\varvec{k}|}\), \(\varvec{b_p} \in \{{0,1}\}^{|\varvec{p}|}\), and \(\varvec{b_r} \in \{{0,1}\}^{|\varvec{r}|}\), the output of P when evaluated under \(\nu = \{{\varvec{p} \mapsto \varvec{b_p},\varvec{k} \mapsto \varvec{b_k}}\}\) is equivalent to that of \(P'\) when evaluated under \(\nu [\varvec{r} \mapsto \varvec{b_r}]\). We formalize the synthesis problem.

Definition 2

(Synthesis Problem). Given \(n > 0\) and a random-free circuit P as input, the synthesis problem is the problem of building a circuit \(P'\) such that 1.) \(P'\) is IO-equivalent to P, and 2.) \(P'\) is n-leakage-resilient.
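Requirement 1 of the synthesis problem, IO-equivalence, can also be checked by enumeration on small circuits. The sketch below is our own illustration; the random-free P and its masked version are hypothetical examples:

```python
import itertools

def io_equivalent(P, P_masked, num_p, num_k, num_r):
    """P' is IO-equivalent to P if for every choice of public, secret, and
    random inputs, the (decoded) output of P' equals the output of P."""
    for b_p in itertools.product([0, 1], repeat=num_p):
        for b_k in itertools.product([0, 1], repeat=num_k):
            want = P(b_p, b_k)
            for b_r in itertools.product([0, 1], repeat=num_r):
                if P_masked(b_p, b_k, b_r) != want:
                    return False
    return True

# Random-free P computing k ^ p, and a 1+1-split version whose decoder
# XORs the split outputs (r ^ p) and (k ^ r):
P = lambda b_p, b_k: b_p[0] ^ b_k[0]
P_masked = lambda b_p, b_k, b_r: (b_r[0] ^ b_p[0]) ^ (b_k[0] ^ b_r[0])
assert io_equivalent(P, P_masked, 1, 1, 1)
```

Note the universal quantification over the randoms: the masked circuit must compute the same output for every random draw, not merely on average.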

An important result by Ishai et al. [16] is that any random-free circuit can be converted to an IO-equivalent leakage-resilient form.

Theorem 1

([16]). For any random-free circuit P, there exists an n-leakage-resilient circuit that is IO-equivalent to P.

While the result is of theoretical importance, the construction is more of a proof of concept in which every gate is transformed uniformly, and the obtained circuits can be quite unoptimized (e.g., injecting excess randoms to mask computations that do not depend on secrets). Subsequent research has proposed constructing more optimized leakage-resilient circuits manually [6, 22], or by automatic synthesis [9]. The latter is the direction of the present paper.

Example 1

Consider the random-free circuit P shown in Fig. 2, which outputs \((p_1\oplus k_1\oplus k_2,k_2\wedge p_2)\). Let \(P'\) be the circuit from Fig. 1. It is easy to see that \(P'\) is IO-equivalent to P. Also, it can be shown that \(P'\) is 2-leakage-resilient. Therefore, \(P'\) is a 2-leakage-resilient version of P.

Remark 1

The use of the input encoder and the output decoder is unavoidable. It is easy to see that the input encoder is needed. Indeed, without it, one cannot even defend against a one-node-observing attacker, as she may directly observe the secret. To see that the output decoder is also required, consider a one-output circuit without the output decoder, and let n be the fan-in of the last gate before the output. Then, assuming that the output depends on the secret, the circuit cannot defend itself against an n-node-observing attacker, as she may observe the inputs to the last gate.

Remark 2

In contrast to the previous works [3, 6] that implicitly assume that each secret is encoded (i.e., split into \(n+1\) shares) by only one input encoder, we allow a secret to be encoded by multiple input encoders. The relaxation is important in our setting because, as remarked before, the compositionality results require disjointness of the randoms in the composed components.

Split and Non-split Inputs/Outputs. We introduce terminologies that are convenient when describing the compositionality results in Sect. 3. We use the term split inputs to refer to the \(n+1\)-tuples of wires to which the \(n+1\)-split (secret) inputs produced by the input encoder (i.e., the pair of triples \(r_1,r_2,k_1\oplus r_1\oplus r_2\) and \(r_3,r_4,k_2\oplus r_3\oplus r_4\) in the example of Fig. 1) are passed, and use the term non-split inputs to refer to the wires to which the original inputs before the split (i.e., \(k_1\) and \(k_2\) in Fig. 1) are passed. We define split outputs and non-split outputs analogously. Roughly, the split inputs and outputs are the inputs and outputs of the attacker-observable part of the circuit (i.e., the region Observable Nodes in Fig. 1), whereas the non-split inputs and outputs are those of the whole circuit with the input encoder and the output decoder.

Fig. 3. Parallel composition of \(P_1\) and \(P_2\).

3 Compositionality of Leakage Resilience

This section shows the compositionality property of the n-threshold-probing model of leakage resilience. We state and prove two main results (for space reasons, the proofs are deferred to the extended report [5]).

The first result concerns parallel compositions. It shows that given two n-leakage-resilient circuits \(P_1\) and \(P_2\) that possibly share inputs, the composed circuit that runs \(P_1\) and \(P_2\) in parallel is also n-leakage resilient, assuming that the randoms in the two components are disjoint. Figure 3 shows the diagram depicting the composition. The second result concerns sequential compositions, and it is significantly harder to prove than the first one. The sequential composition takes an n-leakage-resilient circuit \(P_2\) having y many (non-split) inputs, and n-leakage-resilient circuits \(P_{11},\dots ,P_{1y}\) each having one (non-split) output. The composition is done by connecting each split output of the output-decoder-free part of \(P_{1i}\) to the ith split input of the input-encoder-free part of \(P_2\). Clearly, the composed circuit is IO-equivalent to the one formed by connecting each non-split output of \(P_{1i}\) to the ith non-split input of \(P_2\). The sequential compositionality result states that the composed circuit is also n-leakage resilient, under the assumption that the randoms in the composed components are disjoint. Figure 4 shows the diagram of the composition. We state and prove the parallel compositionality result formally in Sect. 3.1, and the sequential compositionality result in Sect. 3.2.

We remark that, in the sequential composition, if a (non-split) secret input, say \(k\), is shared by some \(P_{1i}\) and \(P_{1j}\) for \(i \ne j\), then the disjoint randomness condition requires \(k\) to be encoded by two independent input encoders. This is in contrast to the previous works [3, 6] that use only one input encoder per secret input. On the other hand, such works require additional randoms at the site of the composition, whereas no additional randoms are needed at the composition site in our case, as it directly connects the split outputs of the \(P_{1i}\)'s to the split inputs of \(P_2\).

Fig. 4. Sequential composition of \(P_{11}\), \(P_{12}\), and \(P_2\). Here, \({P_{11}}'\) (resp. \({P_{12}}'\)) is the output-decoder-free part of \(P_{11}\) (resp. \(P_{12}\)), and \({P_2}'\) is the input-encoder-free part of \(P_2\). The composition connects the split outputs of \({P_{11}}'\) and \({P_{12}}'\) to the split inputs of \({P_2}'\).

3.1 Parallel Composition

This subsection proves the parallel compositionality result. Let us write \({P_1}||{P_2}\) for the parallel composition of \(P_1\) and \(P_2\). We state and prove the result.

Theorem 2

Let \(P_1\) and \(P_2\) be n-leakage-resilient circuits having disjoint randoms. Then, \({P_1}||{P_2}\) is also n-leakage-resilient.

Remark 3

While Theorem 2 only states that \({P_1}||{P_2}\) can withstand an attack that observes up to n nodes total from the composed circuit, a stronger property can actually be derived from the proof of the theorem. That is, the proof shows that \({P_1}||{P_2}\) can withstand an attack that observes up to n nodes from the \(P_1\) part and up to n nodes from the \(P_2\) part. (However, it is not secure against an attack that picks more than n nodes in an arbitrary way: for example, picking \(n+1\) nodes from one side.)
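Theorem 2 and the stronger property of Remark 3 can be observed concretely on a tiny instance with \(n=1\). The sketch below is our own illustration: the two components are hypothetical 1-leakage-resilient maskings that share the secret k but use disjoint randoms. Every single node of the composition, and even one node from each side, is secret-independent, while two nodes from the same side can leak:

```python
import itertools
from collections import Counter

def dist(circ, num_r, picks, p, k):
    """Joint distribution of the picked nodes over all random draws."""
    return Counter(tuple(circ(p, k, r)[a] for a in picks)
                   for r in itertools.product([0, 1], repeat=num_r))

# Two 1-leakage-resilient components sharing the secret k but using
# disjoint randoms (r[0] only in P1, r[1] only in P2):
P1 = lambda p, k, r: {"x1": r[0], "x2": k ^ r[0]}
P2 = lambda p, k, r: {"y1": r[1], "y2": k ^ r[1] ^ p}
par = lambda p, k, r: {**P1(p, k, r), **P2(p, k, r)}

for p in [0, 1]:
    # Theorem 2 (n = 1): every single node is secret-independent.
    for a in ["x1", "x2", "y1", "y2"]:
        assert dist(par, 2, [a], p, 0) == dist(par, 2, [a], p, 1)
    # Remark 3: one node from each component is still fine ...
    assert dist(par, 2, ["x1", "y2"], p, 0) == dist(par, 2, ["x1", "y2"], p, 1)
# ... but two nodes from the same component can leak: x1 ^ x2 = k.
assert dist(par, 2, ["x1", "x2"], 0, 0) != dist(par, 2, ["x1", "x2"], 0, 1)
```

The last assertion illustrates the caveat at the end of Remark 3: more than n nodes picked from one side are not covered by the theorem.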

3.2 Sequential Composition

This subsection proves the sequential compositionality result. As remarked above, the result is significantly harder to prove than the parallel compositionality result. Let us write \(({P_{11},\dots ,P_{1y}})\mathbin {\rhd }{P_2}\) for the sequential composition of \(P_{11},\dots ,P_{1y}\) with a y-input circuit \(P_2\). We state and prove the sequential compositionality result.

Theorem 3

Let \(P_{11},\dots ,P_{1y}\) be n-leakage-resilient circuits, and \(P_2\) be a y-input n-leakage-resilient circuit, having disjoint randoms. Then, \(({P_{11},\dots ,P_{1y}})\mathbin {\rhd }{P_2}\) is n-leakage-resilient.

Fig. 5. Output-sharing sequential composition of \(P_1\), \(P_{21}\), and \(P_{22}\). Here, \({P_1}'\) is the output-decoder-free part of \(P_1\), and \({P_{21}}'\) (resp. \({P_{22}}'\)) is the input-encoder-free part of \(P_{21}\) (resp. \(P_{22}\)). The composition connects the split output of \({P_1}'\) to the split inputs of \({P_{21}}'\) and \({P_{22}}'\).

Example 2

As remarked in Sect. 3, the parallel compositionality result enjoys the additional property that the circuit is secure even under an attack that observes more than n nodes in the composition, as long as the observation in each component is at most n. We show that this property does not hold in the case of sequential composition. Indeed, it can be shown that allowing just \(n+1\) observations breaks the security, even if the number of observations made within each component is at most n.

To see this, consider the \(n+1\)-split circuit shown in Fig. 6. The circuit implements the identity function, and it is easy to see that the circuit is n-leakage-resilient. Let \(P_1\) and \(P_2\) be copies of the circuit, and consider the composition \(({P_1})\mathbin {\rhd }{P_2}\). Then, the composed circuit is not secure against an attack that observes m nodes of \(P_1\) for some \(1 \le m \le n\), and observes \(n+1 - m\) nodes of \(P_2\), chosen so that the nodes picked on the \(P_2\) side are those connected to the nodes that are not picked on the \(P_1\) side.
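The attack is easy to replay concretely. The sketch below is our own simplified model of the situation (the composed identity circuits simply pass the \(n+1\) shares through, so each share is visible on both sides): n observations within one component are harmless, yet \(n+1\) observations covering all shares, with at most n per side, leak the secret:

```python
import itertools
from collections import Counter
from functools import reduce

n = 2
def nodes(k, rs):
    """Observable nodes of the composed circuit (P1) |> (P2): the n+1 split
    shares of k appear on the P1 side and, via the connecting wires, again
    on the P2 side (a simplified pass-through model of Fig. 6)."""
    shares = list(rs) + [k ^ reduce(lambda a, b: a ^ b, rs, 0)]
    return {(side, i): s for side in ("P1", "P2") for i, s in enumerate(shares)}

def dist(k, picks):
    return Counter(tuple(nodes(k, rs)[a] for a in picks)
                   for rs in itertools.product([0, 1], repeat=n))

# n observations within a single component: secure.
assert dist(0, [("P1", 0), ("P1", 1)]) == dist(1, [("P1", 0), ("P1", 1)])
# n+1 observations total (m = 2 on P1, n+1-m = 1 on P2) that together cover
# all n+1 shares: their XOR equals k, so the distributions differ.
picks = [("P1", 0), ("P1", 1), ("P2", 2)]
assert dist(0, picks) != dist(1, picks)
```

The failing pick is exactly the attack described in the example: the node taken on the \(P_2\) side is the one connected to the share not picked on the \(P_1\) side.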

Fig. 6. An n-leakage-resilient identity circuit.

Remark 4

By a reasoning similar to the one used in the proof of Theorem 3, we can show the correctness of a more parsimonious version of the parallel composition theorem (Theorem 2): given \(P_1\) and \(P_2\) that share a secret, instead of \({P_1}||{P_2}\) duplicating the split shares of the secret as in Fig. 3, we use a single tuple of split shares on both sides of the composition. Combining this improved parallel composition with the sequential compositionality result, we obtain compositionality for the case where an output of a circuit is shared by more than one circuit in a sequential composition.

Figure 5 depicts such an output-sharing sequential composition. Here, \(P_1\), \(P_{21}\), and \(P_{22}\) are n-leakage-resilient circuits, and we wish to compose them by connecting the output of \(P_1\) to the input of \(P_{21}\) and the input of \(P_{22}\). By the parallel compositionality result, the parallel composition of \(P_{21}\) and \(P_{22}\) sharing the same input (\(v\)) is n-leakage-resilient. It then follows from the sequential compositionality result that sequentially composing \(P_1\) with this parallel composition, as depicted in the figure, is also n-leakage-resilient.

4 Compositional Synthesis Algorithm

The compositionality property enables a compositional approach to synthesizing leakage-resilient circuits. Algorithm 1 shows the overview of the synthesis algorithm. Given a random-free circuit as input, the algorithm synthesizes an IO-equivalent n-leakage-resilient circuit. It first invokes the \(\textsc {Decomp}\) operation to choose a suitable decomposition of the given circuit into some number of sub-circuits. Then, it invokes \(\textsc {MonoSynth}\) on each sub-circuit \(P_i\) to synthesize an n-leakage-resilient circuit \({P_i}'\) that is IO-equivalent to \(P_i\). Finally, it returns the composition of the obtained circuits as the synthesized n-leakage-resilient version of the original.


\(\textsc {Comp}\) is the composition operation, and it composes the given n-leakage-resilient circuits in the manner described in Sect. 3. \(\textsc {MonoSynth}\) is a constraint-based “monolithic” synthesis algorithm that synthesizes an n-leakage-resilient circuit that is IO-equivalent to the given circuit without further decomposition. We describe \(\textsc {MonoSynth}\) in Sect. 4.1, and describe the decomposition operation \(\textsc {Decomp}\) in Sect. 4.2.

The algorithm optimizes the synthesized circuits in the following ways. First, as described in Sect. 4.1, the monolithic synthesis looks for tree-shaped circuits of the shortest tree height. Secondly, as described in Sect. 4.2, the decomposition and composition are done in a way that avoids unnecessarily making the non-secret-dependent parts leakage resilient, and that re-uses the synthesis results for shared sub-circuits whenever the compositionality properties allow.

Remark 5

The compositional algorithm composes the n-leakage-resilient versions of the sub-circuits. Note that the compositionality property states that the result will be n-leakage-resilient after the composition, regardless of how the sub-circuits are synthesized, as long as they are n-leakage-resilient and have disjoint randoms. Thus, in principle, any method for synthesizing the n-leakage-resilient versions of the sub-circuits may be used in place of \(\textsc {MonoSynth}\). For instance, a possible alternative is to use a database of n-leakage-resilient versions of commonly-used circuits (e.g., obtained via the construction of [6, 16]).

4.1 Constraint-Based Monolithic Synthesis

The monolithic synthesis algorithm is based on and extends the constraint-based approach proposed by Eldib and Wang [9]. The algorithm synthesizes an n-leakage-resilient circuit that is IO-equivalent to the given circuit. The algorithm requires the given circuit to have only one output. Therefore, the overall algorithm decomposes the whole circuit into single-output sub-circuits before passing them to \(\textsc {MonoSynth}\).

We formalize the algorithm as quantified first-order logic constraint solving. Let P be the random-free circuit given as the input. We prepare a quantifier-free formula \(\varPhi _P(\varvec{\alpha },\varvec{p},\varvec{k},o)\) on the free variables \(\varvec{\alpha }\), \(\varvec{p}\), \(\varvec{k}\), \(o\) that encodes the input-output behavior of P. Formally, \(\exists \varvec{\alpha }.\varPhi _P(\varvec{\alpha },\varvec{p},\varvec{k},o)\) is true iff P outputs \(o\) given public inputs \(\varvec{p}\) and secret inputs \(\varvec{k}\). The variables \(\varvec{\alpha }\) are used for encoding the shared sub-circuits within P (i.e., gates of fan-out \(>1\)). For example, for P that outputs \(k\wedge p\), \(\varPhi _P(p,k,o) \equiv o= k\wedge p\).

Fig. 7. (a) Skeleton circuit. (b) \(\varPhi _{\textit{Sk}_{2}}\) for the skeleton circuit.

Adopting the approach of [9], our monolithic algorithm works by preparing skeleton circuits of increasing size, and searching for an instance of the skeleton that is n-leakage-resilient and IO-equivalent to P. By starting from a small skeleton, the algorithm biases the search toward finding an optimized leakage-resilient circuit. Formally, a skeleton circuit \(\textit{Sk}_{\ell }\) is a tree-shaped circuit (i.e., a circuit with all non-input gates having fan-out 1) of height \(\ell \) whose gates have undetermined functionality, except for the parts that implement the input encoder and the output decoder. For example, Fig. 7(a) shows the \(1+1\)-split skeleton circuit of height 2 with one secret input.

We prepare a quantifier-free skeleton formula \(\varPhi _{\textit{Sk}_{\ell }}\) that expresses the skeleton circuit. Formally, \(\varPhi _{\textit{Sk}_{\ell }}(\varvec{C}, \varvec{\alpha },\varvec{p},\varvec{k},\varvec{r},o)\) is true iff \(P'\) outputs \(o\) with the valuations of \({ nodes}({P'})\) having the values \(\varvec{\alpha }\) given public inputs \(\varvec{p}\), secret inputs \(\varvec{k}\), and random inputs \(\varvec{r}\), where \(P'\) is \(\textit{Sk}_{\ell }\) with its nodes' functionality determined by \(\varvec{C}\). We call \(\varvec{C}\) the control variables, and write \(\textit{Sk}_{\ell }(\varvec{C})\) for the instance of \(\textit{Sk}_{\ell }\) determined by \(\varvec{C}\). For example, Fig. 7(b) shows \(\varPhi _{\textit{Sk}_{\ell }}\) for the skeleton circuit from Fig. 7(a), when there is one public input and no randoms besides the one from the input encoder.

The synthesis is now reduced to the problem of finding an assignment to \(\varvec{C}\) that satisfies the constraint \(\varPhi _{\textsc {io}}(\varvec{C}) \wedge \varPhi _{\textsc {lr}}(\varvec{C})\) where \(\varPhi _{\textsc {io}}(\varvec{C})\) expresses that \(\textit{Sk}_{\ell }(\varvec{C})\) is IO-equivalent to P, and \(\varPhi _{\textsc {lr}}(\varvec{C})\) expresses that \(\textit{Sk}_{\ell }(\varvec{C})\) is n-leakage-resilient. As we shall show next, the constraints faithfully encode IO-equivalence and leakage-resilience according to the definitions from Sect. 2.

\(\varPhi _{\textsc {io}}(\varvec{C})\) is the formula below.

$$ \forall \varvec{p},\varvec{k},\varvec{r},\varvec{\alpha },\varvec{\alpha }',o,o'.\ \big (\varPhi _P(\varvec{\alpha },\varvec{p},\varvec{k},o) \wedge \varPhi _{\textit{Sk}_{\ell }}(\varvec{C},\varvec{\alpha }',\varvec{p},\varvec{k},\varvec{r},o')\big ) \Rightarrow o = o' $$

It is easy to see that \(\varPhi _{\textsc {io}}\) correctly expresses IO equivalence.

The definition of \(\varPhi _{\textsc {lr}}\) is more involved, and we begin by introducing a useful shorthand notation. Let m be the number of (observable) nodes in \(\textit{Sk}_{\ell }\). Without loss of generality, we assume that \(m \ge n\). Let \(\varvec{\sigma } \in \{{1,\dots ,m}\}^*\) be a sequence such that \(|\varvec{\sigma }| \le n\), and \(\varvec{\alpha } = \alpha _1,\dots ,\alpha _m\) be a length-m sequence of variables. We write \({\varvec{\alpha }}\langle {\varvec{\sigma }}\rangle \) for the sequence of variables \(\beta _1,\dots ,\beta _{|\varvec{\sigma }|}\) such that \(\beta _i = \alpha _{\sigma _i}\) for each \(i \in \{{1,\dots ,|\varvec{\sigma }|}\}\). Intuitively, \(\varvec{\sigma }\) represents a selection of nodes observed by the adversary. For example, let \(\varvec{\alpha }=\alpha _1,\alpha _2,\alpha _3\) and \(\varvec{\sigma } = 1,3\); then \({\varvec{\alpha }}\langle {\varvec{\sigma }}\rangle = \alpha _1,\alpha _3\).
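The selection notation has a direct implementation (our own helper, mirroring the example in the text):

```python
def select(alpha, sigma):
    """alpha<sigma>: the subsequence of alpha at the 1-based positions sigma."""
    return [alpha[i - 1] for i in sigma]

# The example from the text: alpha = a1, a2, a3 and sigma = 1, 3.
assert select(["a1", "a2", "a3"], [1, 3]) == ["a1", "a3"]
```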

Let \(\mathcal {R}= \{{0,1}\}^{|\varvec{r}|}\). Then, \(\varPhi _{\textsc {lr}}(\varvec{C})\) is the formula below.

$$ \forall \varvec{p},\varvec{k},\varvec{k}',\varvec{\alpha }',\varvec{o}.\ \bigwedge _{\varvec{\sigma }} \forall \varvec{\beta }.\ \Big (\bigwedge _{\varvec{b} \in \mathcal {R}} \phi [\varvec{\alpha } \mapsto \varvec{\alpha _b},\ o \mapsto o_{\varvec{b}}] \wedge \bigwedge _{\varvec{b} \in \mathcal {R}} \phi [\varvec{k} \mapsto \varvec{k}',\ \varvec{\alpha } \mapsto \varvec{\alpha '_b},\ o \mapsto o'_{\varvec{b}}]\Big ) \Rightarrow |\{{\varvec{b} \in \mathcal {R}\mid {\varvec{\alpha _b}}\langle {\varvec{\sigma }}\rangle = \varvec{\beta }}\}| = |\{{\varvec{b} \in \mathcal {R}\mid {\varvec{\alpha '_b}}\langle {\varvec{\sigma }}\rangle = \varvec{\beta }}\}| $$

where \(\varvec{\sigma }\) ranges over the length-n node selections, \(\varvec{\alpha }'\) is a sequence comprising distinct variables \(\varvec{\alpha _b}\) and \(\varvec{\alpha '_b}\) such that \(|\varvec{\alpha _b}| = |\varvec{\alpha '_b}| = m\) for each \(\varvec{b} \in \mathcal {R}\), \(\varvec{o}\) is a sequence comprising distinct variables \(o_{\varvec{b}}\) and \(o'_{\varvec{b}}\) for each \(\varvec{b} \in \mathcal {R}\), and \(\phi \) is the formula \(\varPhi _{\textit{Sk}_{\ell }}(\varvec{C},\varvec{\alpha },\varvec{p},\varvec{k},\varvec{b},o)\). While \(\varPhi _{\textsc {lr}}(\varvec{C})\) is not strictly a first-order logic formula, it can be converted to that form by expanding the finitely many possible choices of \(\varvec{\sigma }\).
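On a tiny instance, the reduction can be solved by sheer enumeration of the control variables. The sketch below is our own illustration, not the paper's encoding: the fixed two-gate skeleton, the gate library (including a projection gate `fst`), and the target \(k \oplus p\) are all hypothetical. It searches for an assignment to \(\varvec{C}\) satisfying both \(\varPhi _{\textsc {io}}\) and \(\varPhi _{\textsc {lr}}\) with \(n=1\), checking every quantifier instantiation eagerly:

```python
import itertools
from collections import Counter

GATES = {"xor": lambda a, b: a ^ b, "and": lambda a, b: a & b,
         "or": lambda a, b: a | b, "fst": lambda a, b: a}

def run(C, p, k, r):
    """Fixed 1+1-split skeleton: shares s1 = r, s2 = k ^ r; split outputs
    o1 = C[0](s1, p), o2 = C[1](s2, p); the decoder computes o1 ^ o2."""
    s1, s2 = r, k ^ r
    o1, o2 = GATES[C[0]](s1, p), GATES[C[1]](s2, p)
    return {"s1": s1, "s2": s2, "o1": o1, "o2": o2}

def spec(p, k):  # the target random-free circuit
    return p ^ k

def ok(C):
    # Phi_io: the decoded output matches the spec for all inputs and randoms.
    for p, k, r in itertools.product([0, 1], repeat=3):
        ns = run(C, p, k, r)
        if ns["o1"] ^ ns["o2"] != spec(p, k):
            return False
    # Phi_lr (n = 1): every observable node's distribution over the random
    # is the same for k = 0 and k = 1, for every public input.
    for p in [0, 1]:
        for a in ["s1", "s2", "o1", "o2"]:
            d = [Counter(run(C, p, k, r)[a] for r in [0, 1]) for k in [0, 1]]
            if d[0] != d[1]:
                return False
    return True

solutions = [C for C in itertools.product(GATES, repeat=2) if ok(C)]
print(solutions)  # prints [('xor', 'fst'), ('fst', 'xor')]
```

Both solutions mask the secret with the random before it meets the public input; the eager enumeration over all inputs, randoms, and node selections is exactly what becomes infeasible as the skeleton and n grow.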


Because the domains of the quantified variables in \(\varPhi _{\textsc {io}}\) and \(\varPhi _{\textsc {lr}}\) are finite, one approach to solving the constraint would be to first eliminate the quantifiers eagerly and then solve for \(\varvec{C}\) satisfying the resulting constraint. However, this approach is clearly impractical due to the extreme size of the resulting constraint. Instead, adopting the idea from [9], we solve the constraint by lazily instantiating the quantifiers. The main idea is to divide the constraint solving process into two phases: the candidate finding phase, which infers a candidate solution for \(\varvec{C}\), and the candidate checking phase, which checks whether the candidate is actually a solution. We run the phases iteratively in a CEGAR style until convergence. Algorithm 2 shows the overview of the process. We describe the details of the algorithm below.
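The two-phase loop can be sketched on a toy instance (our own schematic, not the paper's constraint encoding): the candidate space is the 16 truth tables of a 2-input gate, the candidate finder returns any candidate consistent with the current test set, and the candidate checker searches all inputs for a counterexample, which then refines the test set:

```python
import itertools

spec = lambda a, b: a ^ b          # the specification to match
INPUTS = list(itertools.product([0, 1], repeat=2))

def find_cand(tset):
    """Candidate finding: any truth table consistent with the test set."""
    for C in itertools.product([0, 1], repeat=4):
        if all(C[2 * a + b] == spec(a, b) for a, b in tset):
            return C
    return None

def check_cand(C):
    """Candidate checking: search all inputs for a counterexample."""
    for a, b in INPUTS:
        if C[2 * a + b] != spec(a, b):
            return (a, b)
    return None

tset = []
while True:
    C = find_cand(tset)
    assert C is not None, "no candidate of this shape exists"
    cex = check_cand(C)
    if cex is None:
        break                   # candidate verified: done
    tset.append(cex)            # refine: lazily instantiate one more input

print(C, tset)                  # converged candidate and the refined test set
```

The point of the laziness is that the finder only ever looks at the inputs in `tset`, which in practice stays far smaller than the full instantiation of the quantifiers.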

Candidate Checking. The candidate checking phase is straightforward. Note that, after expanding the choices of \(\varvec{\sigma }\) in \(\varPhi _{\textsc {lr}}\), \(\varPhi _{\textsc {io}}(\varvec{C}) \wedge \varPhi _{\textsc {lr}}(\varvec{C})\) only has outer-most \(\forall \) quantifiers. Therefore, given a concrete assignment \(\varvec{b_C}\) to \(\varvec{C}\), \(\textsc {CheckCand}\) directly solves the constraint by using an SMT solver. (However, naively expanding \(\varvec{\sigma }\) can be costly when \(n > 1\), and we show a modification that alleviates the cost in the later part of the subsection.)

Candidate Finding. We describe the candidate finding process \(\textsc {FindCand}\). To find a likely candidate, we adopt the idea from [9] and prepare a test set that is updated via the CEGAR iterations. In [9], a test set, \(\textit{tset}\), is a pair of sets \(\textit{tset}_p\) and \(\textit{tset}_k\) where \(\textit{tset}_p\) (resp. \(\textit{tset}_k\)) contains finitely many concrete valuations of \(\varvec{p}\) (resp. \(\varvec{k}\)). Having such a test set, we can rewrite the constraint so that the public inputs and secret inputs are restricted to those from the test set. That is, \(\varPhi _{\textsc {io}}(\varvec{C})\) is rewritten to be the formula below.

And, \(\varPhi _{\textsc {lr}}(\varvec{C})\) becomes the formula below.

where \(\phi \) is the formula \(\varPhi _{\textit{Sk}_{\ell }}(\varvec{C},\varvec{\alpha },\varvec{b_p},\varvec{k},\varvec{b},o)\). We remark that, because fixing the inputs to concrete values also fixes the valuations of some other variables (e.g., fixing \(\varvec{p}\) and \(\varvec{k}\) also fixes \(\varvec{\alpha }\) and \(o\) in \(\varPhi _P(\varvec{\alpha },\varvec{p},\varvec{k},o)\)), the constraint structure is modified to remove the quantifications on such variables.

At this point, the approach of [9] can be formalized as the following process: it eagerly instantiates the possible choices of \(\varvec{\sigma }\) and \(\varvec{\beta }\) to reduce the constraint to a quantifier-free form, and looks for a satisfying assignment to the resulting constraint. This is a sensible approach when n is 1 because, in that case, the number of possible choices of \(\varvec{\sigma }\) is linear in the size of the skeleton (i.e., is m) and the possible valuations of \(\varvec{\beta }\) are simply \({\{{0,1}\}}\). Unfortunately, the number of possible choices of \(\varvec{\sigma }\) grows exponentially in n, and so does the number of possible valuations of \(\varvec{\beta }\).Footnote 5 This is expected, because \(\varvec{\sigma }\) represents the adversary’s node selection and \(\varvec{\beta }\) represents the valuation of the selected nodes. Indeed, in our experience with a prototype implementation, this method fails even on quite small sub-circuits with \(n = 2\).
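To make the blow-up concrete (our illustration, not figures from the paper): with m observable nodes there are \(\binom{m}{n}\) node selections, each admitting \(2^n\) valuations of the selected nodes, so the eager expansion must instantiate \(\binom{m}{n} \cdot 2^n\) cases:

```python
from math import comb

def instantiation_count(m: int, n: int) -> int:
    """Number of (node selection, node valuation) pairs that the eager
    expansion instantiates: C(m, n) selections times 2**n valuations."""
    return comb(m, n) * 2 ** n

# For a skeleton with m = 30 observable nodes:
#   n = 1 gives 60 instantiations, but n = 2 already gives 1740,
#   and n = 4 gives 438480.
```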

Therefore, we make the following improvements to the base algorithm.

(1) We restrict the node selection to the root-most \(\gamma \) nodes, where \(\gamma \) starts at n and is incremented via the CEGAR loop.

(2) We include node valuations in the test set.

(3) We use a dependency analysis to remove irrelevant node selections from the constraint in the candidate checking phase.

The rationale for prioritizing the root-most nodes in (1) is that, in a tree-shaped circuit, nodes closer to the root are more likely to depend on the secret and are therefore expected to be better targets for the adversary. The number of root-most nodes to select, \(\gamma \), is incremented as needed by a counterexample analysis (cf. lines 7–9 of Algorithm 2). The test set generation for node valuations in item (2) is done in much the same way as that for public and secret inputs; we describe the process in more detail under Test Set Generation below. With modifications (1) and (2), the leakage-resilience constraint to be solved in the candidate finding phase becomes the following formula.

where \(\varvec{\sigma }:\gamma \) restricts \(\varvec{\sigma }\) to the root-most \(\gamma \) indexes, \(\textit{tset}_\beta \) is the set of test set elements for node valuations, and \(\phi \) is the formula \(\varPhi _{\textit{Sk}_{\ell }}(\varvec{C},\varvec{\alpha },\varvec{b_p},\varvec{k},\varvec{b},o)\).
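Improvement (1) can be sketched as follows (our formulation): in a tree-shaped skeleton, the \(\gamma \) root-most nodes are simply the first \(\gamma \) nodes in breadth-first order from the root.

```python
from collections import deque

def rootmost(children, root, gamma):
    """Return the gamma nodes closest to the root of a tree, in
    breadth-first order; `children` maps a node to its child nodes."""
    order, queue = [], deque([root])
    while queue and len(order) < gamma:
        node = queue.popleft()
        order.append(node)
        queue.extend(children.get(node, []))
    return order
```

Restricting the adversary's selection to `rootmost(children, root, gamma)` shrinks the instantiation space from all nodes to \(\gamma \) nodes, at the price of possibly having to raise \(\gamma \) later in the CEGAR loop.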

Unlike (1) and (2), modification (3) applies to the candidate checking phase. To see the benefit of this modification, note that, even in the candidate checking phase, checking the leakage-resilience condition \(\varPhi _{\textsc {lr}}(\varvec{b_C})\) can be quite expensive because it involves expanding exponentially many possible node selections. To mitigate the cost, we take advantage of the fact that the candidate circuit is fixed in the candidate checking phase, and perform a simple dependency analysis on the candidate circuit to remove irrelevant node-selection choices. We describe the modification in more detail. Let \(P'\) be the candidate circuit. For each node of \(P'\), we collect the leaves reachable from the node to obtain an over-approximation of the set of inputs on which the node may depend. For a node \(\alpha \) of \(P'\), let \(\textit{deps}({\alpha })\) be the obtained set of dependent inputs for \(\alpha \). Then, any selection of nodes \(\varvec{\alpha }\) such that \(\bigcup _{\alpha \in \{{\varvec{\alpha }}\}} \textit{deps}({\alpha })\) does not contain all \(n+1\) split shares of some secret is an irrelevant selection and can be removed from the constraint. (Here, we use the symbols \(\alpha \) for node labels as in Sect. 2, and not as node-valuation variables in a constraint.)
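A minimal sketch of the dependency analysis and the pruning criterion (our code; `children` maps each node of the candidate circuit to its operands, and `shares` maps each secret to its \(n+1\) share inputs — all names are ours):

```python
def deps(children, node, cache=None):
    """Over-approximate the inputs a node may depend on: the set of
    leaves reachable from the node in the circuit."""
    if cache is None:
        cache = {}
    if node not in cache:
        kids = children.get(node, [])
        if not kids:                      # a leaf depends on itself
            cache[node] = frozenset([node])
        else:
            cache[node] = frozenset().union(
                *(deps(children, k, cache) for k in kids))
    return cache[node]

def relevant_selection(children, selection, shares):
    """A node selection is relevant only if its combined dependencies
    contain all shares of at least one secret."""
    union = frozenset().union(*(deps(children, a) for a in selection))
    return any(set(sh) <= union for sh in shares.values())
```

For example, if secret `k` is split into shares `k0` and `k1`, a selection whose dependencies contain only `k0` and fresh randoms can be dropped from the candidate-checking constraint.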

Test Set Generation. Recall that our algorithm maintains three kinds of test sets: \(\textit{tset}_p\) for public inputs, \(\textit{tset}_k\) for secret inputs, and \(\textit{tset}_\beta \) for node valuations. As shown in lines 7–9 of Algorithm 2, we obtain new test set elements from candidate check failures (here, by abuse of notation, we write \(\textit{tset}\cup \textit{tset}'\) for the component-wise union). We describe the process in more detail. In \(\textsc {CheckCand}\), we convert the constraint \(\varPhi _{\textsc {io}}(\varvec{b_C}) \wedge \varPhi _{\textsc {lr}}(\varvec{b_C})\) to a quantifier-free formula \(\varPhi \) by expanding the selection choices and removing the universal quantifiers. Then, we use an SMT solver to check the satisfiability of \(\lnot \varPhi \) and return \(\textsf {success}\) if it is unsatisfiable. Otherwise, the SMT solver returns a satisfying assignment of \(\lnot \varPhi \), and we return the values assigned to the variables corresponding to public inputs, secret inputs, and node valuations as new elements for the respective test sets. The number of root-most nodes to select is also updated here, to the maximum of the current \(\gamma \) and the number of root-most nodes observed in the satisfying assignment, \(\gamma '\).

4.2 Choosing Decomposition

This subsection describes the decomposition procedure \(\textsc {Decomp}\). Thanks to the generality of the compositionality results, in principle, we can decompose the given circuit into arbitrarily small sub-circuits (i.e., down to individual gates). However, choosing too fine-grained a decomposition may lead to a sub-optimal result.Footnote 6

To this end, we have implemented the following decomposition strategy. First, we run a dependency analysis, similar to the one used in the constraint-based monolithic synthesis (cf. Sect. 4.1). The analysis result is used to identify the parts of the given circuit that do not depend on any of the secrets. We factor out such public-only sub-circuits from the rest so that they will not be subject to the leakage-resilience transformation.

Next, we look for sub-circuits that are used at multiple locations (i.e., whose roots have fan-out \(>1\)), and prioritize them to be synthesized separately and composed at their use sites. Besides saving synthesis effort, this approach can lead to a smaller synthesis result when the shared sub-circuit is used in contexts that lead to different outputs (cf. Remark 4). Finally, as a general strategy, we apply parallel composition at the root, so that, given a multi-output circuit, we synthesize separately for each output. And, we set a bound on the maximum size of the circuits that will be synthesized monolithically, and decompose systematically based on the bound. As discussed in Sect. 5, the prototype implementation uses an “adaptive” version of the latter decomposition process, adjusting the bound on the fly and also opting for a pre-made circuit under certain conditions.
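The first step, factoring out public-only sub-circuits, amounts to a reachability check. A sketch under assumed representations (a `children` map and a set of secret inputs; both names are ours):

```python
def public_only_nodes(children, secrets):
    """Mark each node True if no secret input is reachable from it;
    such sub-circuits need no leakage-resilience transformation."""
    memo = {}
    def touches_secret(node):
        if node not in memo:
            memo[node] = (node in secrets or
                          any(touches_secret(k)
                              for k in children.get(node, [])))
        return memo[node]
    return {node: not touches_secret(node) for node in children}
```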

Example 3

Let us apply the compositional synthesis algorithm to the circuit from Fig. 2, for the case \(n=2\). Note that the circuit has no non-trivial public-only sub-circuits and no non-input gates with fan-out greater than 1.

First, we apply the parallel compositionality result so that the circuit is decomposed into two parts: the left tree, which computes \(p_1 \oplus k_1 \oplus k_2\), and the right tree, which computes \(k_2 \oplus p_2\). The right tree cannot be decomposed further, and we apply \(\textsc {MonoSynth}\) to transform it into a leakage-resilient form. A possible synthesis result is the right sub-circuit shown in Fig. 1 (i.e., the sub-circuit whose observable part outputs the split output \(o_4,o_5,o_6\)).

For the left tree, if the monolithic-synthesis size bound is set to synthesize circuits of height 2, we apply \(\textsc {MonoSynth}\) directly to the tree. Alternatively, with a lower bound set, we further decompose the left tree into a lower part that computes \(p_1 \oplus k_1\) and \(p_2\) (identity function) and an upper part that computes \(k\oplus p_2\), where the output of the lower part is connected to the “place-holder” input \(k\). Following either strategy, we may obtain the left sub-circuit of Fig. 1 as a possible result. The final synthesis result, after composing the left and right synthesis results, is the whole circuit of Fig. 1.

5 Implementation and Experiments

We have implemented a prototype of the compositional synthesis algorithm. The implementation takes as input a finite-data loop-free C program and converts the program into a Boolean circuit in the standard way. We remark that, in principle, a program with non-input-dependent loops and recursive functions may be converted to such a form by loop unrolling and function inlining.
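For intuition, the “standard way” of flattening a loop-free program into a circuit introduces one gate per operator. The sketch below (ours, for a toy expression language, not the CIL-based front end of the prototype) turns a nested expression into a gate list:

```python
from itertools import count

def to_gates(expr, fresh=None, gates=None):
    """Flatten a nested (op, lhs, rhs) expression tree into a gate list
    [(out, op, in1, in2)]; string leaves are input wires.
    Returns (output_wire, gates)."""
    if fresh is None:
        fresh, gates = count(1), []
    if isinstance(expr, str):
        return expr, gates                 # an input wire
    op, lhs, rhs = expr
    a, _ = to_gates(lhs, fresh, gates)
    b, _ = to_gates(rhs, fresh, gates)
    out = f"t{next(fresh)}"                # fresh wire for this gate's output
    gates.append((out, op, a, b))
    return out, gates
```

For example, the expression \(p_1 \oplus k_1 \oplus k_2\) flattens to two XOR gates with a fresh intermediate wire.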

The implementation is written in the OCaml programming language. We use CIL [20] for the front-end parsing and Z3 [7] as the SMT solver in the constraint-based monolithic synthesis. The experiments were conducted on a machine with a 2.60 GHz Intel Xeon E5-2690v3 CPU and 8 GB of RAM running a 64-bit Linux OS, with a time limit of 20 hours.

We have run the implementation on the 18 benchmark programs taken from the paper by Eldib and Wang [9]. The benchmarks are (parts of) various cryptographic algorithm implementations, such as a round of AES, and we refer to their paper for the details of the respective benchmarks (we use the same program names).Footnote 7 Whereas their experiments synthesized leakage-resilient versions of the benchmarks only for the case \(n = 1\), our experiments do the synthesis for the cases \(n=2\), \(n=3\), and \(n=4\).

We describe the decomposition strategy implemented in the prototype; specifically, we give details of the online decomposition process mentioned in Sect. 4.2. The implementation employs the following adaptive strategy when decomposing systematically based on a circuit size bound. First, we set the bound to circuits of some fixed height (the experiments use height 3) and decompose based on the bound. However, in some cases, the bound can be too large for the monolithic constraint-based synthesis algorithm to complete in a reasonable amount of time. Therefore, we set a limit on the time that the constraint-based synthesis can spend on constraint solving, and when the time limit is exceeded, we further decompose that sub-circuit using a smaller bound. Further, when the time limit is exceeded even with the smallest bound, or when the number of secrets in the sub-circuit exceeds a certain bound (to prevent out-of-memory exceptions in the SMT solver), we use a pre-made leakage-resilient circuit. Recall from Remark 5 that the compositionality property ensures the correctness of such a strategy.
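The adaptive fallback chain can be sketched as follows (our pseudostructure; `mono_synth`, `split`, and `premade` are hypothetical stand-ins for the monolithic synthesizer, the re-decomposition step, and the pre-made-circuit fallback, with a `None` result modeling a solver timeout):

```python
def adaptive_synth(circuit, bound, mono_synth, split, premade,
                   min_bound=1, max_secrets=8):
    """Try monolithic synthesis under a size bound; on timeout,
    re-decompose with a smaller bound; at the smallest bound, or when
    the sub-circuit has too many secrets, fall back to a pre-made
    leakage-resilient circuit."""
    if circuit.num_secrets > max_secrets:
        return premade(circuit)
    result = mono_synth(circuit, bound)     # None models a timeout
    if result is not None:
        return result
    if bound > min_bound:
        return [adaptive_synth(sub, bound - 1, mono_synth, split, premade,
                               min_bound, max_secrets)
                for sub in split(circuit, bound - 1)]
    return premade(circuit)
```

The compositionality property is what licenses each fallback: any mix of synthesized and pre-made sub-circuits composes into a leakage-resilient whole.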

Tables 1 and 2 summarize the experiment results. Table 2 shows the results of the compositional algorithm. Table 1 shows the results obtained by the “monolithic-only” version of the algorithm. Specifically, the monolithic-only results are obtained by first applying the parallel compositionality property (cf. Sect. 3.1) to divide the given circuit into a separate sub-circuit for each output, and then applying the constraint-based monolithic synthesis to each sub-circuit and combining the results. (The per-output parallel decomposition is needed because the constraint-based monolithic synthesis only takes single-output circuits as input; cf. Sect. 4.1.) The monolithic-only algorithm is essentially the monolithic algorithm of Eldib and Wang [9] with the improvements described in Sect. 4.1.

Table 1. Experiment results: monolithic only.
Table 2. Experiment results: compositional.

We describe the table legends. The column labeled “name” shows the benchmark program names. The columns labeled “time” show the time taken to synthesize the circuit. Here, “T/O” means that the algorithm was not able to finish within the time limit, and “M/O” means that the algorithm aborted due to an out of memory error. The columns labeled “size” show the number of gates in the synthesized circuit, and the columns labeled “rds” show the number of randoms in the synthesized circuit.

The columns labeled “mtc” in Table 2 show the maximum time spent by the algorithm to synthesize a sub-circuit in the compositional algorithm. Our prototype currently implements a sequential version of the compositional algorithm, in which the sub-circuits are synthesized one at a time. However, in principle, the sub-circuits may be synthesized simultaneously in parallel, and the “mtc” columns give a good estimate of the efficiency of such a parallel version of the compositional algorithm. We also remark that the current prototype is unoptimized and does not “cache” synthesis results; it therefore naively re-applies the synthesis to sub-circuits that have been synthesized previously.

As the tables show, the monolithic-only approach fails to finish on many of the benchmarks, even for the case \(n = 2\). In particular, it does not finish on any of the large benchmarks (as one can see from the sizes of the synthesized circuits, P13 to P18 are of considerable size). By contrast, the compositional approach successfully completed the synthesis for all instances. We observe that the compositional approach was faster for a larger n in some cases (e.g., P9 with \(n=3\) vs. \(n=4\)). While this is partly due to the unpredictable nature of the back-end SMT solver, it is also an artifact of the decomposition strategy described above. More specifically, in some cases, the algorithm more quickly detects (e.g., in earlier iterations of the constraint-based synthesis’s CEGAR loop) that the decomposition bound should be reduced for the current sub-circuit, which can lead to a faster overall running time.

We also observe that the sizes of the circuits synthesized by the compositional approach are quite comparable to those synthesized by the monolithic-only approach, and the same observation holds for the numbers of randoms in the synthesized circuits. In fact, in one case (P2 with \(n=3\)), the compositional approach synthesized a circuit smaller than the one synthesized by the monolithic-only approach. While this is due in part to the fact that the monolithic synthesis algorithm optimizes circuit height rather than size, in general, it is not inconceivable for the compositional approach to do better than the monolithic-only approach in terms of the quality of the synthesized circuit. This is because the compositional method can make better use of the circuit structure by sharing synthesized sub-circuits, and because of the parsimonious use of randoms permitted by the compositionality property. We remark that the circuits synthesized by our method are orders of magnitude smaller than those obtained by naively applying the original construction of Ishai et al. [16] (cf. Theorem 1). For instance, for P18 with \(n=4\), that construction would produce a circuit with more than 3600k gates and 500k randoms.

6 Related Work

Verification and Synthesis for the n-Threshold-Probing Model of Leakage Resilience. The n-threshold-probing model of leakage resilience was proposed in the seminal work by Ishai et al. [16]. Subsequent research has proposed methods to build circuits that are leakage resilient according to the model [3, 4, 6, 8,9,10, 22]. Along this direction, the two branches of work most relevant to ours are verification, which aims at verifying whether a given (hand-crafted) circuit is leakage resilient [3, 4, 10], and synthesis, which aims at generating leakage-resilient circuits automatically [9]. In particular, our constraint-based monolithic synthesis algorithm is directly inspired by, and extends, the algorithm given by Eldib and Wang [9]. As remarked before, their method only handles the case \(n = 1\). By contrast, we propose the first compositional synthesis approach, which also works for arbitrary values of n.

On the verification side, the constraint-based verification method proposed in [10] is a precursor to their synthesis work discussed above, and it is similar to the candidate checking phase of the synthesis. Recent papers by Barthe et al. [3, 4] investigate verification methods that aim to also support the case \(n > 1\). Compositional verification is considered in [3]. As remarked before, in contrast to the compositionality property described in our paper, their composition does not require disjointness of the randoms in the composed components, but instead requires additional randoms at the site of the composition. We believe that the compositionality property investigated in their work is complementary to ours, and we leave combining these facets of compositionality for future work.

We remark that synthesis is substantially harder than verification. Indeed, in our experience with the prototype implementation, most of the running time is consumed by the candidate finding part of the monolithic synthesis process with relatively little time spent by the candidate checking part.

Quantitative Information Flow. Quantitative information flow (QIF) [1, 19, 23, 24] is a formal measure of information leakage, based on information-theoretic notions such as Shannon entropy, Rényi entropy, and channel capacity. Recently, researchers have proposed QIF-based methods for side-channel-attack resilience [17, 18], whereby static analysis techniques for checking and inferring QIF are applied to side channels.

It is difficult to directly compare the QIF approach with the n-threshold-probing model of leakage resilience. Whereas the security ensured by the latter is the absence of information leakage against an adversary with a certain restricted observation capability, the security ensured by the QIF approach typically does not restrict the adversary’s capability but instead permits some (small amount of) leakage. We remark that, as also observed by [4], in the terminology of information flow theory, the n-threshold-probing model of leakage resilience corresponds to enforcing probabilistic non-interference [14] on every n-tuple of the circuit’s internal nodes.
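This correspondence can be checked exhaustively on tiny examples. The sketch below (ours) takes a 1-masked sharing of a secret bit k as \((k \oplus r, r)\) and verifies probabilistic non-interference for every 1-tuple of observable nodes: each share's distribution over the random bit r is independent of k.

```python
from collections import Counter

def share_values(k, r):
    """The two internal shares of a 1-masked secret bit k."""
    return {'s0': k ^ r, 's1': r}

def distribution(node, k):
    """Distribution of a node's value over the uniform random bit r."""
    return Counter(share_values(k, r)[node] for r in (0, 1))

def secure_1_probing():
    """1-threshold-probing security: every single observable node has
    the same value distribution whether the secret is 0 or 1."""
    return all(distribution(node, 0) == distribution(node, 1)
               for node in ('s0', 's1'))
```

Observing either share alone yields a uniform bit regardless of k; only observing both shares together (a 2-tuple) reveals the secret, which is exactly what the 1-probing model excludes.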

7 Conclusion

We have presented a new approach to synthesizing circuits that are leakage resilient according to the n-threshold-probing model. We have shown that the leakage-resilience model admits a certain compositionality property, which roughly says that composing n-leakage-resilient circuits results in an n-leakage-resilient circuit, assuming the disjointness of the randoms in the composed circuit components. Then, by utilizing the property, we have designed a compositional synthesis algorithm that divides the given circuit into smaller sub-circuits, synthesizes n-leakage-resilient versions of them containing disjoint randoms, and combines the results to obtain an n-leakage-resilient version of the whole.