1 Introduction

Garbled circuits (GC) were introduced by Yao in the 1980s [Yao82] in one of the first secure two-party computation protocols. They remain the leading technique for constant-round two-party computation. Garbled circuits exclusively use extremely efficient symmetric-key operations (e.g., a few calls to AES per gate of the circuit), making communication rather than computation the bottleneck in realistic deployments—the parties must exchange \(O(\kappa )\) bits per gate. For that reason, most improvements to garbled circuits have focused heavily on reducing their concrete size [BMR90, NPS99, KS08, PSSW09, KMR14, GLNP15]. The current state of the art for garbled (boolean) circuits is the half-gates construction of Zahur, Rosulek, and Evans [ZRE15]. In the half-gates scheme, AND gates are garbled with size \(2\kappa \) bits, while XOR gates are free, requiring no communication.

The half-gates paper also establishes a lower bound for the size of garbled circuits. Specifically, the authors define a model of linear garbling (which captured all known techniques at the time) and prove that a garbled AND gate in this model requires \(2\kappa \) bits. Thus, half-gates is optimal among linear garbling schemes. In response, there has been a line of work focused on finding ways around the lower bound. Several works [KKS16, BMR16, WmM17] were successful in constructing an AND gate using only \(\kappa \) bits, using techniques outside of the linear-garbling model. However, these constructions work only for a single AND gate in isolation, so they do not result in any improvement to half-gates for garbling general circuits.\(^{1}\) Garbling an entire arbitrary circuit with fewer than \(2\kappa \) bits per AND gate remained an open problem. We discuss the linear garbling lower bound and different paths around it later in Sect. 7.

1.1 Our Results

We show a garbling scheme for general boolean circuits, in which XOR gates are free and AND gates cost only \(1.5\kappa + 5\) bits. This is the first scheme to successfully bypass the linear-garbling lower bound for all AND gates in a circuit, not just a single isolated AND gate. For the typical case of \(\kappa =128\) this is a concrete reduction of 23% in the size of garbled circuits relative to half-gates. Our construction compares to half-gates along other dimensions as follows:

  • Hardness assumption: All free-XOR-based garbling schemes require a function H with output length \(\kappa \) and satisfying a circular correlation-robust property. In short, this means that terms of the form \(H(X \oplus \varDelta )\) and \(H(X \oplus \varDelta )\oplus \varDelta \) are indistinguishable from random, for adversarially chosen X and global, secret \(\varDelta \). Our construction requires a slight generalization. First, we require H that gives outputs of length \(\kappa /2\). Second, the secret \(\varDelta \) is split into two halves \(\varDelta = \varDelta _L \Vert \varDelta _R\), and we require terms like \(H( X \oplus \varDelta ) \oplus \varDelta _L\), \(H(X \oplus \varDelta ) \oplus \varDelta _L \oplus \varDelta _R\), etc. to be indistinguishable from random.

  • Computation: Our scheme requires 50% more calls to H per AND gate than half-gates (6 vs 4 for the garbler, and 3 vs 2 for the evaluator). Similar to other work, we can instantiate the necessary H using just 1 call to AES with a key that is fixed for the entire circuit. As a result, the computational cost of our scheme is comparable to prior work.

    Additionally, since we require H with only \(\kappa /2\) bits of output, certain queries to H for different AND gates can be combined into a single query to a \(\kappa \)-bit-output function. The effect of this optimization depends on the circuit topology, but in some cases the computation of our construction can be identical to or better than that of half-gates (see Sect. 6.2).
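The 23% figure quoted above for \(\kappa =128\) follows directly from the two per-gate sizes. The short sketch below only redoes that arithmetic; the function names are illustrative and not part of the paper.

```python
def half_gates_bits(kappa: int) -> int:
    """Garbled AND-gate size for half-gates: 2*kappa bits."""
    return 2 * kappa


def this_scheme_bits(kappa: int) -> int:
    """Garbled AND-gate size stated above: 1.5*kappa + 5 bits (kappa even)."""
    return 3 * kappa // 2 + 5


kappa = 128
saving = 1 - this_scheme_bits(kappa) / half_gates_bits(kappa)
print(f"{this_scheme_bits(kappa)} vs {half_gates_bits(kappa)} bits per AND gate, "
      f"{saving:.1%} smaller")
```

For \(\kappa = 128\) this compares 197 bits against 256 bits per AND gate.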

We bypass the [ZRE15] lower bound by using two techniques that are outside of its linear-garbling model. We refer to the techniques collectively as slicing-and-dicing.

  • Slicing: In our construction the evaluator slices wire labels into halves, and uses (possibly different!) linear combinations to compute each half. We stress that this does not halve the security—the hash H is still given the whole wire label with \(\kappa \) bits of entropy. To the best of our knowledge, this technique is novel in garbled circuits. As we demonstrate in detail later, introducing more linear combinations for the evaluator increases the linear-algebraic dimension in which the scheme operates, in a way that lets us exploit more linear-algebraic structures that prior schemes could not exploit.

  • Dicing: The evaluator first decrypts a constant-size ciphertext containing “control bits”, which determine the linear combinations (of input label [halves], gate ciphertexts, and H-outputs) he/she will use to compute the output label [halves]. The control bits are chosen randomly by the garbler (i.e., by tossing “dice”) in a particular way. Randomized control bits are outside of the linear garbling model, which requires the evaluator’s linear combinations to be fixed. This technique first appeared in [KKS16].

We also describe a variant of our scheme that can garble any kind of gate (e.g., XOR gates, even constant-output gates) for \(1.5\kappa + 10\) bits, in a way that hides the gate’s truth table from the evaluator. This improves on the state of the art for gate-hiding garbling, due to Rosulek [Ros17], in which each gate is garbled for \(2\kappa + 8\) bits, and constant-output gates are not supported. Additionally, our gate-hiding construction is fully compatible with free-XOR, meaning that the circuit can contain both “public” XOR gates (evaluator knows that this gate is an XOR) and “private” XOR gates (only the garbler knows that this gate is an XOR), with the public ones being free.

1.2 Related Work

The garbled circuits technique was first introduced by Yao [Yao82], although the first complete description and security proof for Yao’s protocol was given much later [LP09]. Bellare, Hoang, and Rogaway [BHR12] promoted garbled circuits from a technique to a well-defined cryptographic primitive with standardized security properties, which they dubbed a garbling scheme. In this work, we use their framework to formally express our schemes and prove security.

The garbling scheme formalization captures many techniques, but in this work we focus on “practical” GC techniques built from symmetric-key tools (PRFs, hash functions, but not homomorphic encryption or obfuscation). In the realm of practical garbling, there have been many quantitative and qualitative improvements over the years, especially focused on reducing the size of garbled circuits. These works are showcased in Fig. 1. Of particular note are the Free-XOR technique of Kolesnikov and Schneider [KS08] and the half-gates construction [ZRE15], mentioned above. Free-XOR allows XOR gates in the circuit to be garbled with no communication, and our construction inherits this technique to achieve the same feature. The free-XOR technique requires a cryptographic hash with a property called circular correlation-resistance [CKKZ12]. As mentioned above, the half-gates paper introduced a lower bound for garbling, which several works have bypassed in some limited manner. We discuss the lower bound and these related works in more detail in Sect. 7.

Several garbling schemes are tailored to support both AND and XOR gates while hiding the type of gate from the evaluator [KKS16, WmM17, Ros17]. These works are compared in Fig. 2. They differ in the exact class of boolean gates they can support—all gates, all symmetric gates (satisfying \(g(0,1)=g(1,0)\)), or all non-constant gates.

Fig. 1. Comparison of efficient garbling schemes. Gate size ignores small constant additive term (i.e., “2” means \(2\kappa +O(1)\) bits per gate). CCR = circular correlation robust hash function.

Fig. 2. Comparison of gate-hiding garbling schemes, where the garbled circuit leaks only the topology of the circuit and not the type of each gate. Gate size ignores small constant additive term (i.e., “2” means \(2\kappa +O(1)\) bits per gate). CCR = circular correlation robust hash function. “Symmetric” means all gates g with \(g(0,1)=g(1,0)\). “Non-const” means all gates g except \(g(a,b)=0\) and \(g(a,b)=1\).

2 Preliminaries

2.1 Circuits

We represent a circuit \(f = (\mathsf {inputs}, \mathsf {outputs}, \mathsf {in}, \mathsf {leak}, \mathsf {eval})\) by choosing a topological order of the inputs and gates in the circuit. Let \(\mathsf {inputs} \) be the number of inputs in the circuit, which we require to come first in the ordering. Each gate is then labeled by its index in the order. For every gate index g in the circuit, its two input indices\(^{2}\) are \(\mathsf {in} _1(g)\) and \(\mathsf {in} _2(g)\), where \(\mathsf {in} _i(g) < g\). Each gate can be evaluated using a function \(\mathsf {eval} (g) :\{0,1\}^2 \rightarrow \{0,1\}\). Finally, \(\mathsf {outputs} \) is a subset of the indices.

Garbling hides only partial information about the circuit. What is revealed is contained in the “leakage function” \(\varPhi (f)\). Sometimes two gates in a circuit may both be, e.g., XOR gates, but one will publicly be XOR while the operation performed by the other gate will be hidden. To support this, each gate is associated with some leakage \(\mathsf {leak} (g)\). Gates with different leakages may compute the same function, but have different rules about how much information is revealed. We then define \(\varPhi (f)\) to be \((\mathsf {inputs}, \mathsf {outputs}, \mathsf {in}, \mathsf {leak})\), containing the circuit topology and partial information about the gates’ truth tables.

2.2 Garbling Schemes

We use a slightly modified version of the garbling definitions of [BHR12].

Definition 1

A garbling scheme consists of four algorithms:

  • \((F, e, d) \leftarrow \mathsf {Garble} (1^\kappa , f)\).

  • \(X := \mathsf {Encode} (e, x)\). (deterministic)

  • \(Y := \mathsf {Eval} (F, X)\). (deterministic)

  • \(y := \mathsf {Decode} (d, Y)\). (deterministic)

such that the following conditions hold.

  • Correctness: For any circuit f and input x, if \((F, e, d) \leftarrow \mathsf {Garble} (1^\kappa , f)\) then \(f(x) = \mathsf {Decode} (d, \mathsf {Eval} (F, \mathsf {Encode} (e, x)))\) holds with all but negligible probability.

  • Privacy with respect to leakage \(\varPhi \): There must be a simulator \(\mathcal {S}\) such that for any circuit f and input x the following distributions are indistinguishable.

    figure a
  • Obliviousness w.r.t. leakage \(\varPhi \): There must be a simulator \(\mathcal {S}\) such that for any circuit f and input x the following distributions are indistinguishable.

    figure b
  • Authenticity: For any circuit f and input x, no PPT adversary \(\mathcal {A} \) can make the following distribution output \(\textsc {true} \) with non-negligible probability.

    figure c

The definitions differ from [BHR12] in two ways. First, we change correctness to allow a negligible failure probability.\(^{3}\) Second, we strengthen the authenticity property by giving d to the adversary. This stronger property is easy to achieve by simply changing what one takes as garbled output Y.
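The three experiments referred to in Definition 1 are not reproduced in this text. As a hedged sketch only, the standard [BHR12]-style formulations, adapted so that the authenticity adversary also receives d, would read roughly as follows. The exact presentation, and in particular the winning condition for authenticity, is an assumption rather than a quotation.

$$\begin{aligned} \text {Privacy:}\quad&\bigl \{ (F,e,d) \leftarrow \mathsf {Garble} (1^\kappa , f);\ X := \mathsf {Encode} (e,x);\ \text {return } (F,X,d) \bigr \} \;\approx \; \bigl \{ \text {return } \mathcal {S}(1^\kappa , \varPhi (f), f(x)) \bigr \} \\ \text {Obliviousness:}\quad&\bigl \{ (F,e,d) \leftarrow \mathsf {Garble} (1^\kappa , f);\ X := \mathsf {Encode} (e,x);\ \text {return } (F,X) \bigr \} \;\approx \; \bigl \{ \text {return } \mathcal {S}(1^\kappa , \varPhi (f)) \bigr \} \\ \text {Authenticity:}\quad&(F,e,d) \leftarrow \mathsf {Garble} (1^\kappa , f);\ X := \mathsf {Encode} (e,x);\ Y' \leftarrow \mathcal {A} (F,X,d);\ \text {return } Y' \ne \mathsf {Eval} (F,X) \wedge \mathsf {Decode} (d,Y') \ne \bot \end{aligned}$$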

2.3 Circular Correlation Robust Hashes

Our construction requires a hash function H with a property called circular correlation robustness (CCR). A comprehensive treatment of this property is presented in [CKKZ12, GKWY20].

The relevant definition of [GKWY20] is tweakable CCR (TCCR). For a hash function H, define a related oracle \(\mathcal O_{\varDelta }( X,\tau , b) = H(X \oplus \varDelta , \tau ) \oplus b\varDelta \). Then H is a TCCR if \(\mathcal O_\varDelta \) is indistinguishable from a random oracle, provided that the distinguisher never repeats a \((X,\tau )\) pair in calls to the oracle.

We modify their definition in several important ways:

  • We require H to have different input and output lengths. In the original definition, the adversary used the argument \(b \in \{0,1\}\) to determine whether \(\varDelta \) was XOR’ed with the output of H. We generalize so that the adversary can choose a linear function of (the bits of) \(\varDelta \) that will be XOR’ed with the output of H. Our construction ultimately needs only 4 linear functions reflecting our slicing of wire labels in half: \(L_{a,b}(\varDelta _L \Vert \varDelta _R) = a\varDelta _L \oplus b\varDelta _R\), for \(a,b \in \{0,1\}\).

  •  [GKWY20] observe that a “full” TCCR is stronger than what is needed for garbled circuits. In order to construct a TCCR that uses only one call to an ideal permutation, they prove TCCR security against adversaries that query only on “naturally derived” keys. It is somewhat cumbersome to generalize “naturally derived” keys to our setting, where the values are sliced into pieces. We instead relax TCCR so that H is drawn from a family of hashes, and the adversary only receives the description of H after making all of its oracle queries. This relaxation suffices for garbled circuits (the garbler chooses H and reveals it only in the garbled circuit description, after all queries to H have been made), and simplifies both our definition and our proof.
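The generalized oracle described in the two modifications above can be sketched in a few lines. This is an illustration only: `toy_H` is a hypothetical stand-in built from truncated SHA-256, not the AES-based instantiation discussed in this section, and the names `make_oracle` and `L` are assumptions introduced here.

```python
import hashlib
import secrets

KAPPA = 128          # wire-label length in bits
HALF = KAPPA // 2    # output length of H in bits


def toy_H(x: int, tweak: int) -> int:
    """Stand-in for H: SHA-256 truncated to kappa/2 bits (NOT the paper's H)."""
    data = x.to_bytes(KAPPA // 8, "big") + tweak.to_bytes(8, "big")
    return int.from_bytes(hashlib.sha256(data).digest()[: HALF // 8], "big")


def L(a: int, b: int, delta: int) -> int:
    """L_{a,b}(Delta_L || Delta_R) = a*Delta_L xor b*Delta_R, for bits a, b."""
    delta_l, delta_r = delta >> HALF, delta & ((1 << HALF) - 1)
    return (a * delta_l) ^ (b * delta_r)


def make_oracle(delta: int):
    """Oracle (X, tau, a, b) -> H(X xor Delta, tau) xor L_{a,b}(Delta)."""
    seen = set()

    def oracle(x: int, tweak: int, a: int, b: int) -> int:
        if (x, tweak) in seen:
            raise ValueError("repeated (X, tau) query is not allowed")
        seen.add((x, tweak))
        return toy_H(x ^ delta, tweak) ^ L(a, b, delta)

    return oracle


delta = secrets.randbits(KAPPA)
oracle = make_oracle(delta)
x = secrets.randbits(KAPPA)
y = oracle(x, 0, 1, 1)   # a kappa/2-bit value masked by Delta_L xor Delta_R
```

The `seen` set enforces the restriction that the distinguisher never repeats an \((X,\tau )\) pair; the security notion itself is not something this sketch can demonstrate.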

Definition 2

A family of hash functions \(\mathcal {H}\), where each \(H \in \mathcal {H}\) maps \(\{0,1\}^n \times \mathcal {T} \rightarrow \{0,1\}^m\) for some set of tweaks \(\mathcal {T}\), is randomized tweakable circular correlation robust (RTCCR) for a set of linear functions \(\mathcal {L}\) from \(\{0,1\}^n\) to \(\{0,1\}^m\) if, for any PPTs \(\mathcal {A} _1, \mathcal {A} _2\) that never repeat an oracle query to \(\mathcal {O}_{H,\varDelta }\) on the same \((X, \tau )\),

is negligible, where R is a random oracle and \(\mathcal {O}_{H,\varDelta }\) is defined as

figure d
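The displayed advantage and the oracle definition are missing from this text. Based on the surrounding description (the adversary \(\mathcal {A} _1\) queries the oracle, \(\mathcal {A} _2\) then receives the description of H, and the oracle adds an adversarially chosen linear function of \(\varDelta \)), they can be reconstructed roughly as below. The sampling of \(\varDelta \) is left unspecified here because the text does not state it.

$$ \Bigl | \Pr _{H \leftarrow \mathcal {H},\, \varDelta }\bigl [ \mathcal {A} _2\bigl (\mathcal {A} _1^{\mathcal {O}_{H,\varDelta }}(1^\kappa ), H\bigr ) = 1 \bigr ] - \Pr _{H \leftarrow \mathcal {H},\, R}\bigl [ \mathcal {A} _2\bigl (\mathcal {A} _1^{R}(1^\kappa ), H\bigr ) = 1 \bigr ] \Bigr | $$

$$ \mathcal {O}_{H,\varDelta }(X, \tau , L) = H(X \oplus \varDelta , \tau ) \oplus L(\varDelta ), \qquad L \in \mathcal {L} $$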

In the full version we show that if \(F_k(X)\) is both a (plain) CCR hash for \(\mathcal {L}\) when k is fixed and a PRF when k is random, and \(\{ (X,\tau ) \mapsto X \oplus U(\tau ) \mid U \in \mathcal {U} \}\) is a universal hash family,\(^{4}\) then \(\bigl \{(X,\tau ) \mapsto F_k(X \oplus U(\tau )) \mid k \in \{0,1\}^\kappa , U \in \mathcal {U}\bigr \}\) is a secure RTCCR hash family for \(\mathcal {L}\).

For our recommended instantiation, let \(\sigma \) be a simple function of the form \(\sigma (X_L \Vert X_R) = \alpha X_L \Vert \alpha X_R\), where \(\alpha \) is any fixed element in \(GF(2^{\kappa /2}) \setminus GF(2^2)\). Then \(\textsf {AES}_k(X) \oplus \sigma (X)\) is both a PRF for random k, and a CCR for any fixed k (modelling \(\textsf {AES}_k\) as an ideal permutation). Hence we get an RTCCR of the form:

$$ (X, \tau ) \mapsto \textsf {AES}_k\Bigl ( X \oplus U(\tau ) \Bigr ) \oplus \sigma (X \oplus U(\tau )) $$

U can likewise be a simple function, e.g., when \(|\tau | \le \kappa /2\) then we can use \(U(\tau ) = u_1 \tau \Vert u_2 \tau \) where \(u_1,u_2\) are random elements of \(GF(2^{\kappa /2})\).

3 A Linear-Algebraic View of Garbling Schemes

In this section we present a linear-algebraic perspective of garbling schemes, which is necessary to understand our construction and its novelty. This perspective is inspired by the presentation of Rosulek [Ros17], where the evaluator’s behavior (in each of the 4 different gate-input combinations) defines a set of linear equations that the garbler must satisfy, and we rearrange those equations to isolate the values that are outside of the garbler’s control.

3.1 The Basic Linear Perspective

Throughout this section, we consider an AND gate whose input wires have labels \((A_0,A_1)\) and \((B_0, B_1)\). We will always consider the free-XOR setting [KS08], where all wires have labels that xor to a common global \(\varDelta \); i.e., \(A_0 \oplus A_1 = B_0\oplus B_1 = \varDelta \). Our view of garbling will always start with the circuit evaluator’s perspective; hence we consider the subscripts to be public. In other words, if the evaluator holds \(A_i\), then he knows the value i. In some works these subscripts are called “color bits” or “permute bits.” The garbler secretly knows which of \(\{A_0,A_1\}\) represents true and which of \(\{B_0,B_1\}\) represents true.

Let’s take an example of a textbook Yao garbled gate, using the point-permute technique. The garbled gate consists of 4 ciphertexts \(G_{00},\ldots , G_{11}\). When the evaluator has input labels \(A_i, B_j\), he computes the output label by decrypting the (i, j)’th ciphertext, as \(H(A_i,B_j) \oplus G_{ij}\).\(^{5}\) In order to correspond to an AND gate, this evaluation expression must result in some label C (which could be either \(C_0\) or \(C_1\)) representing false in 3 cases and \(C \oplus \varDelta \) (representing true) in the other. Suppose \((A_1,B_0)\) is the case corresponding to inputs (true, true). Then the garbler needs to arrange for:

$$\begin{aligned} C&= H(A_0,B_0) \oplus G_{00}&\qquad C \oplus \varDelta&= H(A_1,B_0) \oplus G_{10}\\ C&= H(A_0,B_1) \oplus G_{01}&\qquad C&= H(A_1,B_1) \oplus G_{11} \end{aligned}$$

We can rearrange these equations as follows:

$$ \begin{bmatrix} 1 &{} 1 &{} 0 &{} 0 &{} 0 \\ 1 &{} 0 &{} 1 &{} 0 &{} 0 \\ 1 &{} 0 &{} 0 &{} 1 &{} 0 \\ 1 &{} 0 &{} 0 &{} 0 &{} 1 \end{bmatrix} \quad \begin{bmatrix} C \\ G_{00} \\ G_{01} \\ G_{10} \\ G_{11} \end{bmatrix} \quad = \quad \begin{bmatrix} 1 &{} 0 &{} 0 &{} 0 \\ 0 &{} 1 &{} 0 &{} 0 \\ 0 &{} 0 &{} 1 &{} 0 \\ 0 &{} 0 &{} 0 &{} 1 \end{bmatrix} \quad \begin{bmatrix} H(A_0,B_0) \\ H(A_0,B_1) \\ H(A_1,B_0) \\ H(A_1,B_1) \end{bmatrix} \oplus \underbrace{ \begin{bmatrix} 0 \\ 0 \\ 1 \\ 0 \end{bmatrix} }_{t} \varDelta $$

In this equation, values that the garbler cannot control are on the right, and the results of the garbling process (gate ciphertexts and output labels) are on the left. The vector marked t is the truth table of the gate (when inputs are ordered by color bits), and known only to the garbler.

In order for the scheme to work, for all possible values on the right-hand side (including all choices of secret t!) the garbler must be able to solve for the variables on the left-hand side. In this case the left-hand side is under-determined so solving is easy. The garbler can simply choose random C and move it to the right-hand side. Then the matrix remaining on the left-hand side is an invertible identity matrix. Multiplying by the inverse solves for the desired values. Clearly this can be done for any t, meaning that this approach works to garble any gate (not just AND gates).
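The solving procedure just described (pick a random C, then read off each \(G_{ij}\)) can be sketched as follows. `toy_H` is a hypothetical stand-in for the two-input hash, built from truncated SHA-256; the function names are illustrative, not the paper's.

```python
import hashlib
import secrets

KAPPA = 128


def toy_H(a: int, b: int) -> int:
    """Stand-in two-input hash with kappa-bit output (truncated SHA-256)."""
    data = a.to_bytes(KAPPA // 8, "big") + b.to_bytes(KAPPA // 8, "big")
    return int.from_bytes(hashlib.sha256(data).digest()[: KAPPA // 8], "big")


def garble_gate(A0: int, B0: int, delta: int, t: dict):
    """t[(i, j)] = 1 iff colour bits (i, j) correspond to an output of true."""
    A = (A0, A0 ^ delta)
    B = (B0, B0 ^ delta)
    C = secrets.randbits(KAPPA)  # free variable, moved to the right-hand side
    # Row (i, j) of the system: C xor G_ij = H(A_i, B_j) xor t_ij * Delta.
    G = {(i, j): toy_H(A[i], B[j]) ^ C ^ (delta if t[(i, j)] else 0)
         for i in (0, 1) for j in (0, 1)}
    return C, G


def eval_gate(Ai: int, Bj: int, i: int, j: int, G: dict) -> int:
    """Evaluator decrypts the (i, j)'th ciphertext."""
    return toy_H(Ai, Bj) ^ G[(i, j)]


delta = secrets.randbits(KAPPA)
A0, B0 = secrets.randbits(KAPPA), secrets.randbits(KAPPA)
t = {(0, 0): 0, (0, 1): 0, (1, 0): 1, (1, 1): 0}  # (A_1, B_0) is (true, true)
C, G = garble_gate(A0, B0, delta, t)
```

Because the remaining left-hand matrix is the identity, nothing in `garble_gate` depends on t having a single 1, which matches the observation that this approach garbles any gate.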

3.2 Row-Reduction Techniques

Row reduction refers to any technique to reduce the size of the garbled gate below 4 ciphertexts. The simplest method works by removing the ciphertext \(G_{00}\), and simply having the evaluator take \(H(A_0,B_0)\) as the output label when he has inputs \(A_0,B_0\).

figure e

The matrix on the left is now a square matrix, and invertible. Thus for any choice of t, the garbler can solve for C and the \(G_{ij}\) values by multiplying by the inverse matrix.
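The system for this simplest row-reduction method is not reproduced in this text. Following the pattern of Sect. 3.1 with \(G_{00}\) removed, it can be written out as below; the layout is a reconstruction from the description, not a copy of the original figure.

$$ \begin{bmatrix} 1 &{} 0 &{} 0 &{} 0 \\ 1 &{} 1 &{} 0 &{} 0 \\ 1 &{} 0 &{} 1 &{} 0 \\ 1 &{} 0 &{} 0 &{} 1 \end{bmatrix} \quad \begin{bmatrix} C \\ G_{01} \\ G_{10} \\ G_{11} \end{bmatrix} \quad = \quad \begin{bmatrix} H(A_0,B_0) \\ H(A_0,B_1) \\ H(A_1,B_0) \\ H(A_1,B_1) \end{bmatrix} \oplus t \varDelta $$

Over GF(2) the matrix on the left is its own inverse, so the solution is \(C = H(A_0,B_0) \oplus t_{00}\varDelta \) and \(G_{ij} = H(A_i,B_j) \oplus t_{ij}\varDelta \oplus C\) for the three remaining ciphertexts.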

3.3 Half-Gates

The previous example shows that decreasing the size of the garbled gate from 4 to 3 causes the matrix on the left to change from size \(4\times 5\) to \(4 \times 4\). Reducing the garbled gate further (from 3 ciphertexts to 2) would cause the matrix to be \(4 \times 3\), and the system of linear equations would be overdetermined! So how does the half-gates garbling scheme [ZRE15] actually achieve a 2-ciphertext AND gate?

Let us recall the gate-evaluation algorithm for the half-gates scheme, which is considerably different from all previous schemes. On inputs \(A_i, B_j\) the evaluator computes the output label as \(H(A_i) \oplus H(B_j) \oplus i\cdot G_0 \oplus j ( G_1 \oplus A_i)\), where \(G_0, G_1\) are the two gate ciphertexts.

Suppose as before that \(A_1\) and \(B_0\) correspond to true. Then the garbler must arrange for the following to be true:

Rearranging in our usual way, we get:

figure f
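The two displays referred to above are missing from this text. They can be re-derived from the evaluation rule \(H(A_i) \oplus H(B_j) \oplus i\cdot G_0 \oplus j ( G_1 \oplus A_i)\) with \((A_1, B_0)\) as the (true, true) case. The matrix layout below is a reconstruction and may differ cosmetically from the original figure.

$$\begin{aligned} C&= H(A_0) \oplus H(B_0)&\qquad C \oplus \varDelta&= H(A_1) \oplus H(B_0) \oplus G_0 \\ C&= H(A_0) \oplus H(B_1) \oplus G_1 \oplus A_0&\qquad C&= H(A_1) \oplus H(B_1) \oplus G_0 \oplus G_1 \oplus A_1 \end{aligned}$$

Substituting \(A_1 = A_0 \oplus \varDelta \) and moving the garbler's unknowns to the left gives:

$$ \begin{bmatrix} 1 &{} 0 &{} 0 \\ 1 &{} 0 &{} 1 \\ 1 &{} 1 &{} 0 \\ 1 &{} 1 &{} 1 \end{bmatrix} \begin{bmatrix} C \\ G_0 \\ G_1 \end{bmatrix} = \begin{bmatrix} 1 &{} 0 &{} 1 &{} 0 \\ 1 &{} 0 &{} 0 &{} 1 \\ 0 &{} 1 &{} 1 &{} 0 \\ 0 &{} 1 &{} 0 &{} 1 \end{bmatrix} \begin{bmatrix} H(A_0) \\ H(A_1) \\ H(B_0) \\ H(B_1) \end{bmatrix} \oplus \begin{bmatrix} 0 \\ 1 \\ 0 \\ 1 \end{bmatrix} A_0 \oplus \left( \begin{bmatrix} 0 \\ 0 \\ 0 \\ 1 \end{bmatrix} \oplus \underbrace{ \begin{bmatrix} 0 \\ 0 \\ 1 \\ 0 \end{bmatrix} }_{t} \right) \varDelta $$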

Note that \(\varDelta \) is used both in the truth table adjustment (t) and in the usual operations of the evaluator (implicitly, in the one case where he includes \(A_1 = A_0 \oplus \varDelta \) in the linear combination).

As promised, the matrix on the left is only \(4\times 3\). We cannot solve for the left-hand side by inverting this matrix as in the previous cases. Instead, the garbler takes advantage of the fact that the matrices on both sides have the same column space. Specifically, the columns on the left span the space of all even-parity vectors. For any choice of t containing just a single 1 (corresponding to the truth table of an AND gate), every column on the right also has even parity! Concretely, if the garbler solves the first three rows of this system of linear equations (which is possible since the first three rows on the left form an invertible matrix), then the fourth row holds automatically, since on both sides it is the sum of the first 3 rows.\(^{6}\) One can see that this technique works only for gates whose truth table has odd parity (e.g., AND gates).
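The column-space argument can be checked mechanically over GF(2). In the sketch below, the matrices are read off from the half-gates evaluation rule stated earlier in this section (rows ordered (0,0), (0,1), (1,0), (1,1)); that encoding, and all names, are assumptions made for illustration.

```python
def gf2_rank(rows):
    """Rank over GF(2) of a matrix given as a list of 0/1 rows."""
    vecs = [int("".join(map(str, r)), 2) for r in rows]
    rank = 0
    while vecs:
        pivot = vecs.pop()
        if pivot == 0:
            continue
        rank += 1
        low = pivot & -pivot  # lowest set bit of the pivot row
        vecs = [v ^ pivot if v & low else v for v in vecs]
    return rank


LEFT = [[1, 0, 0], [1, 0, 1], [1, 1, 0], [1, 1, 1]]                # C, G0, G1
H_PART = [[1, 0, 1, 0], [1, 0, 0, 1], [0, 1, 1, 0], [0, 1, 0, 1]]  # H(A0), H(A1), H(B0), H(B1)
A0_COL = [0, 1, 0, 1]         # rows where the evaluator adds its A label
DELTA_FROM_A1 = [0, 0, 0, 1]  # row where that label is A1 = A0 xor Delta


def right_matrix(t):
    """Right-hand side columns: H outputs, A0, and the combined Delta column."""
    return [H_PART[r] + [A0_COL[r], DELTA_FROM_A1[r] ^ t[r]] for r in range(4)]


def same_column_space(left, right):
    """Column spaces coincide iff rank(left) = rank(right) = rank([left | right])."""
    joined = [l + r for l, r in zip(left, right)]
    return gf2_rank(left) == gf2_rank(right) == gf2_rank(joined)


for pos in range(4):
    t = [int(r == pos) for r in range(4)]
    print(t, same_column_space(LEFT, right_matrix(t)))
print([0, 1, 1, 0], same_column_space(LEFT, right_matrix([0, 1, 1, 0])))
```

Every truth table with a single 1 passes the check, while the even-parity table `[0, 1, 1, 0]` does not, in line with the remark that the technique needs odd parity.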

Half-gates was the first garbling scheme to structure its oracle queries as \(H(A_i)\) and \(H(B_j)\), instead of \(H(A_i,B_j)\). Our linear-algebraic perspective highlights the importance of this change. For a 2-ciphertext AND gate, the matrix on the left will be \(4\times 3\), so the matrix on the right must have rank 3. An expression like \(H(A_i,B_j)\) can be used by the evaluator in only one combination of inputs, leading to an identity matrix minor that has rank 4. By contrast, each \(H(A_i)\) and \(H(B_j)\) term is used for two input combinations, so the corresponding matrix can have rank 3.

Our linear algebraic perspective confirms and provides an explanation for a prior finding of Carmer and Rosulek [CR16]. They used a SAT solver to show that no garbling scheme (in the linear model of the half-gates paper) could achieve a 2-ciphertext AND gate, when the evaluator makes only one query to H. This reiterates the importance of half gates using H(A), H(B) oracle queries to achieve a 2-ciphertext AND gate.

4 High-Level Overview of Our Scheme

In the previous section, we saw that it was important that the evaluator used oracle queries like \(H(A_i)\) and \(H(B_j)\) in the half-gates scheme. For every term of the form \(H(A_i)\) there are two gate-input combinations in which the evaluator uses this term. This property led to a desirable redundancy in the matrix that relates H-queries to input combinations. Redundancies in this matrix lead to smaller garbled gates. We push this idea further using several key observations.

4.1 Observation #1: Get the Most Out of the Oracle Queries

\(H(A_i)\) and \(H(B_j)\) are not the only oracle queries that can be made in two different gate-input combinations. We can also ask the evaluator to query \(H( A_i \oplus B_j)\). Because of the free-XOR constraint, \(A_0 \oplus B_0 = A_1 \oplus B_1\), and \(A_0\oplus B_1 = A_1 \oplus B_0\). This means that the following oracle queries can be made for each gate-input combination:

figure g
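The table of available queries can be regenerated from the free-XOR relation alone. The sketch below enumerates, for each gate-input combination, the three inputs to H that the evaluator can form; the variable and function names are illustrative.

```python
import secrets

KAPPA = 128
delta = secrets.randbits(KAPPA)
A0, B0 = secrets.randbits(KAPPA), secrets.randbits(KAPPA)
A = (A0, A0 ^ delta)  # free-XOR: A_1 = A_0 xor Delta
B = (B0, B0 ^ delta)  # free-XOR: B_1 = B_0 xor Delta


def queries(i: int, j: int):
    """Inputs to H available to an evaluator holding A_i and B_j."""
    return (A[i], B[j], A[i] ^ B[j])


def names(i: int, j: int):
    """Symbolic names; A_i xor B_j equals A_0 xor B_(i xor j) under free-XOR."""
    return (f"H(A{i})", f"H(B{j})", f"H(A0^B{i ^ j})")


for i in (0, 1):
    for j in (0, 1):
        print((i, j), names(i, j))
```

Each of the six distinct queries appears in exactly two of the four rows, which is the redundancy the next observation builds on.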

Can we use queries of this form to introduce even more redundancy in the relevant matrices?

4.2 Observation #2: Increase Dimension by Slicing Wire Labels

Our linear-algebraic perspective of garbling includes only 4 linear equations, corresponding to the 4 different gate-inputs. Having only 4 linear equations makes it difficult to take advantage of any new structure introduced by observation #1. Our second observation, and perhaps the key to our entire approach, is to split each wire label into a left and right half, and let the evaluator compute the two halves (of the output label) with different linear combinations. This results in 8 linear equations in our linear-algebraic perspective—2 equations for each of the 4 gate-input combinations.

Consider the following proposal,

figure h

For example, on gate-input (0,0) the evaluator will compute the left half of the output label as \(H(A_0) \oplus H(A_0\oplus B_0) \oplus \cdots \) (plus other terms, involving gate ciphertexts and input labels). There are several important features of this table to note:

  • \(H(\cdot )\) is used in a linear equation to compute half of an output label, therefore \(H(\cdot )\) is a function with \(\kappa /2\) bits of output. Three of these half-sized hash functions are combined to encrypt the gate output.\(^{7}\) However, we still will use the entire input wire labels as input to H; using wire-label halves as input to H would cut the effective security parameter in half.

  • For an evaluator with gate-input (0,0), the values \(H(A_1)\), \(H(B_1)\), and \(H(A_0\oplus B_1)\) are all jointly indistinguishable from random. With that in mind, consider the linear combinations for any other gate-input. For example, in the (1,0) case the evaluator will compute the output as

    $$\begin{aligned} \text{ left }&= H(A_1) \oplus H(A_0 \oplus B_1) \oplus \cdots \\ \text{ right }&= H(B_0) \oplus H(A_0 \oplus B_1) \oplus \cdots \end{aligned}$$

    Because \(H(A_1)\) and \(H(A_0\oplus B_1)\) are pseudorandom, this makes both of these outputs jointly pseudorandom. The entire output of the (1,0) case is pseudorandom from the perspective of the evaluator in the (0,0) case. This is a necessary condition, since sometimes the (0,0) and (1,0) cases give different outputs. This pattern holds with respect to any pair of two gate-inputs.

  • If we interpret Eq. 2 as a matrix (\(\checkmark \)=1, empty cell=0), we see that it has rank 5. This suggests that the garbling process can result in only 5 output values, where in this case each of these values is \(\kappa /2\) bits. Two of the values are the halves of the output wire label C, leaving 3 values to comprise the garbled gate ciphertexts. In other words, we are on our way to a garbled gate with only \(3\kappa /2\) bits, if only we can get all of the relevant linear equations to cooperate.
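The rank claim in the last bullet can be verified directly. The table of Eq. 2 is not reproduced in this text, so the sketch below assumes the pattern suggested by the two worked cases above: the left half uses \(H(A_i)\) and \(H(A_i \oplus B_j)\), and the right half uses \(H(B_j)\) and \(H(A_i \oplus B_j)\). All names are illustrative.

```python
COLS = ["A0", "A1", "B0", "B1", "A0^B0", "A0^B1"]  # arguments of H, one per column


def gf2_rank(rows):
    """Rank over GF(2) of a matrix given as a list of 0/1 rows."""
    vecs = [int("".join(map(str, r)), 2) for r in rows]
    rank = 0
    while vecs:
        pivot = vecs.pop()
        if pivot == 0:
            continue
        rank += 1
        low = pivot & -pivot  # lowest set bit of the pivot row
        vecs = [v ^ pivot if v & low else v for v in vecs]
    return rank


def query_matrix():
    """8x6 matrix: a (left, right) pair of rows per gate-input (i, j)."""
    rows = []
    for i in (0, 1):
        for j in (0, 1):
            a, b, ab = f"A{i}", f"B{j}", f"A0^B{i ^ j}"
            rows.append([int(c in (a, ab)) for c in COLS])  # left half
            rows.append([int(c in (b, ab)) for c in COLS])  # right half
    return rows


M = query_matrix()
print("rank over GF(2):", gf2_rank(M))
```

Under this assumed pattern the six columns sum to zero and any five of them are independent, so the rank is 5 rather than 6.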

4.3 Observation #3: Randomize and Hide the Evaluator’s Coefficients

Let us apply our observations so far to our linear perspective of Sect. 3. Since wire labels are divided into halves, we use notation like \(A_{0R}\) to denote the right half of \(A_0\). Note that the free-XOR constraint applies independently to the wire label halves; i.e., \(A_{1R} = A_{0R} \oplus \varDelta _R\) and so on.

The evaluator computes each half of the output label separately, using a linear combination of available information: oracle responses, gate ciphertexts, and the 4 (!) halves of the input labels. If we account for all 8 of the evaluator’s linear equations, while using the oracle-query structure suggested in Eq. 2, we obtain the following system:

(3)

The first row represents the evaluator’s linear equation to compute the left half \(C_L\) of the output label on input \(A_0, B_0\), etc. Note that the truth table t now consists of \(2\times 2\) identity blocks and \(2\times 2\) zero-blocks.

For everything to work correctly, we need to replace the “?” entries, so that for every choice of t, the matrices on both sides have the same column space.

  • The columns on the right-hand side (representing the H outputs) already span a space of dimension 5, so there is no choice but to extend the left-hand side matrix to a basis of that space.

  • The “?” entries on the right are subject to other constraints, so that they reflect what an evaluator can actually do in each input combination. For example, on input \(A_0,B_1\), the evaluator cannot include \(B_{0R}\) in its linear combination, it can only include \(B_{1R} = B_{0R} \oplus \varDelta _R\). Note that the matrix is written in terms of \(B_0\) only.

Unfortunately, it is not possible to complete the right-hand-side matrix subject to these constraints. For every t, there is a valid way to replace the “?” entries, but there is no one way that works for all t.

To get around this problem, we randomize and encrypt the entries of the matrix. To the best of our knowledge, the technique first appeared in the garbling scheme of [KKS16], and was also used in [WmM17, Ros17]. The garbler will complete the matrices so that the system of equations can be solved (i.e., the column spaces coincide). This causes the matrix entries to now depend on the garbler’s secret t. Next, the garbler will encrypt these matrix entries, so that when the evaluator has input \(A_i, B_j\), he can decrypt only those matrix entries needed for that particular input combination—not the entire matrix. For example, the evaluator can use \(A_0, B_0\) to decrypt the top two rows of the matrix—just enough to determine the coefficients of the linear combinations computing the output label. Unlike other schemes, there is a step of indirection (decrypting this additional ciphertext) before the evaluator determines which linear combinations to apply—the linear combination does not depend solely on the color bits of the input labels. We call the contents of these ciphertexts control bits, which tell the evaluator what linear combination to apply. The control bits are of small constant size, so encrypting them adds only a constant number of bits to the garbling scheme.

The garbler completes the missing entries in the matrix by drawing them randomly from a distribution over matrices. The distribution depends on t, as we mentioned; however, it can be arranged that each marginal view of the matrix is independent of t. Since the evaluator sees only such a marginal view, not the entire matrix, the value of t is hidden.

5 Details: Slicing and Dicing

5.1 Choosing the Matrices

Let us begin by filling out the question marks in Eq. 3. We rewrite this equation using block matrices, and we group related parts together.

$$\begin{aligned} V\begin{bmatrix} C \\ \vec {G}\end{bmatrix} = M\vec {H}\oplus \left( R\oplus [0 \cdots 0 |t]\right) \begin{bmatrix} A_0 \\ B_0 \\ \varDelta \end{bmatrix} \end{aligned}$$
(4)

Here C, \(A_0\), \(B_0\), and \(\varDelta \) are two-element (column) vectors representing the two halves of these wire labels; \(\vec {G}\) is the vector of gate ciphertexts; and \(\vec {H}= \bigl [H(A_0) \; H(A_1) \; H(B_0) \; H(B_1) \; H(A_0 \oplus B_0) \; H(A_0 \oplus B_1)\bigr ]^{\top }\) is the vector of H-outputs. t is the \(8 \times 2\) truth table matrix, which contains a \(2 \times 2\) identity matrix block for each case of the gate that should output true. We have already filled out \(M\)—it is the portion of the right-hand side matrix in Eq. 3 with no question marks, that operates on the hash outputs \(\vec {H}\). \(R\) is called the control matrix because it determines which pieces of input labels are added to the output.

Choosing \(V\). Recall that the matrices on both sides of the equation must have the same column space, and that \(M\) already spans this 5-dimensional space. Call this common column space the gate space \(\mathcal {G}\). Then

$$\begin{aligned} \mathcal {G}= \text {colspace}\,(V) = \text {colspace}\,(M) \supseteq \text {colspace}\,\bigl (R\oplus [0 \cdots 0|t]\bigr ) . \end{aligned}$$

It will be more convenient to represent \(\mathcal {G}\) using linear constraints, rather than as the span of the columns of \(M\). We use a matrix \(K\) as a basis for the cokernel of \(M\), so that any vector v is in \(\mathcal {G}\) if and only if \(Kv = 0\). Then \(V\) must satisfy \(\text {rank}\,(V) = 5\) and \(KV= 0\).

Any \(K\) and \(V\) satisfying these constraints will suffice, and we will use the following:

Note that the columns of \(V\) corresponding to the gate ciphertexts (the 3 rightmost columns) are the same as the columns in \(M\) corresponding to hash outputs \(H(A_1), H(B_1), H(A_0 \oplus B_1)\), so they are clearly in the column space of \(M\).
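The matrices chosen by the authors are not reproduced in this text. As a hedged sketch, the block below builds one candidate \(V\) from the description above (two columns for \(C_L, C_R\), then the \(M\)-columns for \(H(A_1), H(B_1), H(A_0 \oplus B_1)\)) and checks that it has rank 5 and the same column space as \(M\). The entries of \(M\) follow the query pattern left = \(H(A_i) \oplus H(A_i\oplus B_j)\), right = \(H(B_j) \oplus H(A_i \oplus B_j)\), which is an assumption.

```python
def gf2_rank(rows):
    """Rank over GF(2) of a matrix given as a list of 0/1 rows."""
    vecs = [int("".join(map(str, r)), 2) for r in rows]
    rank = 0
    while vecs:
        pivot = vecs.pop()
        if pivot == 0:
            continue
        rank += 1
        low = pivot & -pivot  # lowest set bit of the pivot row
        vecs = [v ^ pivot if v & low else v for v in vecs]
    return rank


# Columns: H(A0), H(A1), H(B0), H(B1), H(A0^B0), H(A0^B1).
# Rows: (0,0)L, (0,0)R, (0,1)L, (0,1)R, (1,0)L, (1,0)R, (1,1)L, (1,1)R.
M = [
    [1, 0, 0, 0, 1, 0],
    [0, 0, 1, 0, 1, 0],
    [1, 0, 0, 0, 0, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 1, 0, 0, 0, 1],
    [0, 0, 1, 0, 0, 1],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 1, 1, 0],
]

# Candidate V: C_L appears in every left row, C_R in every right row,
# followed by the M-columns of H(A1), H(B1), H(A0^B1) for the 3 ciphertexts.
V = [[1 - r % 2, r % 2, M[r][1], M[r][3], M[r][5]] for r in range(8)]

joined = [m + v for m, v in zip(M, V)]
print(gf2_rank(M), gf2_rank(V), gf2_rank(joined))
```

All three ranks being equal to 5 means that the columns of this candidate \(V\) form a basis of the gate space \(\mathcal {G}\).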

Constraints on Choosing \(R\). It remains to see how we choose the control matrix \(R\). Using our new notation, \(\text {colspace}\,\bigl (R\oplus [0 \cdots 0 |t]\bigr ) \subseteq \mathcal {G}\) is equivalent to \(KR= K[0 \cdots 0 |t] \), so we must choose \(R\) to match \(Kt\). Because t is composed of \(2 \times 2\) zero or identity blocks, we can deduce:

$$\begin{aligned} KR= K[0 \cdots 0 |t] = \left[ \begin{array}{c c | c c | c c} 0 &{} 0 &{} 0 &{} 0 &{} p &{} 0 \\ 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} p \\ 0 &{} 0 &{} 0 &{} 0 &{} a &{} b \end{array} \right] \end{aligned}$$
(5)

for some \(a, b \in \{0,1\}\), where p is the parity of the truth table. In our main construction, \(p=1\) since it only considers garbling AND gates. However, the bits \(a, b\) reveal more than the parity of the gate: they leak the position of the “1” in the truth table. Since \(R\) must depend on these bits, we randomize the control matrix \(R\) to hide \(a\) and \(b\).

We also need the control matrix to reflect linear combinations that the evaluator can actually do with the available wire labels. The linear constraints are expressed in terms of \(A_0, B_0\), and \(\varDelta \), but when the evaluator has wire label, say, \(A_1\), he can either include it in the linear combination (adding both \(A_0\) and \(\varDelta \)) or not (adding neither \(A_0\) nor \(\varDelta \))—he cannot include only one of \(A_0, \varDelta \) in the linear combination. This means that \(R\) must decompose into \(2 \times 2\) matrices in the following way:

$$\begin{aligned} R= \begin{bmatrix} R_{0 0 A} &{} R_{0 0 B} &{} 0 \\ R_{0 1 A} &{} R_{0 1 B} &{} R_{0 1 B} \\ R_{1 0 A} &{} R_{1 0 B} &{} R_{1 0 A} \\ R_{1 1 A} &{} R_{1 1 B} &{} R_{1 1 A} \oplus R_{1 1 B} \\ \end{bmatrix} \end{aligned}$$
(6)

When the evaluator holds input labels \(A_i, B_j\), the submatrix \(R_{i j} = \begin{bmatrix} R_{i j A}&R_{i j B} \end{bmatrix}\) is enough to completely determine which linear combination should be applied. We call \(R_{i j}\) the marginal view for that input combination. We will randomize the choice of \(R\), subject to the constraints listed above, so that any single marginal view leaks nothing about t. That is, we want to find a distribution \(\mathcal {R}(t)\) such that when \(R\leftarrow \mathcal {R}(t)\), \(KR= K[0 \cdots 0|t]\) with probability 1, yet for every \(i, j \in \{0,1\}\), if \(t \leftarrow T\) and \(R\leftarrow \mathcal {R}(t)\) then t and \(R_{i j}\) are independently distributed.
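The block structure of Eq. 6 can be checked numerically. The following is a minimal sketch, assuming 64-bit label slices and random \(2 \times 2\) blocks in place of a properly sampled control matrix; all function names are hypothetical. It confirms that the row pair for case \(ij\) of \(R[A_0^{\top } \; B_0^{\top } \; \varDelta ^{\top }]^{\top }\) equals the marginal view applied to the labels \(A_i, B_j\) that the evaluator actually holds.

```python
import secrets

HALF_BITS = 64  # size of one label slice, i.e. kappa/2 for kappa = 128

def xor_apply(matrix, slices):
    """Multiply a 0/1 matrix by a column of slices; addition is XOR."""
    out = []
    for row in matrix:
        acc = 0
        for coeff, value in zip(row, slices):
            if coeff:
                acc ^= value
        out.append(acc)
    return out

def random_block():
    """Uniform 2x2 matrix over GF(2)."""
    return [[secrets.randbits(1) for _ in range(2)] for _ in range(2)]

def build_R(views):
    """Assemble the 8x6 matrix of Eq. 6 from views[(i, j)] = (R_ijA, R_ijB)."""
    R = []
    for i in (0, 1):
        for j in (0, 1):
            RA, RB = views[(i, j)]
            for r in range(2):
                # Third block is i*R_ijA xor j*R_ijB, matching the last column of Eq. 6.
                delta_part = [(i & RA[r][c]) ^ (j & RB[r][c]) for c in range(2)]
                R.append(RA[r] + RB[r] + delta_part)
    return R

def marginal_views_suffice():
    views = {(i, j): (random_block(), random_block()) for i in (0, 1) for j in (0, 1)}
    R = build_R(views)
    A0, B0, delta = ([secrets.randbits(HALF_BITS) for _ in range(2)] for _ in range(3))
    full = xor_apply(R, A0 + B0 + delta)
    for i in (0, 1):
        for j in (0, 1):
            A = [a ^ (d * i) for a, d in zip(A0, delta)]  # A_i = A_0 xor i*Delta
            B = [b ^ (d * j) for b, d in zip(B0, delta)]  # B_j = B_0 xor j*Delta
            RA, RB = views[(i, j)]
            local = xor_apply([RA[r] + RB[r] for r in range(2)], A + B)
            start = 2 * (2 * i + j)
            if full[start:start + 2] != local:
                return False
    return True
```

The check passes for every choice of blocks, which is why the evaluator never needs to handle \(A_0\) and \(\varDelta \) separately.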

Basic Approach to the Distribution \(\mathcal {R}(t)\): We must choose \(R\) to match the \(p, a, b\) bits defined above (which depend on the truth table t). Suppose we have a distribution \(\mathcal {R}_0\) with the following properties:

  • If \(R_\$ \leftarrow \mathcal {R}_0\) then \(KR_\$ = 0\)

  • For all \(i,j \in \{0,1\}\), if \(R_\$ \leftarrow \mathcal {R}_0\) then \((R_\$)_{i j}\) (the marginal view) is uniform

and we also have fixed matrices \(R_p\), \(R_a\), \(R_b\) such that:

$$\begin{aligned} KR_p = \left[ \begin{array}{c c | c c | c c} 0 &{} 0 &{} 0 &{} 0 &{} 1 &{} 0 \\ 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 1 \\ 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 \end{array} \right] ~~ KR_a = \left[ \begin{array}{c c | c c | c c} 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 \\ 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 \\ 0 &{} 0 &{} 0 &{} 0 &{} 1 &{} 0 \end{array} \right] ~~ KR_b = \left[ \begin{array}{c c | c c | c c} 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 \\ 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 \\ 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 1 \end{array} \right] {,} \end{aligned}$$
(7)

Define \(\mathcal {R}(t)\) to first sample \(R_\$ \leftarrow \mathcal {R}_0\) and output \(R= p R_p \oplus a R_a \oplus b R_b \oplus R_\$\). The result \(R\) will always satisfy the condition of Eq. 5. The randomness in \(R_\$\) also causes the marginal views \(R_{i j}\) to be uniform and therefore to hide \(p, a, b\). Concrete values for \(R_p, R_a, R_b\) are given in Figs. 3 and 4, as part of a different construction.
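That this \(R\) satisfies Eq. 5 follows from linearity of \(K\) together with Eq. 7 and \(KR_\$ = 0\). Spelled out:

$$\begin{aligned} KR= p\, KR_p \oplus a\, KR_a \oplus b\, KR_b \oplus KR_\$ = \left[ \begin{array}{c c | c c | c c} 0 & 0 & 0 & 0 & p & 0 \\ 0 & 0 & 0 & 0 & 0 & p \\ 0 & 0 & 0 & 0 & a & b \end{array} \right] . \end{aligned}$$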

If \(\mathcal {R}_0\) is the uniform distribution over all matrices satisfying \(KR=0\), then the garbler must encrypt the full marginal views \(R_{i j}\) at 8 bits per view. A more thoughtful choice of distribution will allow the garbler to convey \(R_{i j}\) marginal views with fewer bits.

Fig. 3.

Control matrices for even-parity gates. The top row contains the two basis matrices for \(S\). The bottom row shows the full control matrices (\(R_p\) is not needed for even-parity gates). The middle row shows the “compressed” representation of the control matrices, in terms of the basis \(\{S_1, S_2\}\) (i.e., each row expresses which linear combination of \(S_1, S_2\) appears in the corresponding blocks of the control matrix). The reader can verify that (1) each row in \(\bar{R}_\$\) is individually uniform; (2) \(KR_\$ = 0\); and (3) Eq. 7 holds.

Fig. 4.

Control matrices for gate-hiding garbling. The top row contains the basis matrices for \(S\). The basis of Fig. 3 is a subset of this basis, so we can use the same \(R_a\) and \(R_b\) as Fig. 3. The distributions on \(\bar{R}_\$\) and \(R_\$\) also include the matrices from Fig. 3 (omitted with “\(\ldots \)” here). The middle row gives the control matrices in terms of the new basis, while the bottom row shows them directly. The reader may verify that (1) each row of \(\bar{R}_\$\) is individually uniform; (2) \(KR_\$ = 0\); and (3) Eq. 7 holds.

Compressing the Marginal Views: Each marginal view \(R_{i j}\) is a \(2 \times 4\) matrix. We can “compress” these if we manage to restrict all \(R_{ij}\) to some linear subspace \(S= \text {span}\{S_1, S_2, \ldots , S_{d}\}\) of \(2 \times 4\) matrices (presumably with dimension \(d< 8\)), while still maintaining the other properties needed.

Let \(\bar{R}_{i j}\) denote the representation of \(R_{i j}\) with respect to the basis \(S\), i.e., a vector of length \(d\). Then the garbler can encrypt only the \(\bar{R}_{i j}\)’s to convey the marginal views of \(R\). The choice of the subspace \(S\) depends on the class of truth tables that need to be hidden.
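Expanding a compressed view back into a \(2 \times 4\) matrix is a matter of XORing the selected basis matrices. The sketch below assumes a made-up two-element basis (not the \(S_1, S_2\) of Fig. 3), and `expand_view` is a hypothetical helper name.

```python
def expand_view(rbar, basis):
    """Return the XOR of the basis matrices selected by the bits of rbar.

    rbar: list of d bits; basis: list of d matrices, each 2x4 with 0/1 entries.
    """
    out = [[0] * 4 for _ in range(2)]
    for bit, S in zip(rbar, basis):
        if bit:
            out = [[x ^ y for x, y in zip(row_out, row_s)]
                   for row_out, row_s in zip(out, S)]
    return out

# Placeholder basis for illustration only.
S1 = [[1, 0, 0, 0], [0, 1, 0, 0]]
S2 = [[0, 0, 1, 0], [0, 0, 0, 1]]

view = expand_view([1, 1], [S1, S2])
```

With \(d = 2\) the garbler conveys 2 bits per view instead of the 8 bits of the full matrix.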

Parity-Leaking Gates: We performed an exhaustive computer search of low dimensional subspaces to determine how to pick the basis \(S\) for different types of gates. For even-parity gates (e.g. XOR or constant gates) we found a 2-dimensional subspace that works. Details of the \(\mathcal {R}(t)\) distribution are given in Fig. 3. For odd-parity gates (like AND, OR) we simply use the even-parity distribution and add a public constant \(R_p\) (from Fig. 4) to the result. This approach works when the parity of the gate is public, since the evaluator must know to add \(R_p\) when decoding the description of their marginal view \(R_{i j}\).

The construction for odd-parity gates is our primary construction, which would be used in most applications of garbling (in combination with free XOR gates).

Parity-Hiding Gates: To make the garbling scheme gate-hiding, we also need to hide the parity of the truth table. In other words, the distribution on \(R_\$\) must be random enough to mask the presence (or absence) of a matrix \(R_p\) as in Eq. 7. The \(R_p\) in Fig. 4 is not in the subspace \(S\) of control matrices in Fig. 3. Hence, to support parity-hiding we have had to extend that subspace with two additional basis elements (the basis matrices \(S_1,S_2\) are as in the parity-leaking case). Our parity-hiding gates require 4 (compressed) control bits per gate-input combination, corresponding to the 4-dimensional basis \(S\). See Fig. 4 for details.

5.2 Garbling the Control Bits

So far we have glossed over the details of how the control bits actually get encrypted and sent to the evaluator. We know that there will be some \(4 \times d\) matrix \(\bar{R}\) (\(d=2\) for parity-leaking gates and \(d=4\) for parity-hiding gates), and that the evaluator should only get to see a single row \(\bar{R}_{i j}\) of \(\bar{R}\) telling them what linear combination of \(S_1, \ldots , S_d\) to use as control bits. The garbler can easily encrypt these values so that on input \(A_i, B_j\) the evaluator can decrypt only \(\bar{R}_{i j}\).

To reuse the calls to H that the evaluator already makes, we can use our new garbling construction to garble the control bits as well. At first glance this seems to lead to infinite recursion: if we used something like Eq. 4 to garble the control bits, then that garbling would need its own control bits, which would need to be garbled, and so on. In fact, the compressed control bits have a structure that allows us to garble them without recursive control bits.

Conceptually, we can treat the bits of \(\bar{R}\) as wire labels and slice them as we do regular wire labels. Collect the bits from odd and even-indexed positions of \(\bar{R}_{i j}\) into numbers \(\overline{r}_{i j L}\) and \(\overline{r}_{i j R} \in GF(2^{d/2})\), respectively. Define the vector

$$ \vec {r}= \bigl [ \overline{r}_{0 0 L} ~ \overline{r}_{0 0 R} ~ \overline{r}_{0 1 L} ~ \overline{r}_{0 1 R} ~ \overline{r}_{1 0 L} ~ \overline{r}_{1 0 R} ~ \overline{r}_{1 1 L} ~ \overline{r}_{1 1 R} \bigr ]^{\top } $$

We observed that for both our parity-leaking and parity-hiding constructions, this vector is always in the gate subspace \(\mathcal {G}\), i.e., that \(K\vec {r}=0\). Looking at Fig. 3, the reader can check that this holds for any possible \(\vec {r}\) (which in this case is the same as \(\bar{R}\) read in row-major order). The same holds for Fig. 4; this time the test for \(\bar{R}\) is equivalent to checking its two \(4 \times 2\) blocks individually.
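The slicing of \(\bar{R}\) into \(\vec {r}\) can be sketched as follows. The bit order inside each \(\overline{r}_{i j L}, \overline{r}_{i j R}\) is an assumption (first collected bit is least significant), positions are counted from 1 as in the text, and the function names are hypothetical.

```python
def pack_bits(bits):
    """Pack bits into an integer, first bit least significant (assumed order)."""
    return sum(b << k for k, b in enumerate(bits))

def slice_control_bits(Rbar):
    """Rbar: 4 rows (cases 00, 01, 10, 11), each a list of d bits.

    Returns the 8 entries of r: for each row, the value built from the
    odd-indexed positions (1st, 3rd, ...) followed by the value built from
    the even-indexed positions (2nd, 4th, ...).
    """
    r = []
    for row in Rbar:
        r.append(pack_bits(row[0::2]))  # positions 1, 3, ... -> "L" slice
        r.append(pack_bits(row[1::2]))  # positions 2, 4, ... -> "R" slice
    return r

r_small = slice_control_bits([[1, 0], [0, 1], [1, 1], [0, 0]])
```

For \(d = 2\) the result is simply \(\bar{R}\) read in row-major order, as noted above.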

Since the control bits, when expressed as \(\vec {r}\), are always in the gate subspace \(\mathcal {G}\), they can be garbled without needing their own control bits. The garbler can compute a constant-size ciphertext \(\vec {z}\) such that:

$$\begin{aligned} V\vec {z}\oplus M\ \text {lsb}_{\frac{d}{2}}(\vec {H}) = \vec {r}, \end{aligned}$$
(8)

where \(V, M, \vec {H}\) are as in Eq. 4. Here we assume that every hash has been extended by an extra \({d}/ 2\) bits (or, more realistically given that block ciphers have a fixed size, each wire label slice has been shrunk by \(d/2\) bits to make room), and that these extra bits can be extracted with \(\text {lsb}_{\frac{d}{2}}\). The remainder of the hash vector, \(\text {msb}_{\frac{\kappa }{2}}(\vec {H})\), is used for garbling the wire labels themselves. By the same reasoning as for usual garbling, when the evaluator has input labels \(A_i, B_j\), he can learn only the \(\vec {r}_{i j}\) portions of \(\vec {r}\).
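The split of each hash output into a control-bit mask and a label mask can be sketched as below. SHA-256 is only a stand-in here (it is not claimed to be an RTCCR function), and the parameter values, byte encoding, and function names are assumptions for illustration.

```python
import hashlib

KAPPA, D = 128, 2
OUT_BITS = (KAPPA + D) // 2  # output length of H

def H(label: int, tweak: int) -> int:
    """Placeholder hash truncated to (KAPPA + D)/2 bits."""
    data = label.to_bytes(16, "little") + tweak.to_bytes(8, "little")
    digest = int.from_bytes(hashlib.sha256(data).digest(), "little")
    return digest & ((1 << OUT_BITS) - 1)

def lsb(value: int, n: int) -> int:
    """The n least significant bits: masks the control bits when n = D/2."""
    return value & ((1 << n) - 1)

def msb(value: int, n: int, total: int = OUT_BITS) -> int:
    """The n most significant bits of a total-bit value: masks a label slice."""
    return value >> (total - n)

h = H(5, 7)
control_mask = lsb(h, D // 2)
label_mask = msb(h, KAPPA // 2)
```

The two pieces are disjoint and together cover the whole output, so one call to H serves both purposes.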

We can combine Eqs. 4 and 8 into a single system, allowing the whole gate to be garbled at once.

$$\begin{aligned} V\left( \vec {z}\mathbin {\Big \Vert }\begin{bmatrix} C \\ \vec {G}\end{bmatrix}\right) \oplus M\vec {H}= \vec {r}\mathbin {\Big \Vert }\left( \left( R\oplus [0 \cdots 0|t]\right) \ \begin{bmatrix} A_0 \\ B_0 \\ \varDelta \end{bmatrix} \right) , \end{aligned}$$
(9)

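where \(\Vert \) denotes element-wise concatenation, so e.g. the bits of \(\overline{r}_{0 0 L} \in GF({2^{{d} / 2}})\) get concatenated with some \(x \in GF({2^{\kappa / 2}})\) to get a value in \(GF({2^{(\kappa + d) / 2}})\). We write the bits in little-endian order, so the left operand of \(\Vert \) occupies the least significant bits, consistent with the use of \(\text {lsb}_{\frac{d}{2}}\) in Eq. 8.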

5.3 The Construction

We can now describe our garbling scheme formally. All of our different types of gates are compatible, so we describe a single unified scheme. The circuit has a \(\mathsf {leak}\) function that indicates what information about each gate is public (which affects the cost of garbling each gate):

figure i

Because we need different control matrices depending on what kind of gate is being garbled, we use the notation \(\mathcal {R}(L, t)\), for \(L \in \{\mathsf {EVEN},\mathsf {ODD},\mathsf {NONE} \}\) to denote the appropriate distribution over control matrices. For \(\mathsf {EVEN}\)/\(\mathsf {ODD}\) gates, the distribution is as in Fig. 3 (with \(R_p\) added in the case of \(\mathsf {ODD}\)), and for \(\mathsf {NONE} \) the distribution is as in Fig. 4.

Fig. 5.

Our garbling scheme (continued in Fig. 6).

Fig. 6.

Our garbling scheme (continued from Fig. 5). \(V^{-1}\) is a left inverse of \(V\).

Our garbling scheme is shown in Figs. 5 and 6. The garbler associates the kth wire in the circuit with a wire label \(W_k\) (and its opposite label \(W_k \oplus \varDelta )\) and a point-and-permute bit \(\pi _k\). \(W_k\) is the label with color bit \(\text {lsb}(W_k) = 0\) (visible to the evaluator). The label \(W_k \oplus \pi _k \varDelta \) is the wire label representing false on that wire. Equivalently, \(W_k\) is the wire label representing logical value \(\pi _k\).
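The label conventions of this paragraph can be sketched as follows. The sketch assumes \(\kappa = 128\) and \(\text {lsb}(\varDelta ) = 1\), so that the two labels on a wire carry opposite color bits; the function names are hypothetical and the full sampling procedure is the one in Fig. 5.

```python
import secrets

KAPPA = 128

def sample_delta() -> int:
    """Global offset; lsb forced to 1 so W and W xor Delta differ in color (assumed)."""
    return secrets.randbits(KAPPA) | 1

def sample_wire():
    """Return (W, pi): the color-0 label and the point-and-permute bit."""
    W = secrets.randbits(KAPPA) & ~1  # clear the lsb so that lsb(W) = 0
    pi = secrets.randbits(1)
    return W, pi

def label_for(W: int, pi: int, delta: int, x: int) -> int:
    """Label encoding logical value x: W xor (x xor pi) * Delta."""
    return W ^ ((x ^ pi) * delta)

delta = sample_delta()
W, pi = sample_wire()
false_label = label_for(W, pi, delta, 0)  # equals W xor pi * Delta
true_label = label_for(W, pi, delta, 1)
```

The color bit of the label for value \(x\) is \(x \oplus \pi _k\), which reveals nothing about \(x\) as long as \(\pi _k\) stays hidden from the evaluator.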

For each non-free gate, the garbler first samples a control matrix \(R\) and encodes its marginal views (i.e., expresses each view in terms of the basis \(\{S_j\}_j\)). We have factored out this sampling procedure into a helper function \(\mathsf {SampleR}\), along with a corresponding decoding function \(\mathsf {DecodeR}\) used by the evaluator to reconstruct its marginal view of the control matrix. One thing to note about \(\mathsf {SampleR} \) is that in the case of an \(\mathsf {ODD} \) gate, the control matrices include the term \(R_p\), but \(R_p\) is not in the subspace spanned by the basis \(\{S_j\}_j\). The compressed representation of each marginal view excludes the contribution of \(R_p\), but in these cases it is publicly known that the evaluator should compensate by manually adding \(R_p\).

For each gate k, we have a master evaluation equation in the style of Eq. 9. This equation expresses constraints that must hold for that gate; the garbler's task is to compute garbled gate ciphertexts \(\vec {G}_k\), control bit ciphertexts \(\vec {z}_k\), and an output wire label that satisfy these constraints. As previously discussed, we can solve for these values by multiplying both sides by \(V^{-1}\), a left inverse of \(V\). One possible choice of \(V^{-1}\) is given below:

$$\begin{aligned} V^{-1} = \left[ \begin{array}{cc|cc|cc|cc} 1 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 \\ 0 &{} 1 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 \\ 1 &{} 1 &{} 0 &{} 0 &{} 1 &{} 1 &{} 0 &{} 0 \\ 1 &{} 1 &{} 1 &{} 1 &{} 0 &{} 0 &{} 0 &{} 0 \\ 0 &{} 0 &{} 0 &{} 0 &{} 1 &{} 0 &{} 1 &{} 0 \end{array} \right] \end{aligned}$$
(10)

The queries to hash function H include tweaks based on the gate ID, for domain separation. Finally, for each output wire, the garbler computes hashes of the wire labels, which will be used in \(\mathsf {Decode} \) to authenticate labels and determine their logical value (true or false). These hashes need \(\kappa \) bits for authenticity, so they are computed using another hash function \(H'(E, k)\) with output length \(\kappa \) instead of \(\frac{\kappa + d}{2}\). It is simplest to define \(H'\) by putting together \(\kappa \) bits from two evaluations of H, while avoiding any overlaps in tweaks.

The evaluator follows a similar process. Starting with the input wire labels E, it evaluates the garbled circuit one gate at a time. The invariant is that on wire k, the evaluator will hold the “active” wire label \(E_k = W_k \oplus (x_k \oplus \pi _k) \varDelta \), where \(x_k\) is the logical value on that wire, for the given circuit input. If \(A, B\) are the active wire labels on the input wires of a gate, then the evaluator computes terms of the form \(H(A), H(B), H(A\oplus B)\) and evaluates the gate according to Eq. 9. The evaluator only knows enough for two rows of Eq. 9, depending on the color bits \(i=\text {lsb}(A)\), \(j=\text {lsb}(B)\), so we let \(V_{i j}\) be the corresponding pair of rows from \(V\). It only evaluates the gate partially at first, in order to find the encoded control bits so that it can decode them with \(\mathsf {DecodeR}\) and use them to finally compute the output wire label.

5.4 Security Proof

Theorem 3

Let \(\mathcal {H}\) be a family of hash functions, with output length \((\kappa + d)/2\) bits, that is RTCCR for \(\mathcal {L} = \{ L_{ab}(\varDelta _L\Vert \varDelta _R) = 0^{d/2} \Vert a \varDelta _L \oplus b \varDelta _R \mid a,b \in \{0,1\}\}\). Then our construction (Figs. 5 and 6) is a secure garbling scheme.

Proof

We need to prove four properties of the construction.

Correctness: We need to prove an invariant: \(E_k = W_k \oplus (x_k \oplus \pi _k) \varDelta \) for all k, if \(x_k\) is the plaintext value on that wire. Encode chooses the inputs in this way, so the invariant holds for \(k \le \mathsf {inputs} \), and it is trivially maintained for free-XOR gates. For any \(v \in \text {colspace}\,(V) = \mathcal {G}\), we have \(VV^{-1} v = v\), as there exists some u such that \(v = Vu\) and \(VV^{-1} Vu = Vu = v\) because \(V^{-1}\) is a left inverse of \(V\). In Sect. 5.1 we showed that \(\text {colspace}\,(M) = \mathcal {G}\), \(\text {colspace}\,(R\oplus [0 \cdots 0|t]) \subseteq \mathcal {G}\), and \(\vec {r}\in \mathcal {G}\), so after multiplying both sides of the garbler’s equation by \(V\) on the left, the \(VV^{-1}\)s will cancel, and taking a two-row piece of this equation gives the evaluator’s equation. In this equation, \(X_{i j}\) is the two rows of

$$\begin{aligned} \vec {X} = C \oplus \left( R\oplus [0 \cdots 0|t]\right) \begin{bmatrix} A_0 \\ B_0 \\ \varDelta \end{bmatrix} , \end{aligned}$$
(11)

corresponding to the evaluation case ij. The structure of \(R\) (see Eq. 6) implies that the evaluator’s row pair of \(R[A_0^{\top } \; B_0^{\top } \; \varDelta ^{\top }]^{\top }\) will be \(R_{i j} [A^{\top } \; B^{\top }]^{\top }\). Therefore

$$ E_k = X_{i j} \oplus R_{i j}\begin{bmatrix} A \\ B \end{bmatrix} = C \oplus t_{i j} \varDelta = W_k \oplus (\mathsf {eval} (k)(\pi _A \oplus i, \pi _B \oplus j) \oplus \pi _k) \varDelta , $$

which maintains this invariant because

$$ i = \text {lsb}(E_{\mathsf {in} _1(k)}) = \text {lsb}\bigl (W_{\mathsf {in} _1(k)} \oplus (x_{\mathsf {in} _1(k)} \oplus \pi _{\mathsf {in} _1(k)}) \varDelta \bigr ) = x_{\mathsf {in} _1(k)} \oplus \pi _{\mathsf {in} _1(k)}, $$

and similarly for j. Finally, \(\mathsf {Decode} \) will correctly find that \(D_k^{x_k} = H'\bigl (W_k \oplus (x_k \oplus \pi _k) \varDelta , k\bigr ) = H'(E_k, k)\), assuming that \(D_k^{x_k} \ne D_k^{1-x_k}\), which has only negligible probability of failing. Therefore it gives the correct result.
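The decoding check for a single output wire can be sketched as follows. This is an illustration only: the actual \(\mathsf {Decode} \) is the one in Fig. 6, SHA-256 truncated to \(\kappa \) bits is a stand-in for \(H'\), and the function names and byte encoding are assumptions.

```python
import hashlib

KAPPA = 128

def H_prime(label: int, k: int) -> bytes:
    """Placeholder for the kappa-bit hash H'(E, k)."""
    data = label.to_bytes(KAPPA // 8, "little") + k.to_bytes(8, "little")
    return hashlib.sha256(data).digest()[: KAPPA // 8]

def decode_wire(E: int, k: int, D0: bytes, D1: bytes):
    """Return the logical value whose tag matches H'(E, k), or None to abort."""
    tag = H_prime(E, k)
    if tag == D0:
        return 0
    if tag == D1:
        return 1
    return None  # label is not authentic

# Toy labels for output wire k = 3 (arbitrary values for illustration).
label_false, label_true = 123, 456
D0, D1 = H_prime(label_false, 3), H_prime(label_true, 3)
```

A label that matches neither tag causes an abort, which is the behaviour the authenticity argument relies on.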

Fig. 7.

Left: simulators for privacy and obliviousness. Right: a hybrid for privacy.

Privacy: We need to prove that generating \((\varPhi , \vec {G}, \vec {z}), E, (\varPhi , D)\) with \(\mathsf {Garble} \) and \(\mathsf {Encode} \) is indistinguishable from the output of \(\mathcal {S}_\text {priv}\). We give a sequence of intermediate hybrids, going from the real garbler to the simulator.

Hybrid 1: This hybrid switches from the garbler’s perspective to the evaluator’s perspective when garbling the circuit. Instead of keeping track of the “zero” wire label \(W_k\) for every gate, we keep track of the “active” wire label \(E_k\), and rewrite the garbling procedure in terms of the “active” labels. This basically involves a change of variable names throughout the garbling algorithm. The changes are extensive, and given in detail in Fig. 7:

  • Replace point-and-permute bits \(\pi _k\) with the equivalent expression \(x_k \oplus \text {lsb}(E_k)\).

  • Write the control matrix part of the garbling equation in terms of active wire labels \(A = E_{\mathsf {in} _1(k)}\) and \(B = E_{\mathsf {in} _2(k)}\) instead of \(A_0\) and \(B_0\).

    $$ \text{ replace } R\times \begin{bmatrix} A_0 \\ B_0 \\ \varDelta \end{bmatrix} \text{ with } \text{ equivalent } R' \times \begin{bmatrix} A \\ B \\ \varDelta \end{bmatrix} . $$

    where a change of basis has been applied to R, that expresses \(A_0\) as the appropriate linear combination of A and \(\varDelta \), and expresses \(B_0\) in terms of B and \(\varDelta \).

  • Partition \(\vec {H}\) into two pieces:

    $$\begin{aligned} \vec {H}_0&= [H(A)~H(B)~H(A\oplus B)]^\top \\ \vec {H}_\varDelta&= [H(A \oplus \varDelta )~H(B\oplus \varDelta )~H(A \oplus B\oplus \varDelta )]^\top \end{aligned}$$

    where again A and B are the active wire labels. Similarly partition the matrix \(M\) into \(M_0\) and \(M_\varDelta \), and replace \(M\times \vec {H}\) with \((M_0 \vec {H}_0 \oplus M_\varDelta \vec {H}_\varDelta )\).

  • Note that the matrix \(V^{-1}\) has 5 rows, where the first 2 correspond to slices of the output label and the last 3 correspond to the gate ciphertexts. Denote this division of \(V^{-1}\) by \(V^{-1}_{\textsf {label}}\) and \(V^{-1}_{\textsf {gate}}\). Instead of multiplying on the left by \(V^{-1}\) to solve for the output label and gate ciphertexts, we now multiply on the left by \(V^{-1}_{\textsf {gate}}\) to solve for only the gate ciphertexts. We then evaluate those gate ciphertexts with A and B to learn the (active) output label \(E_k\). This different approach has the same result by the correctness of the scheme. We can similarly partition the control bit ciphertexts \(\vec {z}_k = [ (\vec {z}_k)_{\textsf {top}} ~ (\vec {z}_k)_{\textsf {bot}} ]\), use \(V^{-1}_{\textsf {gate}}\) to compute \((\vec {z}_k)_{\textsf {bot}}\), and then use the evaluator’s computation to solve for \((\vec {z}_k)_{\textsf {top}}\). Solving for \((\vec {z}_k)_{\textsf {top}}\) is simplified by the first two columns of \(V_{i j}\) being the identity matrix. In this case, we solve for the missing positions using knowledge of the compressed control bits \(\overline{r}_{i j}\).

All of the changes are simple variable substitutions or basis changes in the linear algebra, so this hybrid is distributed identically to the real garbling.

Hybrid 2: In this hybrid, we apply the RTCCR property of H to all oracle queries of the form \(H( \cdot \oplus \varDelta )\). We must show that \(\varDelta \) is used in a way that can be achieved by calling the oracle from the RTCCR security game.

We focus on the term

$$ V^{-1}_{\textsf {gate}}M\vec {H}= V^{-1}_{\textsf {gate}}( M_0 \vec {H}_0 \oplus M_\varDelta \vec {H}_\varDelta ) $$

First, consider the expression \(V^{-1}\times M\), and recall that \(M\) is written in terms of the zero-labels \(A_0, B_0\). Using the \(V^{-1}\) given in Eq. 10, we can compute:

$$\begin{aligned} V^{-1} M= \left[ \begin{array}{cc|cc|cc} 1 &{} 0 &{} 0 &{} 0 &{} 1 &{} 0 \\ 0 &{} 0 &{} 1 &{} 0 &{} 1 &{} 0 \\ 1 &{} 1 &{} 0 &{} 0 &{} 0 &{} 0 \\ 0 &{} 0 &{} 1 &{} 1 &{} 0 &{} 0 \\ 0 &{} 0 &{} 0 &{} 0 &{} 1 &{} 1 \end{array} \right] \end{aligned}$$
(12)

Thus \(V^{-1}_{\textsf {gate}}\times M\) will consist of the bottom three rows of Eq. 12.

Recall that the columns of \(M\) correspond to oracle queries \(H(A_0), H(A_0 \oplus \varDelta ), H(B_0), H(B_0\oplus \varDelta ), H(A_0\oplus B_0), H(A_0\oplus B_0 \oplus \varDelta )\), in that order. In the current hybrid \(M\) is partitioned into \(M_0\) (corresponding to H-queries on active labels) and \(M_\varDelta \) (corresponding to the other queries). In other words, \(M_\varDelta \) will consist of exactly one of columns \(\{1,2\}\), exactly one of columns \(\{3,4\}\), and exactly one of columns \(\{5,6\}\) from \(M\). In all cases, the result of \(V^{-1}_{\textsf {gate}}M_\varDelta \) (i.e., the bottom 3 rows of \(V^{-1}M_\varDelta \)) is the \(3 \times 3\) identity matrix.
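The identity-matrix claim can be checked directly from the bottom three rows of Eq. 12 by enumerating all eight ways of picking one column from each pair. The sketch below uses 0-based column indices; the names are hypothetical.

```python
from itertools import product

# Bottom three rows of V^{-1} M as printed in Eq. 12.
V_INV_GATE_M = [
    [1, 1, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 0, 0, 1, 1],
]

IDENTITY_3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

def select_columns(matrix, cols):
    """Keep only the listed columns of a matrix."""
    return [[row[c] for c in cols] for row in matrix]

def every_selection_is_identity() -> bool:
    """Try one column from each of the pairs (0,1), (2,3), (4,5)."""
    for cols in product((0, 1), (2, 3), (4, 5)):
        if select_columns(V_INV_GATE_M, cols) != IDENTITY_3:
            return False
    return True
```

Each pair of columns is identical in these rows, so the choice within a pair never matters.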

This means we can rewrite the hybrid in the following way:

$$\begin{aligned}(\vec {z}_{k})_{\textsf {bot}} \mathbin {\Big \Vert }\vec {G}_k&:= V^{-1}_{\textsf {gate}}\left( \vec {r}\mathbin {\Big \Vert }\left( R' \oplus [0 \cdots 0|\,t]\right) \ \begin{bmatrix} A \\ B \\ \varDelta \end{bmatrix} \right) \oplus V^{-1}_{\textsf {gate}}(M_0 \vec {H}_0 \oplus M_\varDelta \vec {H}_\varDelta ) \\&= \vec {H}_\varDelta \oplus \text{[linear } \text{ combinations } \text{ of } \varDelta \text{] } \oplus \cdots \end{aligned}$$

Since all the H-queries in \(\vec {H}_\varDelta \) include a \(\varDelta \) term, we can compute this expression with 3 suitable calls to the RTCCR oracle.Footnote 8 Finally, \(D_k^{1 - x_k} = H'(E_k \oplus \varDelta , k)\) also uses \(\varDelta \), and will become two calls to the RTCCR oracle. These transformations successfully move all references to \(\varDelta \) into the RTCCR oracle.

Applying RTCCR security, it has negligible effect to replace the results of these H-queries with uniformly random values. This has the effect of making the entire expression uniform, i.e.:

Also, \(D_k^{1 - x_k}\) is now sampled uniformly at random in \(GF({2^\kappa })\).

Hybrid 3: After making the previous change, the only place that \(R\) is used is when we use the marginal views \(R_{i j}\) and \(\vec {r}_{i j}\) to solve for the output label and for the missing pieces of the control bit ciphertexts. In Sect. 5.1 we specifically chose \(\mathcal {R}\) so that these marginal views are uniform for all t and all \(i, j\). Therefore instead of doing \(R, \vec {r}\leftarrow \mathsf {SampleR} (t,\mathsf {leak} (k))\), we can simply choose uniform \(\vec {r}_{i j}\) and use \(\mathsf {DecodeR} \) to reconstruct \(R_{i j}\). The change has no effect on the overall view of the adversary.

Note that after making this change, the control-bit ciphertexts \((\vec {z}_k)_{\textsf {top}}\) become uniform since \(\vec {r}_{i j}\) acts as a one-time pad.

Hybrid 4: As a result of the previous change, the hybrid no longer uses t. Additionally, t was the only place where the plaintext values \(x_k\) were used, other than in the computation of D. But D only uses plaintext values for the circuit’s output wires. In other words, the entire hybrid can be computed knowing only the circuit output f(x). Additionally, all garbled gate ciphertexts and control bit ciphertexts are chosen uniformly, and the active wire labels on output wires are determined by the scheme’s evaluation procedure. Hence, the hybrid exactly matches what happens in \(\mathcal {S}_{\text {priv}}\).

Obliviousness: Notice that \(\mathcal {S}_\text {priv}\) calls \(\mathcal {S}_\text {obliv}\) to generate \((F, E)\), then samples some more random bits for decoding and returns it all. Therefore, any adversary for obliviousness could be turned into one for privacy by only looking at \((F, E)\) and ignoring the rest.

Authenticity: The first two steps of the authenticity distribution are exactly the same as the real privacy distribution, so we can swap them for the simulated distribution \(\mathcal {S}_\text {priv}\) in a hybrid. Then to break authenticity the adversary must cause \(\mathsf {Decode}\) to choose \(j = 1 - x_k\) for at least one output k, as otherwise it will either produce the correct answer or abort. But \(D_k^{1-x_k}\) is fresh uniform randomness, so the probability that \(D_k^{1 - x_k} = H'(E_k, k)\) is \(2^{-\kappa }\).

5.5 Discussion

Concrete Costs. The garbler makes 6 calls to H per non-free gate, while the evaluator makes 3 calls to H per non-free gate.

Each non-free garbled gate consists of gate ciphertexts \(\vec {G}\) and encrypted control bits \(\vec {z}\). There are 3 gate ciphertexts, each being \(\kappa /2\) bits long. The encrypted control bits are a vector of length 5, where each component of the vector has length \(d/2\) (where \(d\) is the dimension of the control matrix subspace). For the standard (parity-leaking) instantiation of our scheme, \(d=2\) and we get that the total size of a garbled gate is \(1.5\kappa + 5\) bits. For the gate-hiding instantiation, \(d=4\) and we get a size of \(1.5\kappa +10\) bits.
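The size formula is simple enough to check by hand, but a short sketch makes the arithmetic explicit. The function name is hypothetical, and the half-gates figure of \(2\kappa \) bits is taken from the comparison in the introduction.

```python
def garbled_gate_bits(kappa: int, d: int) -> int:
    """3 gate ciphertexts of kappa/2 bits plus 5 control ciphertexts of d/2 bits."""
    return 3 * (kappa // 2) + 5 * (d // 2)

KAPPA = 128
parity_leaking = garbled_gate_bits(KAPPA, 2)   # 1.5 * kappa + 5
gate_hiding = garbled_gate_bits(KAPPA, 4)      # 1.5 * kappa + 10
half_gates = 2 * KAPPA

# Relative saving of the parity-leaking instantiation over half-gates.
saving = 1 - parity_leaking / half_gates
```

For \(\kappa = 128\) this gives 197 and 202 bits respectively, against 256 bits for half-gates.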

Comparison to Half-Gates. We assume that calls to H are the computational bottleneck in any implementation of either our scheme or half-gates [ZRE15]. The following analysis therefore ignores the cost of xor’ing wire labels and bit-fiddling related to color bits and control bits.

In the time it takes to call H 12 times, half-gates generates 3 gates and sends \(6\kappa \) bits (4 calls to H and \(2\kappa \) bits per gate), while our scheme generates 2 gates and sends \(3\kappa \) bits (6 calls to H and \(1.5\kappa \) bits per gate). Thus, a CPU-bound implementation of our scheme will produce garbled output at half the rate of half-gates. We evaluated the optimized half-gates garbling algorithm from the ABY3 library [MR18], and found it capable of generating garbled output at a rate of \(\sim \)850 Mbyte/s on a single core of an i7-7500U laptop processor running at 3.5 GHz. Thus, we conservatively estimate that a comparable implementation of our scheme could generate garbled output at \(\sim \)400 Mbyte/s = 3.2 Gbit/s. This rate would still leave our scheme network-bound in most situations and applications of garbled circuits. When both half-gates and our scheme are network bound, our scheme is expected to take \(\sim \)25% less time by virtue of reducing communication by 25\(\%\).
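The rate comparison above reduces to output bits per call to H. A minimal sketch of that arithmetic, ignoring the few control-bit ciphertext bits as the text does (function names are hypothetical):

```python
def output_bits_per_hash_call(gate_bits: float, calls_per_gate: int) -> float:
    """Garbled output produced per call to H when hashing is the bottleneck."""
    return gate_bits / calls_per_gate

KAPPA = 128
half_gates_rate = output_bits_per_hash_call(2 * KAPPA, 4)
our_rate = output_bits_per_hash_call(1.5 * KAPPA, 6)
relative_rate = our_rate / half_gates_rate

def mbyte_per_s_to_gbit_per_s(mbyte_per_s: float) -> float:
    """Unit conversion used for the throughput estimate."""
    return mbyte_per_s * 8 / 1000
```

The ratio of one half is what turns the measured half-gates throughput into the conservative estimate for our scheme.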

6 Optimizations

6.1 Optimizing Control Bit Encryptions

In our scheme the control bit encryptions \(\vec {z}\) form a vector of length 5, whose components are each a single bit (in the case of parity-leaking gates) or 2 bits (in the case of parity-hiding gates). These ciphertexts therefore contribute 5 or 10 bits to the size of each garbled gate.

We remark that it is possible to use ideas of garbled row reduction [NPS99, PSSW09] to reduce \(\vec {z}\) to a length-3 vector. This will result in these ciphertexts contributing 3 or 6 bits to the garbled gate. Such an optimization may be convenient in the parity-hiding case, where the change from 10 to 6 bits allows these control bit ciphertexts to fit in a single byte.

Recall that in the security proof, we partition the control bit ciphertexts \(\vec {z}\) into \((\vec {z})_{\textsf {top}}\) (2 components) and \((\vec {z})_{\textsf {bot}}\) (3 components). Our idea to reduce their size is to simply fix \((\vec {z})_{\textsf {top}}\) to zeroes, so that these components do not need to be explicitly included in the garbled gate. The evaluator can act exactly as before, taking the missing values from \(\vec {z}\) to be zeroes. The garbler must sample the control matrix subject to it causing \((\vec {z})_{\textsf {top}}=0\).

A drawback to this optimization is that it significantly complicates the security proof (which is why we only sketch it here). When we apply the security of RTCCR in the security proof, the hybrid acts as follows:

  1. It uses the \(d/2\) least significant bits of the H-outputs to determine how the control bits are going to be “masked”.

  2. Based on these masks, it chooses a consistent control matrix \(R\) that causes the first two components of \(\vec {z}\) to be 0.

  3. The choice of \(R\) determines which linear combinations of wire label slices (including slices of \(\varDelta \)) are applied.

So the reduction to RTCCR security must first read the low bits of several \(H(\cdot \oplus \varDelta )\) queries before it decides which linear combination of \(\varDelta \) should be XOR’ed with the remaining output of H. Of course the RTCCR oracle requires the choice of linear combination to be provided when H is called. It is indeed possible to formally account for this, but only by modeling the two parts of H’s output (for masking wire label slices and for masking control bits) as separate hash functions for the purposes of the security proof.

6.2 Optimizing Computation

Our construction requires a RTCCR function H with output length \((\kappa + d)/2\). We propose an efficient instantiation of H which naturally results in \(\kappa \)-bit output, which is then truncated to \((\kappa + d)/2\). The hash produces nearly twice as many bits as needed, raising the question of whether we are “wasting” these extra bits. In fact, if we reduce the security parameter slightly so that H is derived from a \((\kappa + d)\)-bit primitive, we can use these extra bits to reduce the computation cost.

Suppose \(H'\) is a [RT]CCR with \((\kappa + d)\) bits of output. Then define

$$ H(X, \tau ) = {\left\{ \begin{array}{ll} \text{ first } \text{ half } \text{ of } H'(X,\frac{\tau }{2}) &{} \tau \text{ even } \\ \text{ second } \text{ half } \text{ of } H'(X,\frac{\tau -1}{2}) &{} \tau \text{ odd } \\ \end{array}\right. } $$

Clearly H is also a [RT]CCR with \((\kappa + d)/2\) bits of output. How can we use this H to reduce the total number of calls to the underlying \(H'\)?

When a wire with labels \((A,A \oplus \varDelta )\) is used as input to an AND gate, our scheme makes calls of the form \(H(A,j), H(A\oplus \varDelta ,j)\) where j is the ID of that AND gate. Let us slightly change how the tweaks are used. Suppose this wire with label \((A,A\oplus \varDelta )\) is used as input in n different AND gates. Then those gates should make calls of the form , where i is now the index of the wire whose labels are \((A,A\oplus \varDelta )\). When H is defined as above, these queries can be computed with only \(\lceil n/2 \rceil \) queries to \(H'\).
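The saving can be seen in a small sketch of the definition of H in terms of \(H'\). Several things here are assumptions for illustration: SHA-256 stands in for \(H'\) (it is not claimed to be RTCCR), "first half" is taken to mean the low-order bits, and the \(n\) queries on one label are assumed to use consecutive tweaks \(0, \ldots , n-1\) rather than the paper's exact tweak format.

```python
import hashlib
from functools import lru_cache

KAPPA, D = 128, 2
FULL_BITS = KAPPA + D        # output length of H'
HALF_BITS = FULL_BITS // 2   # output length of H

@lru_cache(maxsize=None)
def H_prime(x: int, tweak: int) -> int:
    """Placeholder for the (kappa + d)-bit function H'; results are cached."""
    data = x.to_bytes(16, "little") + tweak.to_bytes(8, "little")
    digest = int.from_bytes(hashlib.sha256(data).digest(), "little")
    return digest & ((1 << FULL_BITS) - 1)

def H(x: int, tau: int) -> int:
    """Half of H'(x, floor(tau / 2)), chosen by the parity of tau."""
    full = H_prime(x, tau // 2)
    if tau % 2 == 0:
        return full & ((1 << HALF_BITS) - 1)  # "first half" (assumed low bits)
    return full >> HALF_BITS                   # "second half"

def count_underlying_calls(x: int, n: int) -> int:
    """Number of distinct H' evaluations needed for H(x, 0), ..., H(x, n-1)."""
    H_prime.cache_clear()
    for tau in range(n):
        H(x, tau)
    return H_prime.cache_info().misses
```

With caching, five queries on the same label cost three evaluations of \(H'\), matching \(\lceil n/2 \rceil \).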

Note that both the garbler and evaluator can take advantage of this optimization, with the garbler always requiring exactly twice as many calls to \(H'\) (if in some scenario the evaluator needs \(H'(X)\) then the garbler will need \(H'(X)\) and \(H'(X \oplus \varDelta )\)). Our AND gates require calls to H of the form \(H(A), H(B), H(A\oplus B)\), and so far we have discussed optimizing only the H(A) and H(B) queries. Similar logic can be applied to the queries of the form \(H(A \oplus B)\); for example, if a circuit contains gates \(a \wedge b\) and \((a \oplus b) \wedge c\), then both of those AND gates will require \(H(A \oplus B)\) terms that can be optimized in this way.
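To make the counting concrete, the following sketch tallies the garbler's hash calls for a small circuit. It rests on simplifying assumptions that are not part of the construction itself: each AND-gate input is modeled as a `frozenset` of base wire names standing for their XOR (so `a ^ b` models the label \(A \oplus B\)), the two inputs of a gate are distinct, and every repeated query value can be paired as described above. The function name `garbler_hash_calls` is hypothetical.

```python
from collections import Counter
from math import ceil


def garbler_hash_calls(and_gates):
    """Return (baseline calls to H, optimized calls to H') for the garbler.

    and_gates is a list of (left, right) pairs. Each input is a frozenset of
    base wire names representing the XOR of those wires (free-XOR model).
    """
    uses = Counter()
    for left, right in and_gates:
        # Each AND gate hashes its left label, right label, and their XOR.
        for label in (left, right, left ^ right):
            uses[label] += 1
    # Baseline: one call on X and one on X xor Delta per use (6 per AND gate).
    baseline = 2 * sum(uses.values())
    # Optimized: a value used n times costs ceil(n/2) calls, doubled for Delta.
    optimized = 2 * sum(ceil(n / 2) for n in uses.values())
    return baseline, optimized


a, b, c = frozenset("a"), frozenset("b"), frozenset("c")
# The example from the text: a AND b, followed by (a XOR b) AND c.
example = [(a, b), (a ^ b, c)]
print(garbler_hash_calls(example))  # (12, 10)
```

In this example only the \(A \oplus B\) value is used twice, so the cost drops from 6 to 5 calls per AND gate. The evaluator's cost is half of each figure.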

Fig. 8. Number of calls to \(\kappa \)-bit \(H'\) RTCCR function (per AND gate) to garble each circuit, with and without the optimization of Sect. 6.2. Evaluating the garbled circuit costs exactly half this number of calls to \(H'\).

We explored the effect of this optimization for a selection of circuits (see footnote 9). The results are shown in Fig. 8. The improvement ranges from 0% to 33.3%. As a reference, our baseline construction requires 6 calls to H (with \((\kappa + d)/2\)-bit output) to garble an AND gate, while half-gates requires 4 calls (to a \(\kappa \)-bit function). Interestingly, in the Keccak f-function every wire is used as input to an even number of AND gates, so that our optimized scheme has the same computation cost as half-gates (4 calls to \(H'\) per AND gate). In principle, this optimization can result in as few as 3 calls to \(H'\) per AND gate (see footnote 10), but typical circuits do not appear to be nearly so favorable.
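As a quick check on these percentages, write c for the number of calls per AND gate under the optimization (a symbol introduced here only for illustration) and compare call counts against the 6-call baseline, as the text does:

$$ \frac{6 - c}{6}: \qquad c = 6 \Rightarrow 0\%, \qquad c = 4 \Rightarrow \frac{2}{6} \approx 33.3\%, \qquad c = 3 \Rightarrow \frac{3}{6} = 50\%. $$

The first two values match the reported range, and the last corresponds to the best case of 3 calls per AND gate.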

7 The Linear Garbling Lower Bound

In [ZRE15], the authors present a lower bound for garbled AND gates in a model that they call linear garbling. The linear garbling model considers schemes with the following properties:

  • Each wire label has an associated color bit, which must be in \(\{0,1\}\).

  • To evaluate the garbled gate, the evaluator makes a sequence of calls to a random oracle (that depend only on the input wire labels), and then outputs some linear combination of input labels, gate ciphertexts, and random oracle outputs. The linear combination must depend only on the color bits of the input labels.

The bound of [ZRE15] considers only linear combinations over the field \(GF(2^\kappa )\), and it is unclear to what extent the results generalize to other fields.

Several works have bypassed this lower bound, and we summarize them below. All of these works show how to garble an AND gate for \(\kappa +O(1)\) bits, but only a single AND gate in isolation. These constructions all require the input wire labels to satisfy a certain structure, but do not guarantee that the output labels also satisfy that structure.

  • Kempka, Kikuchi, and Suzuki [KKS16] and Wang and Malluhi [WmM17] both use a technique of randomizing the control bits. The evaluator decrypts a constant-size ciphertext to determine which linear combination to apply. This approach is outside of the linear garbling model, which requires that the linear combination depend only on the color bits. These works also add wire labels in \(\mathbb {Z}_{2^\kappa }\) rather than XOR them (as in \(GF(2^\kappa )\)). Apart from these similarities, the two approaches are quite different.

  • Ball, Malkin, and Rosulek [BMR16] deviate from the linear garbling model by letting each wire label have a color “trit” from \(\mathbb {Z}_3\) instead of a color bit from \(\mathbb {Z}_2\). There is no further “indirection” of the evaluator’s linear combination—it depends only on the colors of the input labels. They also perform some linear combinations on wire labels over a field of characteristic 3.

As described earlier, we bypass the lower bound by adopting the control-bit randomization technique of [KKS16] but also introducing the wire-label-slicing technique.

8 Open Problems

We conclude by listing several open problems suggested by our work.

Optimality. Is \(1.5\kappa \) bits optimal for garbled AND gates in a more inclusive model than the one in [ZRE15]? A natural model that excludes “heavy machinery” like fully homomorphic encryption is Minicrypt, in which all parties are computationally unbounded but have bounded access to a random oracle. Conversely, can one do better—say, \(4\kappa /3\) bits per AND gate? Does it help to sacrifice compatibility with free-XOR? In our construction, free-XOR seems crucial.

Computation Cost. In Sect. 6.2 we described how to reduce the number of queries to an underlying \(\kappa \)-bit primitive, with an optimization that depends on the topology of the circuit. Is there a way to reduce the computation cost of our scheme (measured in number of calls to, say, a \(\kappa \)-bit ideal permutation), for all circuits?

In the best case, we can garble a circuit for only 3 (amortized) calls per AND gate, whereas all prior schemes require 4. Setting aside garbled circuit size and free-XOR compatibility, is there any scheme that can garble arbitrary circuits for fewer than 4 (amortized) calls to a \(\kappa \)-bit primitive per AND gate?

Hardness Assumption. Free-XOR garbling requires some kind of circular correlation robust assumption (see [CKKZ12] for a formal statement). The state-of-the-art garbling scheme based on the minimal assumption of PRF is due to Gueron et al. [GLNP15], where AND gates cost \(2\kappa \) and XOR gates cost \(\kappa \) bits. Can our new techniques be used to improve on garbling from the PRF assumption, or alternatively can the optimality of [GLNP15] be proven? Again, our construction seems to rely heavily on the free-XOR structure of wire labels, which (apparently) makes circular correlation robustness necessary.

Privacy-Free Garbling. Frederiksen et al. [FNO15] introduced privacy-free garbled circuits, in which only the authenticity property is required of the garbling scheme. The state-of-the-art privacy-free scheme is due to [ZRE15], where XOR gates are free and AND gates cost \(\kappa \) bits. Can our new techniques lead to a privacy-free garbling scheme with fewer than \(\kappa \) bits per AND gate (with or without free-XOR)?

Simpler Description. Is there a way to describe our construction as the clean composition of simpler components, similar to how the half-gates construction is described in terms of simpler “half gate” objects? The challenge in our scheme is the way in which left-slices and right-slices of the wire labels are used together.