1 Introduction

In 2-party secure function evaluation (SFE), Alice holds a private input \(x \in X\), Bob holds a private input \(y \in Y\), and the parties interact to learn \(f(x,y)\) for some agreed-upon function \(f : X \times Y \rightarrow Z\). Each party should learn no more than can be inferred from \(f(x,y)\) alone, even when that party behaves adversarially.

Different functions f have different inherent complexities, and one way to compare the “cryptographic complexity” of functions is to use a reduction. The most natural reduction from f to g is a secure protocol for f where the parties are allowed to use an ideally-secure black-box for g (ideal here means that this black-box takes inputs from both parties, and reveals only the output of g). Depending on the security required of the protocol for f, we obtain reductions of various strengths that can resolve finer distinctions in cryptographic complexity.

Cryptographic Complexity and Related Work. In this work we focus exclusively on reductions between two-party, deterministic SFE functions with constant-size truth tables (meaning the function that is computed does not depend on the security parameter). We consider a reduction defined in terms of UC security [4] against computationally unbounded adversaries. We write \(f \sqsubseteq g\) to denote that there is a UC-secure protocol realizing f against unbounded adversaries that makes calls to an ideal g functionality (i.e., a protocol in the “g-hybrid model”).

After defining a notion of reducibility, the most natural step is to identify which objects are complete for the reduction. A function g is complete (under \(\sqsubseteq \)) if \(f \sqsubseteq g\) for all f. Otherwise we say that g is incomplete.

Kilian [8] was the first to consider completeness of SFE functionalities, proving that the oblivious transfer function is complete. Although the result pre-dates the UC model, a variant of the construction from [9] is likely to achieve UC security — i.e., oblivious transfer is complete under the \(\sqsubseteq \) reduction that we consider in this work. Later work characterized exactly which functions are complete (w.r.t. malicious, unconditional security): for symmetric SFE (where both parties receive the same output) [10], for asymmetric SFE (where only one party receives output) [11], for SFE where parties may receive different outputs [13], and even for randomized SFE functions [12, 20].

When one or both of f, g are complete, the question of \(f \sqsubseteq g\) is simple to answer. If g is complete, then \(f\sqsubseteq g\). If f is complete but g is not, then \(f \not \sqsubseteq g\). The goal of this line of work is therefore to understand when \(f \sqsubseteq g\) for f and g which are incomplete.

Prabhakaran and Rosulek [21] gave an example of four functions that satisfy \(f_1 \sqsubset f_2 \sqsubset f_3 \sqsubset f_4\) (where \(f \sqsubset g\) means that \(f \sqsubseteq g\) but \(g \not \sqsubseteq f\)). Maji, Prabhakaran and Rosulek [17] extended this result to show an infinite strict hierarchy \(f_1 \sqsubset f_2 \sqsubset \cdots \sqsubset f_i \sqsubset \cdots \), and also showed an example of a pair of functions that are incomparable (\(f \not \sqsubseteq g\) and \( g \not \sqsubseteq f\)). The same authors in [18] later proved several additional results of the form \(f \not \sqsubseteq g\).

These results hint at a rich landscape of complexity with respect to the \(\sqsubseteq \) reduction, but fall well short of revealing the entire picture. First, they are not complete characterizations, but give only necessary conditions for \(f \sqsubseteq g\). Second, the techniques in these works apply only to f and g that have semi-honest-secure protocols. This leaves a large class of functions that are neither complete nor admit any semi-honest protocol (simple characterizations of both properties are known [1, 10, 14]). A canonical example of such a function is the so-called “spiral” function shown in Fig. 1c. Currently almost nothing is known about reductions involving such intermediate functions.

Other Related Work. We consider a reduction based on UC security against unbounded adversaries. Weaker reductions have been studied, but they do not turn out to illuminate many distinctions in complexity. For example, one may define a reduction based on polynomial-time UC security. It turns out that every SFE is either complete or trivial (it reduces to every other function) under this reduction [19].

In this paper we prove qualitative differences in the power of randomized and deterministic protocols, even when realizing deterministic functions using other deterministic functions (i.e., we show specific, deterministic f and g where \(f \sqsubseteq g\) via a randomized protocol but not by any deterministic protocol). In the two-party setting, Dodis and Micali [5] show such a separation among complete f and g, for a special class of protocols (in which one party does not speak). Beimel and Malkin [2] show specific f and (complete) g for which randomized protocols make exponentially fewer calls to g than deterministic protocols.

2 Overview of Our Results

Scope of Results: Incomplete, Non-Unilateral Functions. As mentioned in the introduction, the question of \(f \sqsubseteq g\) is straightforward when one of \(\{f,g\}\) is complete. We therefore focus on characterizing \(f \sqsubseteq g\) when both are incomplete.

We say that f is unilateral if there exists an input \(y^*\) for one of the parties (by symmetry, Bob) such that \(f(\cdot , y^*)\) is a constant function. That is, by choosing input \(y^*\) Bob can unilaterally fix the output of f. We characterize \(f \sqsubseteq g\) when f is non-unilateral. In Sect. 7.1 we show an example of unilateral f, g that do not obey the characterization, demonstrating that this restriction is necessary.

Statement of Main Result. We show a complete and combinatorial characterization of \(f \sqsubseteq g\), for a natural class of protocols. For this characterization, identify each function f with its 2-dimensional truth table (rows corresponding to Alice-inputs and columns to Bob-inputs). We say that f embeds in g if f appears as a submatrix of g, subject to some other restrictions (essentially, the other parts of g can’t “interfere” with the f-submatrix—the formal definition is in Sect. 4). We then prove our main theorem:

Theorem 1

The following are equivalent, when f and g are incomplete and f is non-unilateral.

  1. \(f \sqsubseteq g\) via a worst-case \(O(\log \kappa )\)-round protocol (where \(\kappa \) is the security parameter).

  2. \(f \sqsubseteq g\) via a deterministic protocol.

  3. \(f \sqsubseteq g\) via a deterministic protocol consisting of a single call to g and no additional communication.

  4. f embeds in g.

Technical Approach. The most involved part of our main theorem is proving (1) \(\Rightarrow \) (3) and (2) \(\Rightarrow \) (3). Intuitively this involves “compressing” an arbitrary protocol for \(f \sqsubseteq g\) into a single call to g.

Our first step is to show that every secure protocol for \(f \sqsubseteq g\) can be transformed into one with the following instantaneous property:

  • With overwhelming probability, the protocol terminates immediately following some call to g.

  • Strictly before this terminal call to g, the protocol transcript leaks negligible information about either party’s inputs.

Our main technical tool is that of frontier analysis, which was introduced in [17] and extended in [16]. A frontier in the protocol is simply the collection of partial transcripts where some statistical condition is true for the first time. Roughly speaking, we define two frontiers for each party: one expressing “the first time the simulator is likely to extract” and another for “the first time honest parties can reliably predict the final output.” We then argue that these frontiers must all be reached simultaneously, with overwhelming probability. As such, these events can happen only as the result of a call to g. Furthermore, the protocol can be safely truncated after reaching the frontiers (since both parties can already predict the final output). The result of truncation is a protocol with the “instantaneous” property described above. We complete the argument by showing how such an instantaneous protocol can be compressed from \(O(\log \kappa )\) rounds to one round.

Tightness of the Characterization. Our main theorem does not characterize \(f \sqsubseteq g\) when f is unilateral. This restriction is inherent, as we demonstrate with an example in Sect. 7.1. In Sect. 7.2 we also demonstrate an example f and g with the following properties:

  1. f does not embed in g. Hence, by the classification theorem, \(f \not \sqsubseteq g\) via any deterministic protocol or (randomized) logarithmic-round protocol.

  2. \(f \sqsubseteq g\) via a randomized protocol whose expected round complexity is constant, but whose worst-case round complexity is \(r(\kappa )\) for any \(r(\kappa ) = \omega (\log \kappa )\).

This example demonstrates that our main characterization’s limitation to \(O(\log \kappa )\)-round protocols is inherent.

Interestingly, the \(\omega (\log \kappa )\)-round protocol for \(f \sqsubseteq g\) has the instantaneous property described above. Hence, the protocol leaks no information about the parties’ inputs until \(f(x,y)\) is completely revealed in a single call to g. Yet there is no way to securely compress the protocol to just the “meaningful” call to g. Somehow, it is important that the “output-fixing” round is unpredictable.

Mysteriously, a similar structure appears in the protocols of Gordon et al.  [6] that achieve fairness. These protocols leak nothing about the inputs until, in some secret round, the output is completely revealed. Analogously, Lindell and Rabin [15] show that the “output-fixing” round in a fair protocol cannot be predictable. We are not sure what fairness has to do with unfair multi-party computation with an incomplete hybrid functionality, and leave open this exploration for future work.

Ours is also one of the few examples of an \(\omega (1)\) round-complexity lower bound for information-theoretic multi-party computation. Indeed, when g is complete, \(f \sqsubseteq g\) is possible in constant rounds for any f: first, obtain oblivious transfer from g in constant rounds [10, 13], and from oblivious transfer obtain f in constant rounds [7].

3 Preliminaries

3.1 Secure Function Evaluation, UC Security

We assume the reader has familiarity with the UC framework (a brief overview is given in Appendix A). In this work we study deterministic 2-party secure function evaluation (SFE), in the universal composability (UC) framework [4] against computationally unbounded adversaries that corrupt parties statically (i.e., once and for all before the protocol begins). We consider security-with-abort, meaning that malicious parties are allowed to learn their output first, and delay the honest parties from receiving output (perhaps indefinitely).

We use the following notation:

\(f \sqsubseteq g\): there is a secure protocol (UC, unconditional) for f that uses calls to an ideal g (i.e., a secure protocol in the g-hybrid model).

\(f \sqsubseteq _1 g\): \(f \sqsubseteq g\) via a protocol that makes only a single call to g and uses no additional communication.

3.2 Combinatorial Properties of Complete/Incomplete f

We review some basics of 2-party SFE. Let f be a 2-party SFE with domain \(X \times Y\). A fundamental property of SFE has to do with decomposing the function into “rectangles.” Define \(\textsf {rect}_f(x,y) \overset{\text {def}}{=} \{ (x',y') \in X \times Y \mid f(x',y) = f(x,y') = f(x,y) \}\). We refer to \(\textsf {rect}_f(x,y)\) as a rectangle of f.

The characterization of [in]completeness for 2-party SFE is due to Kilian:

Theorem 2

([10]). f is incomplete if and only if: for all \(x,x',y,y'\),

$$f(x,y)=f(x',y)=f(x,y') \implies f(x',y')=f(x,y).$$
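Theorem 2 is mechanical to test on a constant-size truth table. The following brute-force check is a small illustration of ours (the dict encoding of f and the helper name `is_incomplete` are choices made here, not notation from the paper):

```python
from itertools import product

def is_incomplete(f, X, Y):
    """Kilian's criterion (Theorem 2): f is incomplete iff
    f(x,y) = f(x',y) = f(x,y') always forces f(x',y') = f(x,y)."""
    for x, xp, y, yp in product(X, X, Y, Y):
        if f[x, y] == f[xp, y] == f[x, yp] and f[xp, yp] != f[x, y]:
            return False   # found an OT-like 2x2 minor, so f is complete
    return True

# Example: symmetric-output 1-out-of-2 oblivious transfer (complete).
X = list(product([0, 1], repeat=2))            # Alice's two secret bits
Y = [0, 1]                                     # Bob's choice bit
ot = {(x, y): x[y] for x, y in product(X, Y)}
print(is_incomplete(ot, X, Y))                 # False
```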

A useful consequence of Theorem 2 is the following:

Observation 3

Let f be incomplete. Then for all \(x, y\), the value \(\textsf {rect}_f(x,y)\) is uniquely determined by \(f(x,y)\) and just one of \(\{x,y\}\). Likewise, \(f(x,y)\) is uniquely determined by \(\textsf {rect}_f(x,y)\) and just one of \(\{x,y\}\).

Observation 3 implies that without loss of generality we can think of the parties as computing the function \(\textsf {rect}_f(x,y)\) instead of the function \(f(x,y)\).

Figure 1 shows three example functions, with the partition into rectangles given for the incomplete functions. Note that the second function has 4 rectangles: two rectangles with output 1, and two with output 2.

Fig. 1. Example SFE functions.

In this work we restrict our attention to symmetric functions, which give the same output \(f(x,y)\) to both parties. One could easily consider asymmetric functions \(f = (f_A, f_B)\) which give output \(f_A(x,y)\) to Alice and \(f_B(x,y)\) to Bob. However, a result of Kraschewski and Müller-Quade [13] shows that all incomplete functions (even asymmetric ones) are isomorphic to some symmetric one. Hence, our restriction to symmetric functions is without loss of generality, and our theorem statements can be interpreted to apply to asymmetric functions as well.

3.3 Properties of g-Hybrid Protocols for Incomplete g

Fix a 2-party protocol \(\pi \), and let t be a partial transcript (i.e., a prefix of a complete protocol transcript). We use \(\textstyle \Pr _{\pi }[t|xy]\) to denote the probability of obtaining a protocol transcript with prefix t, when both parties run the protocol honestly with respective inputs x and y.

Write t as a sequence of messages \(t = (m_1, \ldots , m_k)\). Suppose Alice sends the odd-numbered messages. Then the choice of the odd-numbered (resp. even-numbered) messages depends only on the previous messages and x (resp. y), but not on y (resp. x). We can therefore write:

$$\begin{aligned} \textstyle \Pr _{\pi }[t|xy]&= \textstyle \prod _{i=1}^k \textstyle \Pr _{\pi }[m_i | xy, m_1 \cdots m_{i-1} ] \\&= \Big (\textstyle \prod _{i ~\text {odd}} \textstyle \Pr _{\pi }[m_i | x, m_1 \cdots m_{i-1} ] \Big ) \Big (\textstyle \prod _{i ~\text {even}} \textstyle \Pr _{\pi }[m_i | y, m_1 \cdots m_{i-1} ] \Big ) \\&\overset{\text {def}}{=} \textstyle \Pr _{\pi }[t|x] \textstyle \Pr _{\pi }[t|y] \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad (\star ) \end{aligned}$$

Here we are defining \(\textstyle \Pr _{\pi }[t|x]\) and \(\textstyle \Pr _{\pi }[t|y]\) to be equal to the parenthesized quantities. Essentially, \(\textstyle \Pr _{\pi }[t|x]\) is the probability that Alice behaves consistently with t when her input is x.
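The factorization (\(\star \)) can be checked numerically on a toy example. In the sketch below (entirely illustrative; the message distributions pA and pB are invented here, not taken from the paper), Alice sends \(m_1\), Bob replies with \(m_2\), and the two parenthesized factors multiply back to \(\textstyle \Pr _{\pi }[t|xy]\):

```python
from itertools import product

def pA(m1, x):            # Pr[Alice's message is m1 | her input is x]
    return 0.8 if m1 == x else 0.2

def pB(m2, y, m1):        # Pr[Bob's message is m2 | input y, prefix m1]
    return 0.9 if m2 == (y ^ m1) else 0.1

x, y = 1, 0
for m1, m2 in product([0, 1], repeat=2):        # all transcripts t = (m1, m2)
    pr_t_xy = pA(m1, x) * pB(m2, y, m1)         # Pr_pi[t | xy]
    pr_t_x, pr_t_y = pA(m1, x), pB(m2, y, m1)   # the factors from (*)
    assert abs(pr_t_xy - pr_t_x * pr_t_y) < 1e-12
```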

In the g-Hybrid Model. A similar property also holds when the parties can call an ideal functionality g (i.e., protocols in the g-hybrid model), but only when g is incomplete, as we describe below.

When parties invoke g, its output is added to the joint transcript. From Observation 3 this is equivalent to adding \(\textsf {rect}_g(\tilde{x},\tilde{y})\) to the transcript, where \(\tilde{x}\) and \(\tilde{y}\) were the inputs that the parties gave to this instance of g. Let \(\tilde{X} \times \tilde{Y}\) be a particular rectangle in g. Then when g is incomplete we have:

$$ \textstyle \Pr _{\pi }[\textsf {rect}_g(\tilde{x},\tilde{y}) = \tilde{X} \times \tilde{Y} \mid txy] = \textstyle \Pr _{\pi }[\tilde{x} \in \tilde{X} \mid tx] \cdot \textstyle \Pr _{\pi }[\tilde{y} \in \tilde{Y} \mid ty] $$

Alice’s choice of \(\tilde{x}\) depends only on her f-protocol input and the transcript so far; similarly \(\tilde{y}\) depends only on Bob’s input and the transcript so far. Hence, even though the parties contribute simultaneously to the transcript via a call to g (unlike when they alternate exchanging plain messages), the probability of a transcript can still be factored into independent contributions from the two parties, as in (\(\star \)).

Stateless Parties/Adversaries. The “standard” way of defining a protocol is for each party to initially choose a random tape. Their subsequent behavior is a deterministic function of the random tape, their input, and the transcript so far.

However, (\(\star \)) shows that Alice’s view (including her private randomness) is independent of Bob’s view (including his randomness), given the transcript. Therefore, any g-hybrid protocol \(\pi \) can be purged of stateful randomness in the following way. At each step, a stateless party can (1) sample a random tape conditioned on it being consistent with their private input and transcript so far; (2) use that (ephemeral) random tape to choose the next move in the protocol; (3) discard the ephemeral random tape. Note that this transformation may require exponential time, but we consider all parties to have unbounded computation. This transformation also applies to adversaries, so without loss of generality we consider only stateless adversaries.
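The resampling step can be made concrete with a toy example. In this sketch (ours: the next-message function, the 16-element tape space, and the convention that the party speaks at even transcript positions are all invented for illustration), the consistency filter in step (1) is exactly the part that can take exponential time in general:

```python
import random

TAPES = range(16)   # a toy space of random tapes

def next_msg(tape, x, transcript):
    # toy deterministic next-message function of tape, input, transcript
    return (tape + x + sum(transcript)) % 2

def consistent(tape, x, transcript):
    # Does this tape reproduce every message the party already sent?
    return all(next_msg(tape, x, transcript[:i]) == transcript[i]
               for i in range(0, len(transcript), 2))

def stateless_move(x, transcript):
    # (1) sample an ephemeral tape consistent with (x, transcript) ...
    tape = random.choice([r for r in TAPES if consistent(r, x, transcript)])
    # (2) use it for one move; (3) it is then simply discarded.
    return next_msg(tape, x, transcript)
```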

4 Reducibility Characterization

We define the combinatorial condition at the heart of our main theorem. Intuitively, f embeds in g if one can identify a submatrix of g that “looks like” f. Of course, the outputs of f might be renamed relative to g. Such a submatrix property suffices for a semi-honest protocol for f using g, where parties simply use the subset of inputs of g that comprise the f-submatrix. However, such a protocol need not be secure in the presence of malicious adversaries, because other inputs of g may “interfere” with the f-submatrix. There are two main things that can go wrong, epitomized in the following examples:

[Figure: truth tables of \(f_1, g_1\) and \(f_2, g_2\); in each \(g_i\), the \(f_i\)-submatrix appears in white and the interfering column is shaded.]

Note that \(f_1\) appears as the white submatrix of \(g_1\). However, when a corrupt column-player cheats and uses the shaded column of \(g_1\), he completely learns the row-player’s input, even though no column of \(f_1\) legally allows this.

Similarly, \(f_2\) appears as a submatrix of \(g_2\). Consider a corrupt column-player who uses the shaded column of \(g_2\). There is no single input for \(f_2\) that “explains” the effect of this behavior for all possible inputs of the row-player. Concretely, there is no input of \(f_2\) that guarantees an output in \(\{2,3\}\).

The requirements for embedding are formalized in the following definition:

Definition 4

For two functions \(\alpha \) and \(\beta \) we say that \(\alpha \) leaks no more than \(\beta \) if \(\beta (y) = \beta (y') \Rightarrow \alpha (y) = \alpha (y')\) for all inputs \(y,y'\). We say that \(\alpha \) refines \(\beta \) if \(\beta (y) \in \{ \alpha (y), \bot \}\) for all inputs y.

Let \(f : X \times Y \rightarrow Z\) and \(g : \widehat{X}\times \widehat{Y}\rightarrow \widehat{Z}\). Without loss of generality, assume \(Z = f(X,Y)\) and \(\widehat{Z} = g(\widehat{X},\widehat{Y})\) (i.e., every value in the output sets is actually attained). We say that f embeds in g if:

  1. (f appears as a submatrix in g) There exist two injective mappings, \(A: X \rightarrow \widehat{X}\) and \(B: Y \rightarrow \widehat{Y}\), and a third mapping, \(C: \widehat{Z} \rightarrow Z \cup \{\bot \},\) such that \(\forall x \in X, y \in Y: f(x,y) = C( g(A(x), B(y)) )\).

  2. (security guarantees) There exist mappings \(\widehat{A}: \widehat{X} \rightarrow X\) and \(\widehat{B}: \widehat{Y} \rightarrow Y\) such that the following hold:

     (a) (g doesn’t reveal too much information)

       • for all \(\widehat{x}\in \widehat{X}\), \(g(\widehat{x}, B(\cdot ))\) leaks no more than \(f(\widehat{A}(\widehat{x}), \cdot )\)

       • for all \(\widehat{y}\in \widehat{Y}\), \(g(A(\cdot ), \widehat{y})\) leaks no more than \(f(\cdot , \widehat{B}(\widehat{y}))\)

     (b) (there are no ambiguous g-inputs)

       • for all \(\widehat{x}\in \widehat{X}\), \(f(\widehat{A}(\widehat{x}),\cdot )\) refines \(C(g(\widehat{x}, B(\cdot )))\).

       • for all \(\widehat{y}\in \widehat{Y}\), \(f(\cdot , \widehat{B}(\widehat{y}))\) refines \(C(g(A(\cdot ), \widehat{y}))\).
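Because all of the objects in Definition 4 are finite, the embedding condition can be decided by exhaustive search. The sketch below is our own brute-force rendering of the definition (truth tables are dicts, `None` plays the role of \(\bot \), and the search is exponential time, which is acceptable for constant-size truth tables):

```python
from itertools import permutations, product

def leaks_no_more(alpha, beta, dom):
    # "alpha leaks no more than beta": beta(y) = beta(y') => alpha(y) = alpha(y')
    return all(alpha[y] == alpha[yp]
               for y, yp in product(dom, repeat=2) if beta[y] == beta[yp])

def refines(alpha, beta, dom):
    # "alpha refines beta": beta(y) in {alpha(y), ⊥} for all y
    return all(beta[y] in (alpha[y], None) for y in dom)

def embeds(f, X, Y, g, Xh, Yh, Zh):
    for Avals in permutations(Xh, len(X)):        # injective A : X -> Xh
        A = dict(zip(X, Avals))
        for Bvals in permutations(Yh, len(Y)):    # injective B : Y -> Yh
            B = dict(zip(Y, Bvals))
            # Condition 1 pins C on every g-output hit by the f-submatrix.
            C, ok = {}, True
            for x, y in product(X, Y):
                ok = ok and C.setdefault(g[A[x], B[y]], f[x, y]) == f[x, y]
            if not ok:
                continue
            # Elsewhere set C = ⊥; this is WLOG since ⊥ never hurts "refines".
            C = {z: C.get(z) for z in Zh}

            def row_ok(xh, Ah):   # condition 2 for one g-input of Alice
                frow = {y: f[Ah[xh], y] for y in Y}
                grow = {y: g[xh, B[y]] for y in Y}
                return (leaks_no_more(grow, frow, Y) and
                        refines(frow, {y: C[grow[y]] for y in Y}, Y))

            def col_ok(yh, Bh):   # condition 2 for one g-input of Bob
                fcol = {x: f[x, Bh[yh]] for x in X}
                gcol = {x: g[A[x], yh] for x in X}
                return (leaks_no_more(gcol, fcol, X) and
                        refines(fcol, {x: C[gcol[x]] for x in X}, X))

            # The searches for Ahat and Bhat are independent of each other.
            if (any(all(row_ok(xh, dict(zip(Xh, Ah))) for xh in Xh)
                    for Ah in product(X, repeat=len(Xh))) and
                any(all(col_ok(yh, dict(zip(Yh, Bh))) for yh in Yh)
                    for Bh in product(Y, repeat=len(Yh)))):
                return True
    return False
```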

To understand this definition, it helps to see how the mappings \(A, B, C, \widehat{A}, \widehat{B}\) relate to a secure protocol demonstrating \(f \sqsubseteq _1 g\):

Lemma 5

If f embeds in g, then \(f \sqsubseteq _1 g\) via a deterministic protocol. This proves (4) \(\Rightarrow \) [(1) \(\wedge \) (2) \(\wedge \) (3)] of Theorem 1 (stated in Sect. 2).

Proof

Let f embed in g, with associated mappings as in Definition 4. The protocol for f is as follows:

  • Alice sends input \(A(x)\) to g where x is her f-input.

  • Bob sends input \(B(y)\) to g where y is his f-input.

  • The parties both output \(C(z)\) where z is the output they receive from g (they output \(\bot \) if g gives output \(\bot \)).
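In code, the entire protocol is a single table lookup on each side. A minimal sketch, using the same dict encoding as the earlier illustrations (with `None` for \(\bot \)):

```python
def one_call_protocol(x, y, g, A, B, C):
    # Alice contributes A(x) and Bob contributes B(y) to the single g-call;
    # both then locally decode the common g-output through C.
    z = g[A[x], B[y]]
    return C[z]          # None encodes the abort output ⊥
```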

Correctness follows from the first condition of Definition 4. Due to the symmetry in the definitions/protocol, we show security only against a malicious Alice.

Suppose Alice sends input \(\widehat{x}\) to g. In the real protocol, Alice’s view will consist of \(g(\widehat{x}, B(y))\) and Bob’s output will be \(C(g(\widehat{x}, B(y)))\). In the ideal world, the simulator will do the following:

  • The simulator sends \(x^* = \widehat{A}(\widehat{x})\) to the ideal f, and obtains output \(f(x^*,y) = C( g( A(x^*), B(y) ))\).

  • The simulator does not know Bob’s input y but can choose any \(y'\) such that \(f(x^*, y') = f(x^*,y)\). The simulator can give \(g(\widehat{x}, B(y'))\) to Alice as her simulated view. From part 2a of Definition 4, we have that this is identical to the real view \(g(\widehat{x}, B(y))\).

  • The simulator checks whether \(C( g(\widehat{x}, B(y))) = \bot \), and if so sends \((\textsc {deliver},0)\) to f. In this case, Bob’s real and ideal outputs will both be \(\bot \). Otherwise, it sends \((\textsc {deliver},1)\) to f and Bob will receive output \(f(x^*,y)\).

Bob’s ideal output is \(f(x^*,y) = C( g( A(x^*), B(y) ))\). From condition 2b of Definition 4, this matches the real output \(C(g(\widehat{x}, B(y)))\).

Lemma 6

For non-unilateral f, if \(f \sqsubseteq _1 g\) via a deterministic protocol with simulation error less than 1, then f embeds in g. This proves (3) \(\Rightarrow \) (4) of Theorem 1 (stated in Sect. 2).

Proof

(Sketch). The full proof is in Appendix B. The fact that \(f \sqsubseteq _1 g\) by some deterministic protocol immediately reveals part 1 of the embedding: there must be some set of mappings that Alice and Bob use to map f-inputs to g-inputs and g-outputs to f-outputs.

The main technical portion of this proof shows that, if the simulator mappings \(\widehat{A}\) and \(\widehat{B}\) do not follow the rules in parts 2a and 2b of Definition 4, then the simulation error is 1, contradicting our assumption that the simulation error is less than 1. The intuition for these attacks is clear from the examples above.

5 Instantaneous Protocols

In this section we show how to transform any secure protocol in the g-hybrid model into one that has an “instantaneous” property (described further in Sect. 5.4). The results in this section apply to arbitrary protocols. Later in Sect. 6 we give further transformations that are restricted to deterministic or logarithmic-round protocols.

5.1 Frontier Basics

Recall that \(\textstyle \Pr _{\pi }[t|xy]\) refers to the probability that the protocol results in transcript with prefix t, when run honestly on inputs x and y. We write \(\textstyle \Pr _{\pi }[\mathcal {E}|txy]\) to denote the probability that event \(\mathcal {E}\) happens, given that the parties run honestly with inputs x and y, and conditioned on t being a prefix of the transcript.

Let F be any set of partial protocol transcripts, with the property that if \(t \in F\), and t is a prefix of \(t'\), then \(t'\in F\). In other words, F describes an event in the protocol that happens and does not “unhappen.” In this case we call F a frontier, using the terminology of [17].

It is sometimes helpful to associate the frontier F with its set of prefix-minimal elements, as these represent transcripts where some condition happened for the first time. Let \(\textsf {first}(F)\) denote the prefix-minimal elements of F.

If F is a frontier, we use notation \(\textstyle \Pr _{\pi }[F|xy]\) to denote the probability that F is encountered when running the protocol honestly on inputs x and y. More formally:

$$ \textstyle \Pr _{\pi }[F|xy] \overset{\text {def}}{=} \sum _{t \in \textsf {first}(F)} \textstyle \Pr _{\pi }[t|xy] $$

Finally, if F and G are two frontiers, then “\(F < G\)” denotes the event “either F happens strictly before G, or F happens and G never happens.” More formally,

$$ \textstyle \Pr _{\pi }[ F < G |xy] \overset{\text {def}}{=} \sum _{t \in \textsf {first}( F \setminus G)} \textstyle \Pr _{\pi }[t|xy] $$
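These definitions translate directly into code. In the sketch below (ours; transcripts are strings, a frontier is a plain set, and `pr` is a hypothetical dict mapping each prefix t to \(\textstyle \Pr _{\pi }[t|xy]\)):

```python
def first(F):
    # prefix-minimal elements: t is in first(F) iff no proper prefix of t is in F
    return {t for t in F if not any(t[:i] in F for i in range(len(t)))}

def pr_frontier(F, pr):
    # Pr_pi[F | xy]: sum of prefix probabilities over first(F)
    return sum(pr[t] for t in first(F))

def pr_before(F, G, pr):
    # Pr_pi[F < G | xy]: F happens strictly before G, or G never happens
    return sum(pr[t] for t in first(F - G))

# Toy check: binary transcripts of length <= 2; frontier "a 1 was sent".
F = {t for t in ("0", "1", "00", "01", "10", "11") if "1" in t}
pr = {"0": .5, "1": .5, "00": .25, "01": .25, "10": .25, "11": .25}
print(sorted(first(F)), pr_frontier(F, pr))   # ['01', '1'] 0.75
```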

5.2 Our Frontiers

Our analysis relies on two types of frontiers that we introduce:

  • \(F^{x}_{\textsf {A}\hbox {-}\textsf {ext}}\): captures the first time that the simulator has extracted with reasonable probability, in an ideal-world interaction involving a corrupt Alice running honestly on input x.

  • \(F^{x}_{\textsf {A}\hbox {-}\textsf {out}}\): captures the first time that Alice’s output becomes relatively fixed, in the following sense. If the parties continue with honest behavior from such a point in the protocol, and Alice has input x, then Alice has only one likely output, no matter what Bob’s input is.

We define such a frontier for every input x. We also define analogous frontiers with respect to Bob.

We have already defined \(\textstyle \Pr _{\pi }[\,\cdot \,| xy]\) notation with respect to an honest execution of the protocol on inputs x and y. Since \(F^{x}_{\textsf {A}\hbox {-}\textsf {ext}}\) refers to probabilities in an ideal-model interaction, we introduce notation to differentiate between the probabilities in real and ideal interactions. We write \(\textstyle \Pr _{\textsf {A}\hbox {-}\textsf {sim}}[\,\cdot \,|xy]\) to refer to probabilities induced by an ideal-model interaction among malicious Alice running the protocol honestly on input x, the simulator for corrupt Alice, and ideal honest Bob with input y. \(\textstyle \Pr _{\textsf {B}\hbox {-}\textsf {sim}}\) is defined analogously.

Definition 7

In an ideal interaction between corrupt Alice and the simulator, the simulator will at some point “extract” by sending an input to the ideal f. Define:

$$ \sigma _{\textsf {A}}(t,x) \overset{\text {def}}{=} \textstyle \Pr _{\textsf {A}\hbox {-}\textsf {sim}}[\!\!\text{ simulator } \text{ has } \text{ previously } \text{ extracted } | txy] $$

That is, \(\sigma _{\textsf {A}}(t,x)\) is the probability that the simulator has extracted, given that the transcript so far is t. We define \(\sigma _{\textsf {B}}\) analogously.

In somewhat more detail, consider formally defining a simulator in terms of its next-message function. Given as input its view so far (messages exchanged with the adversary and functionality, and internal state), it outputs either \((\textsc {Prot}, m)\), to indicate sending a simulated protocol message m to the adversary, or \((\textsc {Ext}, x)\), to indicate sending an input x to the ideal functionality. Until the simulator talks to the ideal functionality, the only interaction is between the adversary and the simulator. As such, the simulator may be stateless for this period of time without loss of generality (by the same reasoning as in Sect. 3.3). The simulator’s view certainly indicates whether extraction has happened (i.e., whether the view contains an \(\textsc {Ext}\) message). Since the moment of extraction (the \(\textsc {Ext}\) message) is the first place that the simulator’s view and the adversary-simulator transcript diverge, \(\sigma _{\textsf {A}}\) can be defined as a function of the transcript only.

Let \(\sigma _{\textsf {A}}^*(t,x)\) denote the probability that the simulator has decided to extract “at this instant,” i.e., in response to the most recent protocol message sent by the adversary. Formally, let \(t = (t_1, \ldots , t_n)\) be the partial transcript, where Alice is corrupt and speaks first:

$$\begin{aligned} \sigma _{\textsf {A}}^*(t,x) \overset{\text {def}}{=} \textstyle \Pr _{\pi }[t|x] \left( \displaystyle \prod _{\begin{array}{c} i ~\text {even} \\ i < n \end{array}} \Pr [ \mathcal {S}(t_1 \cdots t_i) = (\textsc {Prot},t_{i+1})] \right) \Pr [ \mathcal {S}(t) = (\textsc {Ext},\cdot )] \end{aligned}$$

where \(\mathcal {S}\) is the simulator’s next-message function. Then

$$ \sigma _{\textsf {A}}(t,x) = \sum _{i < n} \sigma _{\textsf {A}}^*\Big ( (t_1, \ldots , t_i), x \Big ) $$

Note that before the simulator extracts, its view is perfectly independent of y in the ideal interaction. Its decision to extract, and hence the probability \(\sigma _{\textsf {A}}(t,x)\), depends only on x and not on y.

With that in mind, note that we have defined \(\sigma _{\textsf {A}}\) to refer to the probability that extraction has happened strictly in the past (note \(i< n\) in the summation above). Another way to interpret \(\sigma _{\textsf {A}}\) is “the probability that the transcript might be affected by the honest party’s input y.” Hence, as the transcript evolves, the value of \(\sigma _{\textsf {A}}\) cannot change as a result of a message sent by Alice. It can only change as a result of a message generated by the simulator, hence an output of g or a simulated Bob-message.

Finally, note that our terminology considers when the simulator actually extracts, and not when the simulator has in principle enough information to extract. Again, the important issue is whether the simulator has already contacted the ideal functionality, and therefore the transcript may be influenced by the honest party’s input.

Definition 8

Given a secure protocol \(\pi \) with simulation error \(\varepsilon \), define the following for all inputs \(x, y\):

$$\begin{aligned} F^{x}_{\textsf {A}\hbox {-}\textsf {ext}}&= \{ t \mid \sigma _{\textsf {A}}(t,x)> 4\sqrt{\varepsilon }\} \\ F^{y}_{\textsf {B}\hbox {-}\textsf {ext}}&= \{ t \mid \sigma _{\textsf {B}}(t,y) > 4\sqrt{\varepsilon }\} \\ F^{x}_{\textsf {A}\hbox {-}\textsf {out}}&= \{ t \mid \forall y,y': f(x,y) \ne f(x,y') \Rightarrow \min \left\{ \begin{array}{c} \textstyle \Pr _{\pi }[\!\!\text{ out } f(x,y)|txy], \\ \textstyle \Pr _{\pi }[\!\!\text{ out } f(x,y')|txy'] \end{array} \right\}< 1-\sqrt{\varepsilon }\} \\ F^{y}_{\textsf {B}\hbox {-}\textsf {out}}&= \{ t \mid \forall x,x': f(x,y) \ne f(x',y) \Rightarrow \min \left\{ \begin{array}{c} \textstyle \Pr _{\pi }[\!\!\text{ out } f(x,y)|txy], \\ \textstyle \Pr _{\pi }[\!\!\text{ out } f(x',y)|tx'y] \end{array} \right\} < 1-\sqrt{\varepsilon }\} \end{aligned}$$

Here \(\textstyle \Pr _{\pi }[\!\!\text{ out } z|txy]\) refers to the probability that honest parties output z when starting the protocol at partial transcript t and running honestly with inputs x and y.

To understand \(F^{x}_{\textsf {A}\hbox {-}\textsf {out}}\), observe that for \(t \in F^{x}_{\textsf {A}\hbox {-}\textsf {out}}\) there is at most one output that can be induced with probability at least \(1-\sqrt{\varepsilon }\). It may be the case that no valid output can be induced with this probability, in which case only \(\bot \) output is likely from starting point t.

Note that if \(\varepsilon \) is a negligible function of the security parameter, then \(\sqrt{\varepsilon }\) is a larger function but also still negligible.

5.3 Properties of the Frontiers

We now show that, roughly speaking, all the frontiers that we have defined must occur simultaneously, with overwhelming probability. Proofs for the lemmas in this section are given in Appendix C. Note that all lemmas hold with the roles of Alice and Bob reversed.

Lemma 9

For all \(x, y\): \(\textstyle \Pr _{\pi }[ F^{x}_{\textsf {A}\hbox {-}\textsf {ext}}< F^{y}_{\textsf {B}\hbox {-}\textsf {out}} \mid x y] < 2\sqrt{\varepsilon }\).

Proof

(sketch). A partial transcript \(t \in F^{x}_{\textsf {A}\hbox {-}\textsf {ext}} \setminus F^{y}_{\textsf {B}\hbox {-}\textsf {out}}\) represents a situation where there is reasonable probability that the simulator would have extracted from Alice in the ideal-interaction (\(t \in F^{x}_{\textsf {A}\hbox {-}\textsf {ext}}\)), but in the real-interaction Alice can still induce two different outputs for Bob, each with good probability (\(t \not \in F^{y}_{\textsf {B}\hbox {-}\textsf {out}}\)). Intuitively, the simulator has extracted prematurely. This event should be rare.

Next we show that \(F^{x}_{\textsf {A}\hbox {-}\textsf {out}}\) is a point at which the honest parties can predict their eventual output.

Definition 10

Fix x and let \(t \in F^{x}_{\textsf {A}\hbox {-}\textsf {out}}\). Then there is at most one value z such that \(\exists y: \textstyle \Pr _{\pi }[\!\!\text{ out } z |xyt] > 1-\sqrt{\varepsilon }\). Let \(\textsf {guess}_{\textsf {A}}(t,x)\) denote this value z, and note that the value could be \(\bot \). We extend the notation \(\textsf {guess}_{\textsf {A}}(t,x) = \bot \) in the case that \(t \not \in F^{x}_{\textsf {A}\hbox {-}\textsf {out}}\).

Lemma 11

For \(z \ne \bot \) define \(G^z = \{ t \mid \textsf {guess}_{\textsf {A}}(t,x) = z \}\). Then for all \(x, y\): \(\textstyle \Pr _{\pi }[ G^{f(x,y)} |xy] > 1- \varepsilon /2\). Intuitively, upon reaching \(F^{x}_{\textsf {A}\hbox {-}\textsf {out}}\), Alice can predict her eventual output with error at most \(\varepsilon /2\).

Lemma 12

For all \(x, y\) (x not unilateral for f), \(\textstyle \Pr _{\pi }[ F^{x}_{\textsf {A}\hbox {-}\textsf {out}}< F^{x}_{\textsf {A}\hbox {-}\textsf {ext}} \mid xy] < 16\varepsilon \).

Proof

(Sketch). A partial transcript \(t \in F^{x}_{\textsf {A}\hbox {-}\textsf {out}} \setminus F^{x}_{\textsf {A}\hbox {-}\textsf {ext}}\) represents a situation where Alice can predict what the output will be (\(t \in F^{x}_{\textsf {A}\hbox {-}\textsf {out}}\)), but the simulator probably has not extracted yet (\(t \not \in F^{x}_{\textsf {A}\hbox {-}\textsf {ext}}\)). This event should be rare, since in the ideal interaction Alice can gain no information about the f-output before the simulator extracts (assuming x is not a unilateral input, so that the output indeed depends on Bob’s input).

Lemma 13

For all \(x, y\) (y not unilateral), \(\textstyle \Pr _{\pi }[ F^{y}_{\textsf {B}\hbox {-}\textsf {out}}< F^{x}_{\textsf {A}\hbox {-}\textsf {out}} |xy] < 18\sqrt{\varepsilon }\).

Proof

(Sketch). This follows from the fact that if \(F^{y}_{\textsf {B}\hbox {-}\textsf {out}} < F^{x}_{\textsf {A}\hbox {-}\textsf {out}}\) then either \(F^{y}_{\textsf {B}\hbox {-}\textsf {out}} < F^{y}_{\textsf {B}\hbox {-}\textsf {ext}}\) or \(F^{y}_{\textsf {B}\hbox {-}\textsf {ext}} < F^{x}_{\textsf {A}\hbox {-}\textsf {out}}\), both of which are negligibly likely from Lemmas 9 and 12.

Lemma 14

For all \(x,y',y\) (\(x,y'\) not unilateral), \(\textstyle \Pr _{\pi }[ F^{y'}_{\textsf {B}\hbox {-}\textsf {out}}< F^{y}_{\textsf {B}\hbox {-}\textsf {out}} |xy] < 42\sqrt{\varepsilon }\).

Proof

(Sketch). If \(F^{y'}_{\textsf {B}\hbox {-}\textsf {out}} < F^{y}_{\textsf {B}\hbox {-}\textsf {out}}\) then either \(F^{y'}_{\textsf {B}\hbox {-}\textsf {out}} < F^{x}_{\textsf {A}\hbox {-}\textsf {ext}}\) or \(F^{x}_{\textsf {A}\hbox {-}\textsf {ext}} < F^{y}_{\textsf {B}\hbox {-}\textsf {out}}\).

We can argue that the first case \(F^{y'}_{\textsf {B}\hbox {-}\textsf {out}} < F^{x}_{\textsf {A}\hbox {-}\textsf {ext}}\) would be negligibly likely, if the parties run honestly on inputs \(x,y'\). Unfortunately here we are using input y for Bob. But consider the ideal interaction with corrupt Alice. We are interested in an event in which the simulator is not likely to have extracted from Alice (\(t \not \in F^{x}_{\textsf {A}\hbox {-}\textsf {ext}}\)). Conditioned on the simulator not extracting, the protocol transcript is independent of Bob’s input. Hence whatever is unlikely with input \(y'\) for Bob is also unlikely with input y for Bob.

The second case \(F^{x}_{\textsf {A}\hbox {-}\textsf {ext}} < F^{y}_{\textsf {B}\hbox {-}\textsf {out}}\) is negligibly likely by Lemma 9.

5.4 Securely Truncating a Protocol

Lemma 15

Let \(\pi \) be a secure protocol for f in the g-hybrid model. Define \(\pi '\) to be the following:

  • On input x for Alice and y for Bob, both parties run \(\pi \) honestly on their given inputs.

  • When the protocol transcript t reaches \(F^{\tilde{x}}_{\textsf {A}\hbox {-}\textsf {out}}\) for any \(\tilde{x}\), or reaches \(F^{\tilde{y}}_{\textsf {B}\hbox {-}\textsf {out}}\) for any \(\tilde{y}\), the parties terminate the protocol.

  • Alice outputs \(\textsf {guess}_{\textsf {A}}(t,x)\) and Bob outputs \(\textsf {guess}_{\textsf {B}}(t,y)\).

Then the truncated protocol \(\pi '\) is also a secure protocol for f.

Proof

Let \(\varepsilon \) denote the simulation error of \(\pi \). First, we argue that \(\pi '\) is correct. Alice’s output is \(\textsf {guess}_{\textsf {A}}(t,x)\), which differs from the correct answer f(xy) only in the following events:

  • \(t \not \in F^{x}_{\textsf {A}\hbox {-}\textsf {out}}\) because the protocol reached \(F^{x'}_{\textsf {A}\hbox {-}\textsf {out}}\) and terminated strictly before reaching \(F^{x}_{\textsf {A}\hbox {-}\textsf {out}}\) for \(x' \ne x\). By Lemma 14, this can happen only with probability \(O(\sqrt{\varepsilon })\).

  • \(t \not \in F^{x}_{\textsf {A}\hbox {-}\textsf {out}}\) because the protocol reached \(F^{y}_{\textsf {B}\hbox {-}\textsf {out}}\) and terminated strictly before reaching \(F^{x}_{\textsf {A}\hbox {-}\textsf {out}}\). By Lemma 13, this can happen only with probability \(O(\sqrt{\varepsilon })\).

  • \(t \in F^{x}_{\textsf {A}\hbox {-}\textsf {out}}\) but \(\textsf {guess}_{\textsf {A}}(t,x) \ne f(x,y)\). By Lemma 11, this can only happen with probability \(O(\varepsilon )\).

As for security, the only difference between \(\pi \) and \(\pi '\) is that \(\pi '\) truncates early based on some condition. But this condition is public and independent of either party’s private inputs. Hence the simulation for \(\pi '\) works as follows. It simply runs the simulator for \(\pi \) but terminates the protocol when the transcript reaches the public termination condition.

Overall \(\pi '\) is a secure protocol with negligible simulation error \(O(\sqrt{\varepsilon })\).

Observe that the new protocol \(\pi '\) has the “instantaneous” property discussed in Sect. 2. Importantly for our purposes in the next section, with overwhelming probability \(1-O(\sqrt{\varepsilon })\) the protocol terminates on a transcript that is both in \(F^{x}_{\textsf {A}\hbox {-}\textsf {out}}\) and \(F^{y}_{\textsf {B}\hbox {-}\textsf {out}}\). Such a transcript must end with a message produced by the simulator in both ideal interactions (i.e., when either party is corrupt). Hence the last protocol message must be an output of g, with overwhelming probability.

6 Collapsing Protocols to a Single Call to g

We complete our main theorem with the following lemmas.

Lemma 16

If, for incomplete and non-unilateral f and g, \(f \sqsubseteq g\) via a protocol with a strict upper bound of \(r = O(\log \kappa )\) on its number of rounds, then f embeds in g. This proves (1) \(\Rightarrow \) (4) of Theorem 1 (stated in Sect. 2).

Proof

(Sketch). The full proof is in Appendix D. Without loss of generality (from Lemma 15) the last step in \(\pi \) (in particular, the action in final round r) is a call to g with overwhelming probability.

We consider two cases. Consider a call to g that happens in the last round, following some partial transcript t. Imagine a new protocol where the parties simply “fast-forward” directly to this g-call by behaving as if the transcript so far was t. The result is a protocol consisting of a single call to g. If any call to g yields a secure protocol for f in this way, then we are done (we in fact have a 1-round protocol for f).

In the other case, there may be no call to g during the final round of \(\pi \) that yields a secure protocol for f in this way. Intuitively, every time the protocol runs for the full r rounds there would have been a successful attack on the final call to g! Hence it must be negligibly unlikely that \(\pi \) would ever run for r rounds. We show that, in this case, truncating \(\pi \) after \(r-1\) rounds results in a secure protocol for f.

We can repeatedly apply this argument at most \(r-1\) times until we are guaranteed to obtain a 1-round protocol demonstrating \(f \sqsubseteq _1 g\). The parameters are such that after truncating \(r-1\) rounds, the resulting protocol has simulation error \(c^{r-1} \sqrt{\varepsilon }\) for some constant c. Such a protocol is secure as long as \(r = O(\log \kappa )\), since \(c^{O(\log \kappa )} \sqrt{\varepsilon } = \textsf {poly}(\kappa )\sqrt{\varepsilon }\), which is negligible.

Corollary 17

If \(f \sqsubseteq g\) via a deterministic protocol (of any number of rounds) then f embeds in g. This proves (2) \(\Rightarrow \) (4) of Theorem 1 (stated in Sect. 2).

Proof

Deterministic protocols have zero simulation error (without loss of generality). Therefore, the same reasoning as in the previous proof applies but without any error accumulating with each round.

7 Tightness of the Characterization, Limitations

In this section we discuss why our main characterization does not extend (without modification) to consider unilateral functions or superlogarithmic-round, randomized protocols.

In Appendix E we discuss the possibility of extending our protocol model to allow parallel calls to g.

7.1 Unilateral Functions

Failure of Our Characterization on Unilateral Functions. In Fig. 2 we give unilateral functions f and g. Bob is the column-player; in f he has two unilateral inputs, labeled B and C.

First, we argue that \(f \not \sqsubseteq _1 g\). Suppose for the sake of contradiction that such a protocol exists. Consider the simulator for a corrupt Bob who runs the protocol semi-honestly, on an f-input chosen uniformly at random. The only message that the simulator sees is Bob’s input to g, after which the simulator must extract an f-input to send to the ideal f. The simulator gets only one bit of information about Bob’s input (as there are only 2 possible g-inputs), while there are 3 possibilities for the extracted f-input. It follows that with constant probability the simulator extracts the wrong input, and this error will be evident in the output of f.

Fig. 2. Unilateral functions violating the main theorem.

Fig. 3. Functions violating the main theorem via a superlogarithmic-round protocol. Note that the bottom-right \(3 \times 3\) submatrix is unlike the others.

However, there is a simple protocol for f using g, sketched in code below: Alice sends her f-input directly to g. If Bob has f-input A, he should choose g-input \(A'\). In this case, the parties will see that the g-output is in \(\{0,1\}\) and they terminate with this as their f-output. Otherwise, if Bob has f-input B or C, he should choose g-input \(B'\). In this case, the parties will see that the g-output is 2, and then Alice will wait for Bob to send a plain message containing either “2” or “3.” Alice takes this message to be her output.
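Since Fig. 2 is not reproduced here, the sketch below uses a hypothetical instantiation of the truth tables that is consistent with the prose (Alice holds one bit; f-input A reveals that bit, while B and C unilaterally fix the outputs 2 and 3; g-input \(A'\) passes the bit through and \(B'\) forces output 2):

```python
# Hypothetical truth tables consistent with the description of Fig. 2.
f = {(x, yb): {"A": x, "B": 2, "C": 3}[yb] for x in (0, 1) for yb in "ABC"}
g = {(x, yb): {"Ap": x, "Bp": 2}[yb] for x in (0, 1) for yb in ("Ap", "Bp")}

def protocol(x, y):
    yg = "Ap" if y == "A" else "Bp"   # Bob maps A -> A' and {B, C} -> B'
    z = g[x, yg]                      # Alice sends her f-input directly to g
    if z in (0, 1):
        return z                      # g-output in {0,1}: terminate with it
    return {"B": 2, "C": 3}[y]        # otherwise Bob announces "2" or "3"

assert all(protocol(x, y) == f[x, y] for x in (0, 1) for y in "ABC")
```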

It is simple to see that this protocol is secure against a malicious Alice. For a malicious Bob, the simulator does the following. If Bob chooses g-input \(A'\), then the simulator extracts Bob’s ideal f-input as A and simulates the g-output to equal the ideal f-output. If Bob chooses g-input \(B'\), then the simulator gives 2 as the simulated g-output, then waits for a message from Bob (either “2” or “3”) and uses this as the extracted ideal f-input. The reason the simulation is secure is that in the second case (Bob chooses g-input \(B'\)), the fact that this is a unilateral input means that the simulator doesn’t need to know Alice’s input to perfectly simulate the g-output. Hence the simulator can delay extraction until the second protocol message, where intuitively Bob resolves which unilateral input he has.

Hence, we have \(f \sqsubseteq g\) via a protocol consisting of a single call to g, plus (in some cases) one extra message. It is a deterministic, constant-round protocol, and yet \(f \not \sqsubseteq _1 g\). This example shows that our classification does not extend to unilateral functions.

7.2 Deterministic/Logarithmic-Round Protocols

Consider the functions f and g in Fig. 3. We first claim that f does not embed in g. Any embedding would map the 3 f-columns into 3 distinct g-columns. For any 3 columns of g, there exists a row on which these columns have distinct entries – this is simple (albeit time-consuming) to verify; a mechanical check is sketched below. However, there is no row in f that has three distinct values. Hence the embedding would contradict rule 2a of the embedding definition. Concretely, any candidate protocol for \(f \sqsubseteq _1 g\) would allow a corrupt row-player to learn the column-player’s input in its entirety, which is not allowed by f.
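The “time-consuming to verify” step is mechanical. Given g’s truth table as a list of rows (Fig. 3 itself is not reproduced here, so this is a checker one would run against that matrix rather than a verification we can perform inline):

```python
from itertools import combinations

def every_column_triple_distinguished(g_rows):
    # True iff for every 3 distinct columns of g there is some row
    # on which those columns take 3 distinct values.
    ncols = len(g_rows[0])
    return all(
        any(len({row[c] for c in triple}) == 3 for row in g_rows)
        for triple in combinations(range(ncols), 3))
```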

However, there is a protocol for f that uses g. We group the rows and columns of g into groups of three, as distinguished by the dotted lines in the figure. Associate the first row of f with the first row group of g, etc. Similarly, associate the first column of f with the first column group of g, etc. The protocol for f is as follows:

  • Alice chooses a g-input from the row group associated with her f-input, uniformly at random.

  • Bob chooses a g-input from the column group associated with his f-input, uniformly at random.

  • They call g with their selected g-inputs.

  • If the output of the g-call was in \(\{A, B, C, D, E\}\), terminate the protocol with that output. Otherwise, repeat (with fresh random choices for the g-inputs).
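The repeat-until-terminal structure of this protocol is captured by the following sketch (ours; `g_table`, the input groups, and the terminal-output set stand in for Fig. 3, which is not reproduced here, and the `r_max` cutoff implements the abort discussed below):

```python
import random

TERMINAL = {"A", "B", "C", "D", "E"}

def retry_protocol(x, y, g_table, row_groups, col_groups, r_max):
    # row_groups[x] / col_groups[y]: the three g-inputs associated with
    # each f-input. Each round succeeds with probability 1/3, so the
    # expected number of rounds is 3.
    for _ in range(r_max):
        xg = random.choice(row_groups[x])   # Alice's random in-group row
        yg = random.choice(col_groups[y])   # Bob's random in-group column
        z = g_table[xg, yg]                 # one call to g
        if z in TERMINAL:
            return z                        # terminal g-output = f-output
    return None                             # abort, probability (2/3)**r_max
```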

The correctness of this protocol is clear. By only sending g-inputs in the group associated with their f-inputs, each party restricts any terminating output of g to be one that was possible given their f-input.

To see that the protocol is secure, consider the following simulation. Suppose corrupt Alice chooses some g-input (row). With probability 1/3, the simulator decides that the protocol will terminate at this round. It converts the g-input to an f-input (according to its row group), sends that f-input to the ideal f, then simulates the g-output to equal the ideal f-output. With probability 2/3, the simulator decides that the protocol will continue. Note that in any row, there are 2 non-terminal g-outputs (for example, in the second row only 3 and 4 are possible), which are equally likely no matter which column group Bob has selected. The simulator simply chooses one of these two with equal probability as the simulated g-output. Then the same process repeats.

The parties’ inputs will “match” and give a terminal output with probability 1/3, meaning that the expected number of rounds is 3. The probability that the protocol continues for at least r rounds is \((2/3)^r\). We can get a protocol with a strict upper bound on round complexity by having the parties simply abort after some limit of r rounds. If we set this limit as \(r(\kappa ) = \omega (\log \kappa )\), then the correctness of the protocol suffers by an amount \((2/3)^{\omega (\log \kappa )} = \kappa ^{-\omega (1)}\), which is negligible. However, the simulation is still perfect, and the protocol is secure.

In summary, \(f \sqsubseteq g\) via a randomized, (worst-case) superlogarithmic-round protocol, but f does not embed in g and so \(f\not \sqsubseteq g\) via any deterministic protocol or any strict logarithmic-round protocol.