1 Introduction

Client-server outsourcing is a central problem in secure computation. In particular, there are a wide variety of deployed systems which allow a client to search a database stored in one or more servers for desired contents. Since a client’s query may contain sensitive information, it is important to realize secure database search, enabling a client to search a database without revealing his or her query to the servers. A trivial solution is downloading the whole database and searching it locally. However, since the database size is typically very large, we need to construct protocols whose communication and client-side computational complexity is sublinear in the database size.

Traditionally, the problem of secure database search has been considered in two types of settings. In the single-server setting, there is only one server storing a database, and this server may be corrupted; in the multi-server setting, there are m servers storing copies of a database, of which at most t are corrupted. In this work, we focus on the multi-server setting since it is known to be impossible to efficiently achieve information-theoretic security in the single-server setting [21], and even in computational settings, the bounded collusion of servers allows better efficiency and weaker cryptographic assumptions than single-server protocols [12, 13, 26].

Private Information Retrieval (PIR) is a fundamental cryptographic primitive to realize the most basic database search. The goal of PIR is to enable an honest client to retrieve a data item \(a_i\) from a database \(\boldsymbol{a} = (a_1,\ldots , a_N)\) while hiding the index i from the servers. To allow more complex queries such as partial match search, Barkol and Ishai [5] considered a more general setting in which a client has a private input x and servers share a function f, and the goal of the client is to obtain f(x) by communicating with the servers. A rich line of works proposed secure protocols computing various classes of functions f including PIR [8,9,10, 14, 17, 27, 28], bounded-degree multivariate polynomials [6, 24, 36, 43], and bounded-depth circuits [5, 16, 39, 40]. Note that the communication complexity of these protocols is much smaller than the size of circuits computing f, which is the main advantage over usual multiparty computation protocols [15, 23, 34, 35].

We note that the above-mentioned protocols are passively secure, i.e., the privacy and correctness are guaranteed only if servers follow the protocol specifications. On the other hand, it is desirable to achieve active security in real-world scenarios. Namely, protocols should not only protect the privacy of queries but also guarantee the correctness of results even if some servers deviate from the protocols arbitrarily. For example, servers may try to let a client accept an incorrect result, or compute responses from an out-of-date copy of a database. This paper concerns the fundamental problem of constructing an efficient compiler from passively secure protocols to actively secure protocols. Given such a compiler, existing passively secure protocols can be directly upgraded into actively secure protocols with small overheads. Prior to our work, however, the only known passive-to-active compilers are inefficient ones that apply only to PIR [11, 29] and incur exponentially large computational overheads in the number of servers (see Sect. 1.2 for more related works on compilers in different settings including GMW-style compilers).

1.1 Our Results

In this paper, we study the problem of secure computation in the client-servers setting, where a client can obtain f(x) on a private input x by communicating with multiple servers, each holding f. We demonstrate the theoretical feasibility of compilers that upgrade passively secure k-server protocols into actively secure m-server protocols with \(m>k\). We present two such compilers: The first one upgrades a one-round passively secure protocol into a multi-round actively secure protocol. It increases the number of servers by the corruption threshold t, which seems to be the best possible (see Remark 1). The second one upgrades a one-round passively secure protocol into a one-round actively secure protocol while increasing the number of servers by a larger factor. More specifically,

  • Our first compiler transforms a one-round passively secure k-server protocol into an \(O(m^2)\)-round actively secure m-server protocol such that \(m=k+t\).

  • Our second compiler transforms a one-round passively secure k-server protocol into a one-round actively secure m-server protocol such that \(m=O(k\log k)+2t\).

Our compilers are generic and efficient in the sense that they can be applied to protocols computing any class of functions f, and the overheads in communication and computational complexity are only polynomial in m, independent of the complexity of f. Furthermore, our compilers are unconditional, i.e., they require no additional assumptions, which allows us to obtain actively secure protocols from various assumptions or even information-theoretically, as shown below.

Along the way, we introduce two novel notions, conflict-finding protocols and locally surjective map families. The former is an intermediate notion between passively secure and actively secure protocols, which is used in our first compiler. The latter is a variant of perfect hash families with a stronger property, which is used in our second compiler. A key observation behind our techniques is that if a pair of servers return different answers to the same query, then a client finds that at least one of them is malicious. A difficulty is that we have to carefully design such a strategy, since naively disclosing one server's query to another may reveal the client's private input. See Sect. 2 for details on our techniques.

Remark 1

Our first compiler increases the number of servers by t, but this seems to be the best possible. Indeed, the existence of a generic compiler for an actively secure protocol with \(m'<t+k\) servers implies a compiler from a k-server protocol to a \(k'\)-server protocol for \(k':=m'-t<k\), since an actively secure \(m'\)-server protocol implies a passively secure \((m'-t)\)-server protocol. Thus, the increase in the number of servers is optimal unless there is a generic method to reduce the number of servers; no such method is currently known.

Instantiations. Based on our compilers, we show concrete actively secure protocols for PIR, bounded-degree multivariate polynomials and constant-depth circuits. Remarkably, our protocol instantiated from the sparse LPN assumption is the first actively secure protocol for multivariate polynomials which has the minimum number of servers, without assuming fully homomorphic encryption.

PIR. There are compilers from a passively secure k-server PIR protocol to an actively secure m-server protocol for \(m=k+2t\) [11] and \(m=k+t\) [29]. However, these compilers incur exponentially large multiplicative overheads \(m^{O(t)}\) in client-side computational complexity. On the other hand, our first compiler gives an actively secure m-server protocol such that \(m=k+t\) with a polynomial computational overhead \(m^{O(1)}\). The only cost is that it requires \(O(m^2)\) rounds of interaction between a client and servers. Our second compiler gives a one-round actively secure protocol with a polynomial computational overhead at the cost of a larger number of servers \(m=O(k\log k)+2t\). A detailed comparison is shown in Table 1.

In the information-theoretic setting, the currently most communication-efficient passively secure PIR protocol for \(t\ge 2\) is the \(3^t\)-server protocol in [10], which has sub-polynomial communication and computational complexity \(N^{o(1)}\cdot 3^{t+o(t)}\) in the database size N. (Although the original protocols in [10, 14] assume non-colluding servers, i.e., \(t=1\), the corruption threshold t can be amplified by using the technique in [7] as pointed out in [32].) By applying our compilers, we obtain actively secure \(3^{t+o(t)}\)-server PIR protocols whose computational complexity is \(N^{o(1)}\cdot 2^{O(t)}\). It exponentially (in t) improves the complexities \(N^{o(1)}\cdot 3^{t+o(t)}\cdot m^{O(t)}=N^{o(1)}\cdot 2^{O(t^2)}\) of actively secure protocols that are obtained from the previous compilers [11, 29]. In the computational setting, if we apply our compilers to the protocol assuming one-way functions [14], we can achieve logarithmic communication and computational complexity in N and reduce the number of servers. A detailed description is shown in Table 2.

Table 1. Comparison of passive-to-active compilers for PIR
Table 2. Comparison of PIR protocols with sub-polynomial communication and computational complexity in the database size N for a corruption threshold \(t\ge 2\)

Bounded-Degree Multivariate Polynomials. In the information-theoretic setting, there is a passively secure protocol for polynomials in [43], which can be made actively secure by using the technique in [38]. In the computational setting, a passively secure protocol is given in [24], which can be made actively secure by the standard error correction algorithm [41]. Now, by applying our compilers, we can reduce the required number of servers of these protocols by t. Based on the passively secure protocol in [36], we can further reduce the number of servers by a factor of d assuming homomorphic encryption for degree-d polynomials. Notably, our protocol instantiated from [24] achieves the minimum number of servers \(2t+1\). A detailed comparison is shown in Table 3.

Table 3. Comparison of actively secure protocols for multivariate polynomials

Constant-Depth Circuits. Barkol and Ishai [5] proposed a passively secure protocol for unbounded fan-in constant-depth circuits (i.e., the complexity class \(\textrm{AC}^0\)). It can be made actively secure by applying the error correction algorithm [41], and the resulting protocol needs at least \((\frac{1}{2}(\log M+O(1))^{D-1}+2)t\) servers, where M and \(D=O(1)\) are the size and depth of circuits, respectively. On the other hand, if we apply our first compiler, we need only \((\frac{1}{2}(\log M+O(1))^{D-1}+1)t\) servers, which decreases the number of servers of [5] by t. For example, for the partial match problem on an M-sized database (which can be captured by depth-2 circuits of size M), our protocol requires only \((\log M+2.5)t\) servers while the protocol obtained from [5] requires \((\log M+3.5)t\) servers.

A beneficial consequence is that our compilers can be directly applied to future developments in passively secure protocols in the client-servers scenario and may yield new efficient constructions of actively secure protocols.

1.2 Related Work

Passive-to-Active Compilers. Within the context of PIR, there are compilers from a passively secure k-server protocol to an actively secure m-server protocol for \(m=k+2t\) [11] and \(m=k+t\) [29]. As mentioned above, however, these compilers are not only less generic in that they apply only to PIR, but also inefficient since they incur an exponential overhead \(\binom{m}{t}=m^{O(t)}\) in computational complexity. There are also passive-to-active compilers in a more general multi-client setting where a private input is arbitrarily distributed among multiple clients [15, 23, 34, 35]. However, in actively secure protocols resulting from these compilers, servers need to interactively evaluate a circuit gate by gate. Consequently, the resulting protocols require communication and computational complexity proportional to the size of the function and do not work efficiently if the function encodes a large database.

PIR. There are direct constructions of m-server PIR protocols in a malicious setting [3, 25, 33, 38, 47]. However, the communication complexities of [3, 25, 33, 38] are all polynomial \(N^{O(t/m)}\) in the database size N, while those of our protocols are \(N^{o(1)}\), i.e., smaller than any polynomial function in N. The protocol in [47] does not guarantee privacy if malicious servers collude, and thus does not satisfy active t-security for \(t>1\) in our sense. There are also constructions of PIR with a weaker security guarantee [19, 22], which can only tell a client the existence of malicious servers. Actively secure PIR is also considered in a special setting where the length of each entry of a database is sufficiently large (e.g., [4, 44] and references therein). The protocols in [4, 44] assume that the length of each entry of a database is at least exponential in N and hence result in exponentially large communication complexity in N.

Protocols in the Single-Server Setting. Generally, if we have a passively secure single-server protocol, then we can obtain an actively secure protocol with the minimum number of servers \(2t+1\) since a client just runs the passively secure protocol with each server and computes the majority of \(2t+1\) outputs. However, there is an impossibility result on efficient single-server protocols for PIR in the information-theoretic setting [21]. Even if we go for computational security, it seems to be impossible to construct single-server PIR protocols from the minimal assumption of one-way functions [26], and for a general function, we currently need to assume fully homomorphic encryption, which is only instantiated from a narrow class of assumptions [31, 42].

Verifiable Computation. The problem of dealing with malicious servers has also been considered within the context of verifiable computation in the single-server setting [2, 20, 30] and in the multi-server setting [1, 45, 46]. However, these verifiable computation protocols only detect malicious behavior of servers and cannot achieve active security in our sense. The protocol in [18] uses a similar idea that a client compares answers from one server with those from another. However, it does not consider the setting where a client’s input x should be private and also assumes that all parties agree on a function f in advance, while in our setting the client does not know f since it corresponds to an unknown database.

2 Technical Overview

In this section, we provide an overview of our compilers to construct an actively t-secure m-server protocol \(\varPi '\) from any one-round passively t-secure k-server protocol \(\varPi \) such that \(k<m\). Let \(V=\{\textsf{S}_1,\ldots ,\textsf{S}_m\}\) be the set of m servers of \(\varPi '\). A key observation behind our constructions is that if a pair of servers in V return different answers to the same query of \(\varPi \), then at least one of them is malicious. We call such two servers a conflicting pair. The client repeatedly removes conflicting pairs from V in an appropriate way. Finally, the client executes a protocol only with the remaining honest servers and obtains a correct result. For ease of exposition, we first explain our non-interactive actively secure protocol and then explain our interactive protocol with fewer servers.

2.1 Non-interactive Actively Secure Protocols

As a first attempt, we consider the following basic construction.

  1. A client \(\textsf{C}\) partitions \(V=\{\textsf{S}_1,\ldots ,\textsf{S}_m\}\) into k groups \(V=G_1\cup \ldots \cup G_k\) in such a way that each \(G_j\) contains at least one honest server.

  2. \(\textsf{C}\) computes k queries of \(\varPi \) on his private input and sends the j-th query to all servers in the j-th group \(G_j\).

If every group contains no conflicting pair (i.e., all servers in each \(G_j\) return the same answer), then \(\textsf{C}\) can compute the correct result from the k answers of the k groups. Otherwise, \(\textsf{C}\) removes a conflicting pair from V, and repeats the above process at most t times to remove all malicious servers. This method, however, requires a large number of servers \(m=\varOmega (kt)\) since the size of each group \(G_j\) needs to be larger than t.

We reduce the number of servers to \(m\approx 2t+k\) by introducing a novel notion of locally surjective map families. Technically, we consider a family \(\mathcal {F}\) of maps from the set \(V=\{\textsf{S}_1,\ldots ,\textsf{S}_m\}\) of m servers to \([k]:=\{1,2,\ldots ,k\}\). Each map \(f\in \mathcal {F}\) defines a partition \(V=G_{f,1}\cup \cdots \cup G_{f,k}\), where \(G_{f,j}=\{\textsf{S}_i:f(\textsf{S}_i)=j\}\). For each map \(f\in \mathcal {F}\), the client \(\textsf{C}\) computes k queries of \(\varPi \) on his private input and sends the j-th query to all servers in the j-th group \(G_{f,j}\). Our strategy is that \(\textsf{C}\) proceeds in t steps to detect and remove at least one new malicious server per step. In each step,

  • If for every (fj), all the remaining servers in \(G_{f,j}\) return the same answer \({\textsf{ans}}_{f,j}\), then \(\textsf{C}\) computes an output \(x_f\) of \(\varPi \) from \(({\textsf{ans}}_{f,1},\dots ,{\textsf{ans}}_{f,k})\) for each f and decides the final output by the majority vote over the \(x_f\)’s;

  • Otherwise, i.e., if two remaining servers in some \(G_{f,j}\) give different answers, then \(\textsf{C}\) removes this conflicting pair and proceeds to the next step.

Observe that in the latter case, at least one of the two servers is malicious and hence at least one malicious server is always removed. The requirement for \(\textsf{C}\) to succeed is that in the former case, more than half of the \(x_f\)’s are correct. A sufficient condition is that for more than half of the f’s, there remains at least one honest server in each of \(G_{f,1},\ldots ,G_{f,k}\). Indeed, for such f’s, \(\textsf{C}\) receives the correct answer from servers in each of \(G_{f,1},\ldots ,G_{f,k}\), or proceeds to the latter case and removes a conflicting pair. Since there remain at least \(m-2t\) honest servers at every step, the condition can be formulated as the family \(\mathcal {F}\) of maps satisfying that for any subset \(H\subseteq V\) of size \(m-2t\), there exist more than half of the f’s such that \(f(H)=[k]\). We call such a family a locally surjective map family.

We can prove by a probabilistic argument the existence of a locally surjective map family \(\mathcal {F}\) of size O(m) if \(k=O((m-2t)/\log (m-2t))\). Therefore, we can obtain an actively t-secure m-server protocol \(\varPi '\) from a passively t-secure k-server protocol \(\varPi \) if \(m=O(k\log k)+2t\). Since the client can run all instances of \(\varPi \) in parallel, the resulting protocol \(\varPi '\) is one-round and only incurs an \(O(tm|\mathcal {F}|)=O(tm^2)\) multiplicative overhead in communication and computational complexity.
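To make the combinatorial condition concrete, the following sketch (hypothetical Python; the function names and toy parameters are ours) samples uniformly random maps from \(V=[m]\) to [k] and brute-forces the locally surjective condition: for every subset H of size \(m-2t\), strictly more than half of the maps f must satisfy \(f(H)=[k]\).

```python
import itertools
import random

def sample_family(m, k, size, seed=0):
    """Sample `size` independent, uniformly random maps f: [m] -> [k],
    each represented as a list with f[i] the group assigned to server i."""
    rng = random.Random(seed)
    return [[rng.randrange(k) for _ in range(m)] for _ in range(size)]

def is_locally_surjective(family, m, k, t):
    """Brute-force check: for every subset H of [m] of size m - 2t, strictly
    more than half of the maps f in `family` satisfy f(H) = [k]."""
    for H in itertools.combinations(range(m), m - 2 * t):
        good = sum(1 for f in family if {f[i] for i in H} == set(range(k)))
        if 2 * good <= len(family):
            return False
    return True

# Toy parameters for illustration only: k = 3 groups, t = 2 corruptions, m = 10.
m, k, t = 10, 3, 2
print(is_locally_surjective(sample_family(m, k, size=50), m, k, t))
```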

2.2 Interactive Actively Secure Protocols

We further reduce the number of servers from \(m=O(k\log k)+2t\) to \(m=t+k\). In our first construction, if a client \(\textsf{C}\) finds a conflicting pair of servers, then he removes both servers from the set V. After eliminating all t malicious servers, the number of remaining servers is reduced to \(m-2t\) in the worst case. Therefore, as long as this approach is used, the number of servers must be \(m\ge 2t+k\) since it should hold that \(m-2t\ge k\).

Our second construction reduces the required number of servers to \(m=t+k\) by introducing a notion of t-conflict-finding protocols, which is an intermediate notion between passively t-secure protocols and actively t-secure ones. Intuitively, in a conflict-finding protocol, a client \(\textsf{C}\) obtains a correct result or a non-trivial partition \((G_0,G_1)\) of the set V of servers such that all honest servers are included in \(G_0\) or \(G_1\) (and hence the other group consists of malicious servers only). A pair of servers crossing the partition \((G_0,G_1)\) is supposed to be conflicting.

More concretely, we consider a graph \(\mathcal {G}\) with m vertices each of which represents a server. Our protocol starts with \(\mathcal {G}\) being a complete graph, and repeats the following steps:

  1. The client \(\textsf{C}\) executes a conflict-finding protocol \(\varPi _{\textrm{CF}}\) with some subset \(V'\subseteq V\) which forms a connected subgraph of size \(k=m-t\) in \(\mathcal {G}\) (which can be efficiently found).

  2. If all servers in \(V'\) behave honestly, then \(\textsf{C}\) obtains the correct output.

  3. Otherwise, \(\textsf{C}\) can find a partition \((G_0,G_1)\) of \(V'\) thanks to the conflict-finding property of \(\varPi _{\textrm{CF}}\). Note that there is always an edge \(e=(\textsf{S}_i,\textsf{S}_j)\) between \(G_0\) and \(G_1\) since \(G_0 \cup G_1 = V'\) is connected. Furthermore, since all honest servers in \(V'\) are included in \(G_0\) or \(G_1\), at least one of \(\textsf{S}_i\) and \(\textsf{S}_j\) is malicious. Now, \(\textsf{C}\) removes the edge e from \(\mathcal {G}\) instead of eliminating the two servers, and goes back to the first step.

Since all edges among honest servers remain unremoved (and hence the set of all honest servers remains connected), \(\textsf{C}\) can successfully find a set of \(k=m-t\) honest servers within \(O(m^2)\) rounds. Note that in the above construction, \(\textsf{C}\) chooses the set of servers with which he executes \(\varPi _{\textrm{CF}}\) depending on the answers that were maliciously computed in the previous rounds. Thus servers may learn some information on the client's input x by seeing which edges \(\textsf{C}\) removes (as reflected in the sets of servers he chooses in later rounds). To address this problem, we impose an additional property that the distribution of the partition \((G_0,G_1)\) is independent of x regardless of how malicious servers behave. Then, an edge removed in each round leaks no information on x and hence the privacy of x is preserved.
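The following hypothetical Python sketch captures the client's edge-removal loop described above. The callables run_conflict_finding (one execution of \(\varPi _{\textrm{CF}}\) with a given server set, returning an output or a non-trivial partition) and find_connected_subset (a connected subset of size k, as realized by the algorithm \(\mathcal {C}_k\) of Sect. 4.1) are placeholders introduced here for illustration only.

```python
import itertools

def active_client(m, t, run_conflict_finding, find_connected_subset):
    """Client loop of the interactive compiler: maintain a graph on the m
    servers, run the conflict-finding protocol with a connected set of
    k = m - t servers, and remove one crossing edge whenever a conflict is
    reported.  Edges between honest servers are never removed, so the loop
    ends with a correct output within O(m^2) iterations."""
    k = m - t
    edges = set(itertools.combinations(range(m), 2))   # complete graph on [m]
    while True:
        subset = find_connected_subset(edges, k)       # connected, size k
        y, partition = run_conflict_finding(subset)
        if partition is None:                          # z = output: accept y
            return y
        g0, g1 = partition                             # non-trivial partition of subset
        # Remove the minimum edge between g0 and g1 (the algorithm E of Sect. 4.1);
        # at least one of its two endpoints is malicious.
        for (i, j) in sorted(edges):
            if (i in g0 and j in g1) or (i in g1 and j in g0):
                edges.discard((i, j))
                break
```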

Two-Round Conflict-Finding Protocols. The remaining problem is how to construct conflict-finding protocols. We show a construction of a two-round t-conflict-finding k-server protocol \(\varPi _{\textrm{CF}}\) from a passively t-secure k-server one \(\varPi \). For simplicity, let \(V'=(\textsf{S}_1,\ldots ,\textsf{S}_k)\). In the first round, a client computes real queries \((\textsf{que}_i)_{i\in [k]}\) on his private input x according to \(\varPi \) as usual. He also computes dummy queries \((\textsf{que}_i')_{i\in [k]}\) on a default input \(x_{\textrm{def}}\) which is independent of x. He then sends a random permutation of \((\textsf{que}_i,\textsf{que}_i')\) to each server \(\textsf{S}_i\). Note that the privacy of \(\varPi \) and the random permutation ensure that servers cannot distinguish which queries are computed on x and which on \(x_{\textrm{def}}\). Each server \(\textsf{S}_i\) returns answers \((\textsf{ans}_i,\textsf{ans}_i')\) to the two queries as usual. In the second round, the client sends all the dummy queries \((\textsf{que}_i')_{i\in [k]}\) on \(x_{\textrm{def}}\) to all servers in \(V'\), which does not affect privacy since \(x_{\textrm{def}}\) is independent of x. In response, each server \(\textsf{S}_j\) returns \(v_j:=(\textsf{ans}_i'(j))_{i\in [k]}\) to \((\textsf{que}_i')_{i\in [k]}\), where \(\textsf{ans}_i'(j)\) denotes \(\textsf{S}_j\)'s answer to the dummy query \(\textsf{que}_i'\), i.e., the answer that an honest \(\textsf{S}_i\) would compute.

For simplicity, suppose that \(\textsf{S}_1\) is the only malicious server. If \(\textsf{S}_1\) behaved honestly in the first round, it holds that \(\textsf{ans}_1'=\textsf{ans}_1'(2)=\cdots =\textsf{ans}_1'(k)\). If \(\textsf{S}_1\) returned an incorrect answer to \(\textsf{que}_1'\), it differs from each of \(\textsf{ans}_1'(2),\ldots ,\textsf{ans}_1'(k)\). From this observation, the client \(\textsf{C}\) trusts the answer \(\textsf{ans}_1\) of \(\textsf{S}_1\) to the real query \(\textsf{que}_1\) in the first round if and only if \(\textsf{ans}_1'=\textsf{ans}_1'(2)=\cdots =\textsf{ans}_1'(k)\). Generalizing this, we let \(\textsf{C}\) compute an output based on \((\textsf{ans}_1,\textsf{ans}_2,\ldots ,\textsf{ans}_k)\) if all the \(v_j\)’s take the same value. Otherwise, he partitions the set of servers into equivalence classes by placing \(\textsf{S}_i\) and \(\textsf{S}_j\) into the same class if and only if \(v_i=v_j\), and outputs a non-trivial partition \((G_0,G_1)\) in some way. Note that any pair of honest servers \(\textsf{S}_i,\textsf{S}_j\) return the same answer in the second round, i.e., \(v_i=v_j\), and hence they are placed in the same class. A malicious server successfully submits an incorrect answer without being detected only if it guesses correctly which query encodes the client’s true input x. Since the real and dummy queries are randomly permuted, this happens with probability 1/2. More generally, if the client prepares \(M-1\) sets of dummy queries, the cheating probability of malicious servers can be reduced to 1/M. This can be made even negligible by executing sufficiently many (say, \(\kappa \)) instances in parallel. If a conflict is found in some instance, \(\textsf{C}\) outputs a non-trivial partition \((G_0,G_1)\) obtained in that instance. Otherwise, he outputs the majority of the \(\kappa \) outputs if it exists. To make this protocol fail, malicious servers need to make the client output valid but incorrect outputs in at least \(\kappa /2\) instances. The cheating probability is thus \(O(M^{-\kappa /2})\), which is negligible.
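A minimal sketch of the client's second-round consistency check, in hypothetical Python: the argument v[j] plays the role of \(v_j=(\textsf{ans}_i'(j))_{i\in [k]}\), and splitting the equivalence classes into the class of the first server versus the rest is one concrete way of outputting a non-trivial partition.

```python
def second_round_check(v):
    """v: list of length k, where v[j] is the tuple of server S_{j+1}'s answers
    to the k dummy queries broadcast in the second round.
    Returns ('output', None) if all v[j] agree (the client then reconstructs
    from the first-round answers), and otherwise ('conflict', (G0, G1)), where
    G0 is the equivalence class of the first server and G1 holds the remaining
    servers.  Honest servers compute identical v[j], so they all land in G0
    or all in G1, as the conflict-finding property requires."""
    k = len(v)
    if all(v[j] == v[0] for j in range(k)):
        return ('output', None)
    G0 = [j for j in range(k) if v[j] == v[0]]
    G1 = [j for j in range(k) if v[j] != v[0]]
    return ('conflict', (G0, G1))
```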

To see that the partition \((G_0,G_1)\) leaks no information on the client’s input x, observe that \((G_0,G_1)\) is determined by answers \((v_j)_{j\in [k]}\). These answers are independent of x since they can be simulated from dummy queries and t queries for x that malicious servers see. The former is independent of x in the first place and the latter leaks no information on x due to the privacy of \(\varPi \).

To summarize, we obtain an \(O(m^2)\)-round actively t-secure m-server protocol from a passively t-secure k-server protocol if \(k\le m-t\). The communication and computational overhead is a multiplicative polynomial factor in m.

3 Preliminaries

Notations. For \(m\in \mathbb {N}\), define \([m]=\{1,2,\ldots ,m\}\). Let X and Y be sets. If \(X\subseteq Y\), we define \(Y\setminus X=\{y\in Y:y\notin X\}\) and simply denote it by \(\overline{X}\) if Y is clear from the context. We write \(u\leftarrow X\) if u is chosen uniformly at random from X. Define \(\binom{X}{k}\) as the set of all subsets of X of size k. Define \(\textrm{Map}(X,Y)\) as the set of all maps from X to Y. If \(X=[m]\) and \(Y=[k]\), we simply denote it by \(\textrm{Map}(m,k)\). Let \(\log x\) denote the base-2 logarithm of x and \(\ln x\) denote the base-\(\text {e}\) logarithm of x, where \(\text {e}\) is Napier's constant. We call a function \(f:\mathbb {N}\ni \lambda \mapsto f(\lambda )\in \mathbb {R}\) negligible if for any \(c>0\), there exists \(\lambda _0\in \mathbb {N}\) such that \(0\le f(\lambda )<\lambda ^{-c}\) for any \(\lambda >\lambda _0\). We call f polynomial if there exists \(c>0\) such that \(0\le f(\lambda )<\lambda ^c\) for all \(\lambda \). Throughout the paper, we use the following notations:

  • m denotes the total number of servers, which is polynomial in a security parameter \(\lambda \).

  • t denotes the number of corrupted servers.

  • \(\textsf{C}\) denotes a client and \(\textsf{S}_i\) denotes the i-th server.

The notation \(\widetilde{O}(\cdot )\) hides a polylogarithmic factor in a security parameter \(\lambda \).

3.1 Secure Computation in the Client-Servers Setting

We follow the client-servers model used in [5]. In this model, there is an honest client \(\textsf{C}\) who holds a private input x, and m servers \(\textsf{S}_1,\ldots ,\textsf{S}_m\) who all hold the same input p. The goal is:

  • Byzantine-Robustness. The client learns the value F(p, x) for a publicly known function F even if t servers behave maliciously;

  • Privacy. The client keeps his input x hidden from any collusion of t servers.

We do not assume any interaction between servers. We call a message from a client to servers a query and a message from servers to the client an answer.

In the above setting, we assume that the function F takes a common input p from servers. Typically, the input p will be a description of a function f applied to the input x of a client (e.g., a description of a circuit or a polynomial) and F is the universal function defined by \(F(p,x)=f(x)\).

If \(m\ge 2t+1\), there is a trivial 1-round protocol achieving the above goal: \(\textsf{C}\) downloads p from all servers, finds the correct p by the majority vote, and computes F(p, x) by himself. However, this protocol results in large communication and client-side computational complexity that is linear in the description length of p. In applications to database search, p encodes a large database and its size is proportional to the database size. From this point of view, we say that a protocol is efficient if its communication and client-side computational complexity is sublinear in the description length of p and linear in that of x.

More formally, we define a secure computation protocol in the client-servers setting as an abstract primitive. First, we show the syntax and correctness.

Definition 1

Let \(\mathcal {P}=(P_\lambda )_{\lambda \in \mathbb {N}}\), \(\mathcal {X}=(X_\lambda )_{\lambda \in \mathbb {N}}\), and \(\mathcal {Y}=(Y_\lambda )_{\lambda \in \mathbb {N}}\) be sequences of sets with polynomial-size descriptions and \(\mathcal {F}=(F_\lambda :P_\lambda \times X_\lambda \rightarrow Y_\lambda )_{\lambda \in \mathbb {N}}\) be a sequence of functions with polynomial-size descriptions. An \(\ell \)-round m-server protocol for \(\mathcal {F}\) is a tuple of three polynomial-time algorithms \(\varPi =(\textsf{Query},\textsf{Answer},\textsf{Output})\), where:

  • \(\textsf{Query}(1^\lambda ,x,\textsf{st}^{(j-1)},(\textsf{ans}_i^{(j-1)})_{i\in [m]})\rightarrow ((\textsf{que}_i^{(j)})_{i\in [m]},\textsf{st}^{(j)})\): \(\textsf{Query}\) is a possibly randomized algorithm that takes \(x\in X_\lambda \), a state \(\textsf{st}^{(j-1)}\) and answers \((\textsf{ans}_i^{(j-1)})_{i\in [m]}\) in round \(j-1\) as input, and outputs queries \((\textsf{que}_i^{(j)})_{i\in [m]}\) and a state \(\textsf{st}^{(j)}\) in round j, where we define \(\textsf{st}^{(0)}\), \(\textsf{ans}_i^{(0)}\) as the empty string;

  • \(\textsf{Answer}(1^\lambda ,p,\textsf{que}_i^{(j)})\rightarrow \textsf{ans}_i^{(j)}\): \(\textsf{Answer}\) is a deterministic algorithm that takes \(p\in P_\lambda \) and a query \(\textsf{que}_i^{(j)}\) in round j as input, and outputs an answer \(\textsf{ans}_i^{(j)}\) in round j;

  • \(\textsf{Output}(1^\lambda ,\textsf{st}^{(\ell )},(\textsf{ans}_i^{(\ell )})_{i\in [m]})\rightarrow y\): \(\textsf{Output}\) is a possibly randomized algorithm that takes a state \(\textsf{st}^{(\ell )}\) and answers \((\textsf{ans}_i^{(\ell )})_{i\in [m]}\) in round \(\ell \) as input, and outputs \(y\in Y_\lambda \);

satisfying the following property:

  • Correctness. There exists a negligible function \(\textsf{negl}{\left( \lambda \right) }\) such that for any \(\lambda \in \mathbb {N}\) and any \((p,x)\in P_\lambda \times X_\lambda \),

    $$\begin{aligned} {\text {Pr}}\left[ \textsf{Output}(1^\lambda ,\textsf{st}^{(\ell )},(\textsf{ans}_i^{(\ell )})_{i\in [m]})=F_\lambda (p,x)\right] \ge 1-\textsf{negl}{\left( \lambda \right) }, \end{aligned}$$

    where

    $$\begin{aligned} ((\textsf{que}_i^{(j)})_{i\in [m]},\textsf{st}^{(j)})&\leftarrow \textsf{Query}(1^\lambda ,x,\textsf{st}^{(j-1)},(\textsf{ans}_i^{(j-1)})_{i\in [m]}),\\ \textsf{ans}_i^{(j)}&\leftarrow \textsf{Answer}(1^\lambda ,p,\textsf{que}_i^{(j)}) \end{aligned}$$

    for all \(j\in [\ell ]\) and \(i\in [m]\).

We note that an answer algorithm \(\textsf{Answer}\) is not always defined to be deterministic in the literature but all the instantiations considered in this paper actually have deterministic answer algorithms. We omit a security parameter \(1^\lambda \) from inputs if it is clear from the context.

An abstract primitive \(\varPi =(\textsf{Query},\textsf{Answer},\textsf{Output})\) immediately implies an \(\ell \)-round protocol in the above client-servers setting. Indeed, a client has a private input x and m servers have a common input p. In each round, the client runs \(\textsf{Query}\), sends queries to servers and stores a state in his memory. In response, servers run \(\textsf{Answer}\) on the queries that they receive, and send answers back to the client. In the final round, the client runs \(\textsf{Output}\) on his state and servers’ answers, and obtains \(y=F_\lambda (p,x)\). Due to this correspondence, we will use the terminologies interchangeably for the sake of readability.
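For concreteness, the interaction implied by Definition 1 with honest servers can be written as a short driver. The following hypothetical Python sketch (our naming; the security parameter argument is omitted) runs the three algorithms round by round.

```python
def run_protocol(query, answer, output, p, x, m, rounds):
    """Run an l-round m-server protocol Pi = (Query, Answer, Output) with an
    honest client and honest servers (cf. Definition 1).
    `query(x, st, ans_list)` returns (list of m queries, new state),
    `answer(p, que)` returns one answer, `output(st, ans_list)` returns y."""
    st, ans_list = "", [""] * m                  # st^(0) and ans_i^(0) are empty strings
    for _ in range(rounds):
        que_list, st = query(x, st, ans_list)    # client: round-j queries and new state
        ans_list = [answer(p, que) for que in que_list]   # servers: round-j answers
    return output(st, ans_list)                  # client: final output, ideally F(p, x)
```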

The above-mentioned trivial 1-round protocol corresponds to the scheme in which \(\textsf{Query}\) outputs nothing, \(\textsf{Answer}\) outputs p and then \(\textsf{Output}\) computes \(y=F_\lambda (p,x)\). To rule this out, we define the efficiency measures of \(\varPi \) as follows. Let \(\textsf{que}_i^{(j)}\) and \(\textsf{ans}_i^{(j)}\) be queries and answers computed by \(\varPi \) and denote their bit-lengths by \(|\textsf{que}_i^{(j)}|\) and \(|\textsf{ans}_i^{(j)}|\), respectively. Define the communication complexity \(\textrm{Comm}_\lambda (\varPi )\) as

$$\begin{aligned} \textrm{Comm}_\lambda (\varPi )=\sup _{(p,x)\in P_\lambda \times X_\lambda }\sum _{i\in [m],j\in [\ell ]}(|\textsf{que}_i^{(j)}|+|\textsf{ans}_i^{(j)}|). \end{aligned}$$

Define the client-side computational complexity \(\textrm{c}\text {-}\textrm{Comp}_\lambda (\varPi )\) as the sum of the running time of \(\textsf{Query}(1^\lambda ,x,\cdot ,\cdot )\) and \(\textsf{Output}(1^\lambda ,\cdot ,\cdot )\) with worst-case inputs \((p,x)\in P_\lambda \times X_\lambda \). Let \(\textrm{Comm}(\varPi )=(\textrm{Comm}_\lambda (\varPi ))_{\lambda \in \mathbb {N}}\) and \(\textrm{c}\text {-}\textrm{Comp}(\varPi )=(\textrm{c}\text {-}\textrm{Comp}_\lambda (\varPi ))_{\lambda \in \mathbb {N}}\). We say that \(\varPi \) is efficient if there exists a sublinear function \(g(\ell )=o(\ell )\) such that

$$\begin{aligned} \max \{\textrm{Comm}(\varPi ),\textrm{c}\text {-}\textrm{Comp}(\varPi )\}\in g(|p|)\cdot |x|\cdot \textsf{poly}{\left( m,\lambda \right) }, \end{aligned}$$
(1)

where |p| and |x| are the description lengths of elements of \(P_\lambda \) and \(X_\lambda \), respectively. One can also define the server-side computational complexity \(\textrm{s}\text {-}\textrm{Comp}_\lambda (\varPi )\) as the running time of \(\textsf{Answer}(1^\lambda ,p,\cdot )\), and define \(\textrm{s}\text {-}\textrm{Comp}(\varPi )=(\textrm{s}\text {-}\textrm{Comp}_\lambda (\varPi ))_{\lambda \in \mathbb {N}}\). We see the communication and client-side complexity as a primary efficiency measure and the server-side computational complexity as a secondary measure.

Next, we show the security requirements.

Definition 2

Let \(\varPi =(\textsf{Query},\textsf{Answer},\textsf{Output})\) be an \(\ell \)-round m-server protocol for \(\mathcal {F}=(F_\lambda :P_\lambda \times X_\lambda \rightarrow Y_\lambda )_{\lambda \in \mathbb {N}}\). We say that \(\varPi \) is actively t-secure if it satisfies the following requirements:

  • Privacy. There exists a negligible function \(\textsf{negl}{\left( \lambda \right) }\) such that for any stateful algorithm \(\mathcal {A}\) and any \(\lambda \in \mathbb {N}\),

    $$\begin{aligned} \textsf{Adv}_{\varPi ,\mathcal {A}}(\lambda ):=\left| {\text {Pr}}\left[ \textsf{Priv}_{\varPi ,\mathcal {A}}^0(\lambda )=0\right] -{\text {Pr}}\left[ \textsf{Priv}_{\varPi ,\mathcal {A}}^1(\lambda )=0\right] \right| <\textsf{negl}{\left( \lambda \right) }, \end{aligned}$$

    where for \(b\in \{0,1\}\), \(\textsf{Priv}_{\varPi ,\mathcal {A}}^b(\lambda )\) is the output \(b'\) of \(\mathcal {A}\) in the following experiment:

    1. \((x_0,x_1,p,T)\leftarrow \mathcal {A}(1^\lambda )\), where \(x_0,x_1\in X_\lambda \), \(p\in P_\lambda \) and \(T\subseteq [m]\) is of size at most t.

    2. For each \(j=1,2,\ldots ,\ell \),

       (a) Let \(((\textsf{que}_i^{(j)})_{i\in [m]},\textsf{st}^{(j)})\leftarrow \textsf{Query}(1^\lambda ,x_b,\textsf{st}^{(j-1)},(\textsf{ans}_i^{(j-1)})_{i\in [m]})\) and give \((\textsf{que}_i^{(j)})_{i\in T}\) to \(\mathcal {A}\).

       (b) If \(j<\ell \), \(\mathcal {A}\) outputs \((\textsf{ans}_i^{(j)})_{i\in T}\). If \(j=\ell \), \(\mathcal {A}\) outputs a bit \(b'\in \{0,1\}\).

  • Byzantine-robustness. There exists a negligible function \(\textsf{negl}{\left( \lambda \right) }\) such that for any stateful algorithm \(\mathcal {A}\) and any \(\lambda \in \mathbb {N}\),

    $$\begin{aligned} {\text {Pr}}\left[ \textsf{BR}_{\varPi ,\mathcal {A}}(\lambda )=1\right] <\textsf{negl}{\left( \lambda \right) }, \end{aligned}$$

    where \(\textsf{BR}_{\varPi ,\mathcal {A}}(\lambda )\) is the output of the following experiment:

    1. \((x,p,T)\leftarrow \mathcal {A}(1^\lambda )\), where \(x\in X_\lambda \), \(p\in P_\lambda \) and \(T\subseteq [m]\) is of size at most t.

    2. For each \(j=1,2,\ldots ,\ell \),

       (a) Let \(((\textsf{que}_i^{(j)})_{i\in [m]},\textsf{st}^{(j)})\leftarrow \textsf{Query}(1^\lambda ,x,\textsf{st}^{(j-1)},(\textsf{ans}_i^{(j-1)})_{i\in [m]})\) and give \((\textsf{que}_i^{(j)})_{i\in T}\) to \(\mathcal {A}\).

       (b) \(\mathcal {A}\) outputs \((\textsf{ans}_i^{(j)})_{i\in T}\).

    3. Return 1 if \(\textsf{Output}(1^\lambda ,\textsf{st}^{(\ell )},(\textsf{ans}_i^{(\ell )})_{i\in [m]})\ne F_\lambda (p,x)\), and otherwise return 0.

We say that \(\varPi \) is passively t-secure if it satisfies the above requirements for semi-honest adversaries \(\mathcal {A}\), i.e., those following the instructions of \(\varPi \). Note that for semi-honest adversaries, the Byzantine-robustness of \(\varPi \) immediately follows from the correctness of \(\varPi \). We say that \(\varPi \) is computationally actively t-secure (resp. computationally passively t-secure) if it satisfies the above requirements for probabilistic polynomial-time (PPT) adversaries \(\mathcal {A}\) (resp. semi-honest PPT adversaries \(\mathcal {A}\)).

3.2 Existing Passively Secure Protocols

Private Information Retrieval.

Let \(N=N(\lambda )\) be a polynomial function. Define \(\textsc {Index}_N=(F_\lambda :\{0,1\}^N\times [N]\rightarrow \{0,1\})_{\lambda \in \mathbb {N}}\) as a sequence of functions such that for each \(\lambda \in \mathbb {N}\),

$$\begin{aligned} F_\lambda ((a_1,\ldots ,a_N),x)=a_x,~\forall (a_1,\ldots ,a_N)\in \{0,1\}^N,\forall x\in [N]. \end{aligned}$$

An m-server protocol for \(\textsc {Index}_N\) is called an m-server private information retrieval (PIR) protocol for N-sized databases. In the information-theoretic setting, the most communication-efficient passively secure 3-server PIR protocol was given in [10], and in the computational setting, a passively secure 2-server PIR protocol was given in [14] assuming the existence of one-way functions. Although the original protocols in [10, 14] assume \(t=1\), the corruption threshold t can be amplified by using the technique in [7] as pointed out in [32]. More specifically, the following propositions hold.

Proposition 1

There exists a passively t-secure 1-round \(3^t\)-server protocol \(\varPi \) for \(\textsc {Index}_N\) such that

  • \(\textrm{Comm}(\varPi )=\exp (O(\sqrt{\log N\log \log N}))\cdot t3^t=N^{o(1)}\cdot 2^{O(t)}\);

  • \(\textrm{c}\text {-}\textrm{Comp}(\varPi )=\exp (O(\sqrt{\log N\log \log N}))\cdot t3^t=N^{o(1)}\cdot 2^{O(t)}\);

  • \(\textrm{s}\text {-}\textrm{Comp}(\varPi )=N^2\cdot \exp (O(\sqrt{\log N\log \log N}))\cdot 2^t=N^{2+o(1)}\cdot 2^t\).

Note that the above protocol satisfies the efficiency requirement (1) since \(\textrm{Comm}(\varPi )\) and \(\textrm{c}\text {-}\textrm{Comp}(\varPi )\) are sub-polynomial (i.e., less than any polynomial) in the description length N of elements of \(P_\lambda =\{0,1\}^N\).

Proposition 2

Assume a pseudorandom generator \(G:\{0,1\}^\lambda \rightarrow \{0,1\}^{2(\lambda +1)}\). There exists a computationally passively t-secure 1-round \(2^t\)-server protocol \(\varPi \) for \(\textsc {Index}_N\) such that

  • \(\textrm{Comm}(\varPi )=O(\log N\cdot \lambda \cdot t2^t)\);

  • \(\textrm{c}\text {-}\textrm{Comp}(\varPi )\) is \(O(\log N\cdot t2^t)\) invocations of G;

  • \(\textrm{s}\text {-}\textrm{Comp}(\varPi )\) is \(O(N^2\log N\cdot t)\) invocations of G.

Remark 2

Dvir and Gopi [27] devised a technique to optimize the 3-server protocol in [28] and obtained a 2-server PIR protocol with \(N^{o(1)}\) communication. However, since the answer length is not constant, the passively t-secure protocol obtained by applying the amplification technique of [7] has larger communication complexity \(\exp (O(t\sqrt{\log N\log \log N}))\) and does not satisfy the efficiency requirement (1).

Bounded-Degree Polynomials. Let \(N=N(\lambda )\), \(D=D(\lambda )\) and \(M=M(\lambda )\) be polynomial functions. We define \(\textsc {Poly}_{N,D,M}(R)=(F_\lambda )_{\lambda \in \mathbb {N}}\) as a sequence of functions such that \(F_\lambda (p,\textbf{x})=p(\textbf{x})\) for any N-variate polynomial p over a ring R with degree D and number of monomials M, and for any \(\textbf{x}\in R^N\). The following is implicit in [43].

Proposition 3

Let \(N,D,M\in \textsf{poly}{\left( \lambda \right) }\). Let R be a ring such that for any \(a\in \{1,2,\ldots ,m-1\}\), an element \(a\cdot 1_R\) has an inverse in R, where \(1_R\) is the multiplicative identity of R. Suppose that \(m>Dt/2\). Then, there exists a passively t-secure 1-round m-server protocol \(\varPi \) for \(\textsc {Poly}_{N,D,M}(R)\) such that

  • \(\textrm{Comm}(\varPi )\) is O(Nm) ring elements;

  • \(\textrm{c}\text {-}\textrm{Comp}(\varPi )\) is O(Ntm) ring operations;

  • \(\textrm{s}\text {-}\textrm{Comp}(\varPi )\) is O(NMD) ring operations.

Since the description length of a polynomial with M monomials is \(\widetilde{O}(MD\log |R|)\), the above protocol satisfies the efficiency requirement (1) if \(MD=\omega (N)\).
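As an illustration of the sparse representation behind \(\textsc {Poly}_{N,D,M}(R)\), the following hypothetical Python sketch (our encoding, with R instantiated as the integers modulo q) evaluates a polynomial given as a list of M (coefficient, exponent-vector) monomials.

```python
def eval_sparse_poly(monomials, x, q):
    """Evaluate an N-variate polynomial over Z_q given in sparse form:
    `monomials` is a list of (coefficient, exponent_vector) pairs, one pair
    per monomial, each exponent vector having total degree at most D."""
    result = 0
    for coeff, exps in monomials:
        term = coeff % q
        for xi, e in zip(x, exps):
            term = (term * pow(xi, e, q)) % q
        result = (result + term) % q
    return result

# p(x0, x1, x2) = 3*x0^2*x1 + 5*x2 over Z_97, evaluated at x = (2, 3, 4).
p = [(3, (2, 1, 0)), (5, (0, 0, 1))]
print(eval_sparse_poly(p, (2, 3, 4), 97))   # 3*4*3 + 5*4 = 56
```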

Ishai, Lai and Malavolta [36] showed that assuming homomorphic encryption for degree-d polynomials, the number of servers in Proposition 3 can be decreased by a factor of d.

Proposition 4

Let \(d=O(1)\) and R be a ring such that for any \(a\in \{1,2,\ldots ,\max \{d,m-1\}\}\), an element \(a\cdot 1_R\) has an inverse in R, where \(1_R\) is the multiplicative identity of R. Assume a homomorphic encryption scheme \(\textsf{HE}\) for degree-d polynomials over R. Let \(M,N\in \textsf{poly}{\left( \lambda \right) }\) and \(D=O(1)\). Suppose that \(m>Dt/(d+1)\). Then there exists a computationally passively t-secure 1-round m-server protocol \(\varPi \) for \(\textsc {Poly}_{N,D,M}(R)\) such that

  • \(\textrm{Comm}(\varPi )=O(Nm\cdot \ell _{\textsf{ct}})\), where \(\ell _{\textsf{ct}}\) is the description length of ciphertexts of \(\textsf{HE}\);

  • \(\textrm{c}\text {-}\textrm{Comp}(\varPi )=O((Nt\cdot \tau _{\textsf{Enc}}+\tau _{\textsf{Dec}})m)\), where \(\tau _{\textsf{Enc}}\) and \(\tau _{\textsf{Dec}}\) are the running time of the encryption and decryption algorithms of \(\textsf{HE}\), respectively;

  • \(\textrm{s}\text {-}\textrm{Comp}(\varPi )=O(MN\cdot \tau _{\textsf{Eval}})\), where \(\tau _{\textsf{Eval}}\) is the running time of the evaluation algorithm of \(\textsf{HE}\) per operation.

Note that we have \(\max \{d,m-1\}=\textsf{poly}{\left( \lambda \right) }\) since \(d=O(1)\) and \(m=\textsf{poly}{\left( \lambda \right) }\). On the other hand, homomorphic encryption schemes mentioned in [36] assume that R is a prime field of size q or a ring of integers modulo \(n=q_1q_2\) for exponentially large primes \(q,q_1,q_2\). In these cases, \(a\cdot 1_R\) has an inverse in R if \(a\in \{1,2,\ldots ,\max \{d,m-1\}\}\).

Under the sparse Learning Parity with Noise (LPN) assumption over a field \(\mathbb {F}_q\), Dao et al. [24] proposed a passively t-secure \((t+1)\)-server protocol for polynomials of degree \(D=O(\log \lambda /\log \log \lambda )\). Although the original protocol does not have sublinear-size upload cost when evaluating a single polynomial, it can be seen that the upload cost is amortized if sufficiently many polynomials are evaluated on the same input. Specifically, let \(N=N(\lambda )\), \(D=D(\lambda )\), \(M=M(\lambda )\), and \(L=L(\lambda )\) be polynomial functions. We define \(\textsc {Poly}_{N,D,M}^L(R)=(F_\lambda )_{\lambda \in \mathbb {N}}\) as a sequence of functions such that \(F_\lambda ((p_1,\ldots ,p_L),\textbf{x})=(p_1(\textbf{x}),\ldots ,p_L(\textbf{x}))\) for any N-variate polynomials \(p_1,\ldots ,p_L\) over a ring R with degree D and number of monomials M, and for any \(\textbf{x}\in R^N\).

Proposition 5

Assume that the \((\delta ,q)\)-sLPN assumption holds for a constant \(0\le \delta \le 1\) and a sequence \(q=(q(\lambda ))_{\lambda \in \mathbb {N}}\) of prime powers that are computable in polynomial time in \(\lambda \). Let \(L,M,N\in \textsf{poly}{\left( \lambda \right) }\) and \(D=O(\log \lambda /\log \log \lambda )\). Then, there exists a computationally passively t-secure 1-round \((t+1)\)-server protocol \(\varPi \) for \(\textsc {Poly}_{N,D,M}^L(\mathbb {F}_q)\) such that

  • \(\textrm{Comm}(\varPi )=\widetilde{O}((M^{2/\delta }N+L)(\log q)m\lambda )\);

  • \(\textrm{c}\text {-}\textrm{Comp}(\varPi )=\widetilde{O}((M^{2/\delta }N+L)(\log q)m\lambda )\);

  • \(\textrm{s}\text {-}\textrm{Comp}(\varPi )=\widetilde{O}(M^{1/\delta +1}L(\log q)\lambda )\).

Note that the description length of L polynomials each with M monomials is \(\widetilde{O}(ML\log q)\) if the degree is \(D=o(\log \lambda )\). Thus, if \(L=\omega (M^{2/\delta -1})\), the above protocol satisfies the efficiency requirement (1). See [24] for the details including the definition of the sparse LPN assumption.

Constant-Depth Circuits. We consider Boolean circuits of constant depth with unbounded fan-in and fan-out. Formally, a Boolean circuit C is a labelled directed acyclic graph. The nodes with no incoming edges are labelled with input variables, their negations, or constants. The other nodes are called gates and are labelled with one of the operators in \(\{\textsf{AND},\textsf{OR},\textsf{NOT}\}\). Nodes with no outgoing edges are called output nodes. We only consider circuits with a single output node. The size of a circuit is the number of edges and its depth is the length of the longest path from an input node to the output node. We define the output of C on input x, denoted by C(x), as the value of the output node obtained by evaluating the gates on the input values.
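To make the evaluation of C(x) concrete, here is a hypothetical Python sketch; the node encoding (tagged tuples in topological order) is ours and only illustrates the circuit model described above.

```python
def eval_circuit(nodes, x, out):
    """Evaluate a Boolean circuit with unbounded fan-in AND/OR and NOT gates.
    nodes[i] is ('var', j), ('neg', j), ('const', b), or (op, [predecessors])
    with op in {'AND', 'OR', 'NOT'}; x is the input bit list; out is the index
    of the single output node.  Nodes are assumed to be topologically ordered."""
    val = {}
    for i, node in enumerate(nodes):
        kind = node[0]
        if kind == 'var':
            val[i] = x[node[1]]
        elif kind == 'neg':
            val[i] = 1 - x[node[1]]
        elif kind == 'const':
            val[i] = node[1]
        elif kind == 'AND':
            val[i] = int(all(val[j] for j in node[1]))
        elif kind == 'OR':
            val[i] = int(any(val[j] for j in node[1]))
        else:  # 'NOT'
            val[i] = 1 - val[node[1][0]]
    return val[out]

# A depth-2 example: (x0 AND x1) OR (NOT x2).
nodes = [('var', 0), ('var', 1), ('neg', 2), ('AND', [0, 1]), ('OR', [2, 3])]
print(eval_circuit(nodes, [1, 1, 0], out=4))   # prints 1
```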

Let \(N=N(\lambda )\), \(D=D(\lambda )\) and \(M=M(\lambda )\) be polynomial functions. We define \(\textsc {Circ}_{N,D,M}=(F_\lambda )_{\lambda \in \mathbb {N}}\) as a sequence of functions such that \(F_\lambda (C,x)=C(x)\) for any Boolean circuit C with N input variables, depth D and size M, and for any N-bit string x.

Proposition 6

Let \(N,M\in \textsf{poly}{\left( \lambda \right) }\) and \(D=O(1)\). Suppose that \(m\ge (\log M + 3)^{D-1}t/2\). Then, there exists a passively t-secure 1-round m-server protocol \(\varPi \) for \(\textsc {Circ}_{N,D,M}\) such that

  • \(\textrm{Comm}(\varPi )=O((\log M)^{D-1}N(\log N)\lambda m)\);

  • \(\textrm{c}\text {-}\textrm{Comp}(\varPi )=O((\log M)^{D-1}N(\log N)tm+(\log M)^2\lambda m)\);

  • \(\textrm{s}\text {-}\textrm{Comp}(\varPi )=O(M(\log M) N \lambda )\).

The protocol is efficient since \(\textrm{Comm}(\varPi )\) and \(\textrm{c}\text {-}\textrm{Comp}(\varPi )\) are linear in N and polylogarithmic in the size M of circuits, omitting factors in \(\lambda \) and m.

4 Interactive Actively Secure Protocols

In this section, we show our compiler from one-round passively t-secure k-server protocols to \(O(m^2)\)-round actively t-secure m-server protocols such that \(m\ge k+t\). To this end, we introduce a notion of conflict-finding protocols, which is an intermediate notion between passively secure and actively secure protocols. We show a generic compiler from conflict-finding to actively secure protocols in Sect. 4.3 and then show a generic compiler from passively secure to conflict-finding protocols.

4.1 Graph Theory

To begin with, we recall the standard terminology of graph theory (see [37, Chapter 2] for instance). A (simple and undirected) graph \(\mathcal {G}\) is a pair (V, E), where V is a set of vertices and E is a set of edges \((i,j)\in V\times V\). Throughout the paper, we only consider the cases where V is either [m] or a subset of [m]. Thus we may assume that V is a totally ordered set. The total order on V naturally induces a lexicographic order on E, which is also a total order on E. A graph \(\mathcal {G}\) is called connected if there is a path between each pair of vertices. It is a standard result that there is a deterministic algorithm \(\mathcal {D}\) which decomposes \(\mathcal {G}\) into connected components in time \(O(|V|+|E|)\) [37]. For \(S\subseteq V\), we denote by \(\mathcal {G}[S]\) the induced subgraph, i.e., the graph whose vertex set is S and whose edge set consists of the edges in E that have both endpoints in S.

We show a deterministic algorithm \(\mathcal {C}_k'\) such that for any connected graph \(\mathcal {G}=(V,E)\) with at least k vertices, \(\mathcal {C}_k'(\mathcal {G})\) outputs a subset \(S\subseteq V\) such that \(|S|=k\) and \(\mathcal {G}[S]\) is connected. First, \(\mathcal {C}_k'\) chooses the minimum node s of V with respect to the total order on V. Secondly, \(\mathcal {C}_k'\) runs the “textbook” depth-first search algorithm [37] starting at the vertex s, except that it stops searching if it visits k vertices. Finally, \(\mathcal {C}_k'\) outputs the set S of all vertices it visited so far. By definition, S is of size k. Since any pair of vertices in S are connected via s, \(\mathcal {G}[S]\) is connected. The running time of \(\mathcal {C}_k'\) is \(O(|V|+|E|)\).

Next, we show a deterministic algorithm \(\mathcal {C}_k\) such that for any graph \(\mathcal {G}=(V,E)\), if \(\mathcal {G}\) contains a connected component of size at least k, \(\mathcal {C}_k(\mathcal {G})\) outputs a subset \(S\subseteq V\) of size k such that \(\mathcal {G}[S]\) is connected, and otherwise, it outputs the empty set \(\emptyset \). First, \(\mathcal {C}_k\) lists all the connected components of \(\mathcal {G}\), \((\mathcal {G}_1,\ldots ,\mathcal {G}_q)\leftarrow \mathcal {D}(\mathcal {G})\). Secondly, \(\mathcal {C}_k\) lets \(q_{\textrm{min}}\) be the minimum index q such that \(\mathcal {G}_q\) has at least k vertices. If no component has at least k vertices, \(\mathcal {C}_k\) outputs \(\emptyset \). Otherwise, \(\mathcal {C}_k\) outputs \(S\leftarrow \mathcal {C}_k'(\mathcal {G}_{q_{\textrm{min}}})\). The correctness of \(\mathcal {C}_k\) immediately follows from those of \(\mathcal {D}\) and \(\mathcal {C}_k'\). The running time of \(\mathcal {C}_k\) is \(O(|V|+|E|)\).

Finally, we show a trivial but frequently-used algorithm \(\mathcal {E}\), which takes as input a graph \(\mathcal {G}=(V,E)\) and a pair of disjoint non-empty subsets \(G_0,G_1\subseteq V\), and outputs the minimum edge \(e=(i,j)\in E\) (with respect to the total order on E) such that \(i\in G_0\) and \(j\in G_1\), or \(j\in G_0\) and \(i\in G_1\). The running time of \(\mathcal {E}\) is O(|E|).
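The three routines admit a direct implementation. The following hypothetical Python sketch (adjacency sets keyed by vertex; the function names are ours) realizes \(\mathcal {C}_k'\) as a depth-first search truncated after k visited vertices, \(\mathcal {C}_k\) by scanning connected components, and \(\mathcal {E}\) by scanning edges in lexicographic order.

```python
def dfs_limited(adj, start, limit):
    """C_k': depth-first search from `start`, stopping once `limit` vertices
    have been visited; the visited vertices always induce a connected subgraph."""
    visited, seen, stack = [], {start}, [start]
    while stack and len(visited) < limit:
        v = stack.pop()
        visited.append(v)
        for u in sorted(adj[v], reverse=True):
            if u not in seen:
                seen.add(u)
                stack.append(u)
    return visited

def connected_k_subset(adj, k):
    """C_k: return a k-sized vertex set S with G[S] connected, taken from the
    first connected component (scanned by increasing minimum vertex) having at
    least k vertices; return the empty set if no such component exists."""
    remaining = set(adj)
    while remaining:
        root = min(remaining)
        component = dfs_limited(adj, root, len(adj))   # the whole component of root
        if len(component) >= k:
            return set(dfs_limited(adj, root, k))      # C_k' on that component
        remaining -= set(component)
    return set()

def min_crossing_edge(edges, g0, g1):
    """E: the lexicographically smallest edge with one endpoint in g0 and the
    other in g1, or None if no such edge exists."""
    for (i, j) in sorted(edges):
        if (i in g0 and j in g1) or (i in g1 and j in g0):
            return (i, j)
    return None
```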

4.2 Formalization of Conflict-Finding Protocols

Roughly speaking, in a conflict-finding protocol, a client obtains (y, z), where y is the main output (supposed to be F(p, x)) and z is an auxiliary string. The string z is either \(\textsf{output}\), \(\textsf{failure}\), or a non-trivial partition \((G_0,G_1)\) of the set of servers. The security requirements are:

  • Soundness. The probability that \(z=\textsf{output}\) and \(y\ne F(p,x)\) is negligible, and the probability that the protocol outputs \(z=\textsf{failure}\) is also negligible;

  • Conflict-Finding. If z is a non-trivial partition \((G_0,G_1)\) of the set of servers, then one of \(G_0\) or \(G_1\) contains all honest servers (and hence the other group consists of malicious servers only);

  • Privacy. An adversary should not learn a client’s input x even if she knows z.

Intuitively, the conflict-finding property ensures that a client learns a subset of malicious servers only, which allows him to find a pair of servers such that at least one of them is malicious. We require that privacy hold even if z is leaked, so that an adversary does not learn additional information from the set of servers the client removes. Below, we show formal definitions.

Definition 3

We say that \(\varPi =(\textsf{Query},\textsf{Answer},\textsf{Output})\) is an \(\ell \)-round t-conflict-finding m-server protocol for \(\mathcal {F}=(F_\lambda :P_\lambda \times X_\lambda \rightarrow Y_\lambda )_{\lambda \in \mathbb {N}}\) if it satisfies the following properties:

  • Syntax. The syntax of \(\textsf{Query}\) and \(\textsf{Answer}\) is the same as that of \(\varPi \) as an \(\ell \)-round m-server protocol for \(\mathcal {F}\) (Definition 1). The algorithm \(\textsf{Output}\) takes a state \(\textsf{st}^{(\ell )}\) and answers \((\textsf{ans}_i^{(\ell )})_{i\in [m]}\) in round \(\ell \) as input, and outputs (y, z) such that (1) \(y\in Y_\lambda \) and \(z=\textsf{output}\), (2) \(y=\bot \) and \(z=(G_0,G_1)\), which is a non-trivial partition of [m], or (3) \(y=\bot \) and \(z=\textsf{failure}\). We call the first (resp. second) component of the output of \(\textsf{Output}\) the y-output (resp. z-output).

  • Correctness. There exists a negligible function \(\textsf{negl}{\left( \lambda \right) }\) such that for any \(\lambda \in \mathbb {N}\) and any \((p,x)\in P_\lambda \times X_\lambda \), it holds that

    $$\begin{aligned} {\text {Pr}}\left[ (y,z)\leftarrow \textsf{Output}(1^\lambda ,\textsf{st}^{(\ell )},(\textsf{ans}_i^{(\ell )})_{i\in [m]}):y=F_\lambda (p,x)\right] \ge 1-\textsf{negl}{\left( \lambda \right) }, \end{aligned}$$

    where

    $$\begin{aligned} ((\textsf{que}_i^{(j)})_{i\in [m]},\textsf{st}^{(j)})&\leftarrow \textsf{Query}(1^\lambda ,x,\textsf{st}^{(j-1)},(\textsf{ans}_i^{(j-1)})_{i\in [m]}),\\ \textsf{ans}_i^{(j)}&\leftarrow \textsf{Answer}(1^\lambda ,p,\textsf{que}_i^{(j)}) \end{aligned}$$

    for all \(j\in [\ell ]\) and \(i\in [m]\).

  • Soundness. There exists a negligible function \(\textsf{negl}{\left( \lambda \right) }\) such that for any stateful algorithm \(\mathcal {A}\) and any \(\lambda \in \mathbb {N}\),

    $$\begin{aligned} {\text {Pr}}\left[ \textsf{Sound}_{\varPi ,\mathcal {A}}(\lambda )=1\right] <\textsf{negl}{\left( \lambda \right) }, \end{aligned}$$
    (2)

    where \(\textsf{Sound}_{\varPi ,\mathcal {A}}(\lambda )\) is the output of the following experiment:

    1. \((x,p,T)\leftarrow \mathcal {A}(1^\lambda )\), where \(x\in X_\lambda \), \(p\in P_\lambda \) and \(T\subseteq [m]\) is of size at most t.

    2. For each \(j=1,2,\ldots ,\ell \),

       (a) Let \(((\textsf{que}_i^{(j)})_{i\in [m]},\textsf{st}^{(j)})\leftarrow \textsf{Query}(1^\lambda ,x,\textsf{st}^{(j-1)},(\textsf{ans}_i^{(j-1)})_{i\in [m]})\) and give \((\textsf{que}_i^{(j)})_{i\in T}\) to \(\mathcal {A}\).

       (b) \(\mathcal {A}\) outputs \((\textsf{ans}_i^{(j)})_{i\in T}\).

    3. Let \((y,z)\leftarrow \textsf{Output}(1^\lambda ,\textsf{st}^{(\ell )},(\textsf{ans}_i^{(\ell )})_{i\in [m]})\).

    4. Return 1 if \(y\in Y_\lambda \setminus \{F_\lambda (p,x)\}\) and \(z=\textsf{output}\), or \(y=\bot \) and \(z=\textsf{failure}\). Otherwise return 0.

  • Conflict-Finding. For any stateful algorithm \(\mathcal {A}\) and any \(\lambda \in \mathbb {N}\),

    $$\begin{aligned} {\text {Pr}}\left[ \textsf{CF}_{\varPi ,\mathcal {A}}(\lambda )=1\right] =0, \end{aligned}$$

    where \(\textsf{CF}_{\varPi ,\mathcal {A}}(\lambda )\) is the output of the following experiment:

    1. \((x,p,T)\leftarrow \mathcal {A}(1^\lambda )\), where \(x\in X_\lambda \), \(p\in P_\lambda \) and \(T\subseteq [m]\) is of size at most t.

    2. For each \(j=1,2,\ldots ,\ell \),

       (a) Let \(((\textsf{que}_i^{(j)})_{i\in [m]},\textsf{st}^{(j)})\leftarrow \textsf{Query}(1^\lambda ,x,\textsf{st}^{(j-1)},(\textsf{ans}_i^{(j-1)})_{i\in [m]})\) and give \((\textsf{que}_i^{(j)})_{i\in T}\) to \(\mathcal {A}\).

       (b) \(\mathcal {A}\) outputs \((\textsf{ans}_i^{(j)})_{i\in T}\).

    3. Let \((y,z)\leftarrow \textsf{Output}(1^\lambda ,\textsf{st}^{(\ell )},(\textsf{ans}_i^{(\ell )})_{i\in [m]})\).

    4. Return 1 if \(z=(G_0,G_1)\), \(G_0\nsubseteq T\) and \(G_1\nsubseteq T\). Otherwise return 0.

  • Privacy. There exists a negligible function \(\textsf{negl}{\left( \lambda \right) }\) such that for any stateful algorithm \(\mathcal {A}\) and any \(\lambda \in \mathbb {N}\),

    $$\begin{aligned} \textsf{Adv}_{\varPi ,\mathcal {A}}^{\textrm{CF}}(\lambda ):=\left| {\text {Pr}}\left[ \textsf{Priv}_{\varPi ,\mathcal {A}}^{\textrm{CF},0}(\lambda )=0\right] -{\text {Pr}}\left[ \textsf{Priv}_{\varPi ,\mathcal {A}}^{\textrm{CF},1}(\lambda )=0\right] \right| <\textsf{negl}{\left( \lambda \right) }, \end{aligned}$$

    where for \(b\in \{0,1\}\), \(\textsf{Priv}_{\varPi ,\mathcal {A}}^{\textrm{CF},b}(\lambda )\) is the output \(b'\) of \(\mathcal {A}\) in the following experiment:

    1. \((x_0,x_1,p,T)\leftarrow \mathcal {A}(1^\lambda )\), where \(x_0,x_1\in X_\lambda \), \(p\in P_\lambda \) and \(T\subseteq [m]\) is of size at most t.

    2. For each \(j=1,2,\ldots ,\ell \),

       (a) Let \(((\textsf{que}_i^{(j)})_{i\in [m]},\textsf{st}^{(j)})\leftarrow \textsf{Query}(1^\lambda ,x_b,\textsf{st}^{(j-1)},(\textsf{ans}_i^{(j-1)})_{i\in [m]})\) and give \((\textsf{que}_i^{(j)})_{i\in T}\) to \(\mathcal {A}\).

       (b) \(\mathcal {A}\) outputs \((\textsf{ans}_i^{(j)})_{i\in T}\).

    3. Let \((y,z)\leftarrow \textsf{Output}(1^\lambda ,\textsf{st}^{(\ell )},(\textsf{ans}_i^{(\ell )})_{i\in [m]})\) and give z to \(\mathcal {A}\).

    4. \(\mathcal {A}\) outputs a bit \(b'\in \{0,1\}\).

For a (possibly non-negligible) function \(\epsilon (\lambda )\), we define a weaker notion of an \(\epsilon \)-sound t-conflict-finding protocol \(\varPi \) as a protocol satisfying the requirements in Definition 3 except that the condition (2) is replaced with

$$\begin{aligned} {\text {Pr}}\left[ \textsf{Sound}_{\varPi ,\mathcal {A}}(\lambda )=1\right] <\epsilon . \end{aligned}$$

We say that \(\varPi \) is computationally t-conflict-finding if it satisfies the above requirements for PPT adversaries \(\mathcal {A}\).
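To make the soundness game concrete, the following Python sketch mirrors the control flow of \(\textsf{Sound}_{\varPi ,\mathcal {A}}(\lambda )\). The callables `Query`, `Answer`, `Output`, the function `F`, and the adversary object `A` are hypothetical stand-ins for the abstract algorithms above, not an implementation from this work; the conflict-finding and privacy experiments differ only in the adversary's initial output and in the final check.

```python
# Sketch of the soundness experiment Sound_{Pi,A}(lambda) from Definition 3.
# `Query`, `Answer`, `Output`, `F` and the adversary `A` are hypothetical
# stand-ins; only the control flow of the experiment is illustrated.
def sound_experiment(lam, ell, m, F, Query, Answer, Output, A):
    # Step 1: the adversary chooses an input x, a database p, and a corrupted set T.
    x, p, T = A.choose(lam)                     # T is a subset of [m] of size at most t
    st = None
    ans = {i: None for i in range(1, m + 1)}

    # Step 2: run the ell rounds; honest servers answer via Answer,
    # corrupted servers answer via the adversary.
    for j in range(1, ell + 1):
        que, st = Query(lam, x, st, ans)        # queries for all m servers
        bad = A.answer(j, {i: que[i] for i in T})
        ans = {i: bad[i] if i in T else Answer(lam, p, que[i])
               for i in range(1, m + 1)}

    # Steps 3-4: output 1 iff the client accepts a wrong value or declares failure.
    y, z = Output(lam, st, ans)
    wrong_accept = (z == "output" and y != F(p, x))
    return 1 if wrong_accept or (y is None and z == "failure") else 0
```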

4.3 Compiler from Conflict-Finding to Actively Secure Protocols

We construct an actively t-secure m-server protocol from a t-conflict-finding \((m-t)\)-server protocol. We give a sketch here and defer the formal proof to the full version.

Theorem 1

Suppose that there exists an \(\ell \)-round (resp. computationally) t-conflict-finding k-server protocol \(\varPi _{\textrm{CF}}\) for \(\mathcal {F}=(F_\lambda :P_\lambda \times X_\lambda \rightarrow Y_\lambda )_{\lambda \in \mathbb {N}}\). If \(m\ge t+k\), there exists an \(O(\ell m^2)\)-round (resp. computationally) actively t-secure m-server protocol \(\varPi \) for \(\mathcal {F}\) such that

  • \(\textrm{Comm}(\varPi )=O(m^2\cdot \textrm{Comm}(\varPi _{\textrm{CF}}))\);

  • \(\textrm{c}\text {-}\textrm{Comp}(\varPi )=O(m^2\cdot \textrm{c}\text {-}\textrm{Comp}(\varPi _{\textrm{CF}})+m^4)\);

  • \(\textrm{s}\text {-}\textrm{Comp}(\varPi )=O(m^2\cdot \textrm{s}\text {-}\textrm{Comp}(\varPi _{\textrm{CF}}))\).

Proof

(sketch). Define \(N:=\binom{m}{2}-\binom{m-t}{2}+1=O(m^2)\). Let V be the set of all m servers and \(\mathcal {G}^{(1)}\) be the complete graph on V. Consider the following protocol \(\varPi \) (a Python sketch of the client's loop follows the step list): For each \(j=1,2,\ldots ,N\),

  1. The client \(\textsf{C}\) finds a k-sized subset \(S^{(j)}\) of V such that \(\mathcal {G}^{(j)}[S^{(j)}]\) is connected, using the algorithm \(\mathcal {C}_k\) in Sect. 4.1.

  2. \(\textsf{C}\) executes the conflict-finding protocol \(\varPi _{\textrm{CF}}\) with the k servers in \(S^{(j)}\), and obtains an output \((y^{(j)},z^{(j)})\).

  3. If \(z^{(j)}=\textsf{output}\), then \(\textsf{C}\) outputs the y-output \(y^{(j)}\).

  4. If \(z^{(j)}=\textsf{failure}\), then \(\textsf{C}\) outputs a default value \(y_0\).

  5. If \(z^{(j)}\) is a non-trivial partition \((G_0^{(j)},G_1^{(j)})\) of \(S^{(j)}\), then \(\textsf{C}\) does the following:

     (a) Find an edge \(e^{(j)}\) of \(\mathcal {G}^{(j)}\) crossing the partition \((G_0^{(j)},G_1^{(j)})\), using the algorithm \(\mathcal {E}\) in Sect. 4.1. Such an edge exists since \(\mathcal {G}^{(j)}[S^{(j)}]\) is connected.

     (b) Let \(\mathcal {G}^{(j+1)}\) be the graph obtained by removing \(e^{(j)}\) from \(\mathcal {G}^{(j)}\).

     (c) Go back to Step 1.
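The following Python sketch summarizes the client's loop. Here `run_cf` is a hypothetical callable that executes \(\varPi _{\textrm{CF}}\) with the servers in S and returns the pair \((y^{(j)},z^{(j)})\) (with the partition, if any, given as two sets of server indices), and the brute-force searches below only stand in for the more efficient algorithms \(\mathcal {C}_k\) and \(\mathcal {E}\) of Sect. 4.1.

```python
from itertools import combinations

def _connected(S, edges):
    """Check by graph search that the subgraph induced on S is connected."""
    S = set(S)
    start = next(iter(S))
    seen, stack = {start}, [start]
    while stack:
        v = stack.pop()
        for e in edges:
            if v in e:
                (w,) = e - {v}
                if w in S and w not in seen:
                    seen.add(w)
                    stack.append(w)
    return seen == S

def compile_active(m, t, k, run_cf, default_y):
    V = range(1, m + 1)
    edges = {frozenset(e) for e in combinations(V, 2)}       # G^(1): complete graph on V
    N = m * (m - 1) // 2 - (m - t) * (m - t - 1) // 2 + 1    # number of iterations

    for _ in range(N):
        # Step 1: a k-sized subset S with G^(j)[S] connected (stand-in for C_k).
        S = next(set(c) for c in combinations(V, k) if _connected(c, edges))
        y, z = run_cf(S)                                     # Step 2: run Pi_CF with S
        if z == "output":                                    # Step 3: accept y
            return y
        if z == "failure":                                   # Step 4: default value
            return default_y
        G0, G1 = z                                           # Step 5: conflict found
        # Remove an edge of G^(j) crossing the partition (G0, G1) (stand-in for E).
        e = next(e for e in edges if len(e & G0) == 1 and len(e & G1) == 1)
        edges.remove(e)
    return default_y  # never reached when at most t servers are corrupted
```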

Privacy. An adversary corrupting a set T of at most t servers cannot learn the client's input from the interaction at Step 2, since \(|T\cap S^{(j)}|\le |T|\le t\) and \(\varPi _{\textrm{CF}}\) is private. The adversary also sees the sequence of graphs \(\mathcal {G}^{(1)},\mathcal {G}^{(2)},\ldots ,\mathcal {G}^{(N)}\), but, as Step 5 shows, this sequence is determined only by the sequence of z-outputs \(z^{(1)},z^{(2)},\ldots ,z^{(N)}\). Since \(\varPi _{\textrm{CF}}\) guarantees privacy even if z-outputs are leaked, the adversary learns no additional information.

Byzantine-Robustness. The client \(\textsf{C}\) outputs an incorrect result only if one of the following events occurs: (1) \(z^{(j)}=\textsf{output}\) and \(y^{(j)}\) is an incorrect result for some \(j\in [N]\), (2) \(z^{(j)}=\textsf{failure}\) for some \(j\in [N]\), or (3) \(z^{(j)}\) is a non-trivial partition for all \(j\in [N]\). It follows from the soundness of \(\varPi _{\textrm{CF}}\) that the first and second cases occur only with negligible probability.

We argue that the third case never occurs. Assume otherwise; then for all j, the z-output \(z^{(j)}\) of the j-th iteration is a non-trivial partition \((G_0^{(j)},G_1^{(j)})\) of \(S^{(j)}\). The conflict-finding property of \(\varPi _{\textrm{CF}}\) ensures that either \(G_0^{(j)}\) or \(G_1^{(j)}\) contains all honest servers in \(S^{(j)}\) (equivalently, the other part is contained in T), so the removed edge \(e^{(j)}=(i_1,i_2)\) satisfies \(i_1\in T\) or \(i_2\in T\), and hence the subgraph \(\mathcal {G}^{(j)}[H]\) remains a complete graph on \(H:=[m]\setminus T\) for all j. The edges removed in distinct iterations are distinct and all incident to T, and the total number of unordered pairs \((i_1,i_2)\) with \(i_1\in T\) or \(i_2\in T\) is \(N'=\binom{m}{2}-\binom{m-|T|}{2}\le N-1\). Hence the \((N'+1)\)-th iteration exists and the graph \(\mathcal {G}^{(N'+1)}\) has no edge incident to T. Therefore, the set of servers \(S^{(N'+1)}\) chosen in the \((N'+1)\)-th iteration is a subset of H: every server in T is isolated in \(\mathcal {G}^{(N'+1)}\), while \(\mathcal {G}^{(N'+1)}[H]\) is complete and \(|H|\ge m-t\ge k\). We have assumed that \(z^{(N'+1)}\) is a non-trivial partition \((G_0^{(N'+1)},G_1^{(N'+1)})\) of \(S^{(N'+1)}\), but the conflict-finding property ensures that \(G_0^{(N'+1)}\subseteq T\) or \(G_1^{(N'+1)}\subseteq T\); since \(S^{(N'+1)}\subseteq H\) is disjoint from T, one of the two parts would be empty, contradicting the non-triviality of the partition. This is a contradiction.    \(\square \)

4.4 Compiler from Passively Secure to Conflict-Finding Protocols

First, we show a basic construction of \(\epsilon \)-sound conflict-finding protocols for non-negligible \(\epsilon \). We give a sketch here and defer the formal proof to the full version.

Proposition 7

Let \(\varPi \) be a 1-round (resp. computationally) passively t-secure m-server protocol for \(\mathcal {F}=(F_\lambda :P_\lambda \times X_\lambda \rightarrow Y_\lambda )_{\lambda \in \mathbb {N}}\). Let \(M=\textsf{poly}{\left( \lambda \right) }\). Then, there exists a 2-round (resp. computationally) \(\epsilon \)-sound t-conflict-finding m-server protocol \(\varPi '\) for \(\mathcal {F}\) such that

  • \(\textrm{Comm}(\varPi ')=O(mM\cdot \textrm{Comm}(\varPi ))\);

  • \(\textrm{c}\text {-}\textrm{Comp}(\varPi ')=O(m^2M\cdot \textrm{c}\text {-}\textrm{Comp}(\varPi ))\);

  • \(\textrm{s}\text {-}\textrm{Comp}(\varPi ')=O(mM\cdot \textrm{s}\text {-}\textrm{Comp}(\varPi ))\);

where \(\epsilon =m/M+\textsf{negl}{\left( \lambda \right) }\) for some negligible function \(\textsf{negl}{\left( \lambda \right) }\).

Proof

(sketch). Consider the following protocol \(\varPi '\):

  • First Round

    1. The client \(\textsf{C}\) chooses \(\mu _*\) uniformly at random from [M].

    2. For all \(\mu \in [M]\), \(\textsf{C}\) computes queries \((\textsf{que}_1^{\langle \mu \rangle },\ldots ,\textsf{que}_m^{\langle \mu \rangle })\) of \(\varPi \) on his true input x if \(\mu =\mu _*\), and on a default input \(x_{\textrm{def}}\) otherwise.

    3. \(\textsf{C}\) sends the queries \((\textsf{que}_i^{\langle \mu \rangle })_{\mu \in [M]}\) to each server \(\textsf{S}_i\), who returns the answers \((\textsf{ans}_i^{\langle \mu \rangle })_{\mu \in [M]}\).

  • Second Round

    1. \(\textsf{C}\) sends all the queries \((\textsf{que}_k^{\langle \mu \rangle })_{k\in [m],\mu \ne \mu _*}\) for the default input \(x_{\textrm{def}}\) to all servers.

    2. For all \(k\in [m]\) and \(\mu \in [M]\setminus \{\mu _*\}\), each server \(\textsf{S}_i\) computes the answer to \(\textsf{que}_k^{\langle \mu \rangle }\) as \(\textsf{S}_k\) would, and returns it as \(\textsf{ans}_k^{\langle \mu \rangle }(i)\).

To obtain an output, \(\textsf{C}\) defines \(v_i=(\textsf{ans}_k^{\langle \mu \rangle }(i))_{k\in [m],\mu \ne \mu _*}\) for all \(i\in [m]\). For simplicity, we assume here that \(\textsf{ans}_i^{\langle \mu \rangle }(i)=\textsf{ans}_i^{\langle \mu \rangle }\) for all \(i\in [m]\): otherwise, the server \(\textsf{S}_i\) has returned different answers to the same query in the first and second rounds, and is immediately identified as malicious. The client \(\textsf{C}\) partitions the set of servers into equivalence classes \(G_0',\ldots ,G_\ell '\) under the equivalence relation defined by \(i\sim j\overset{\textrm{def}}{\iff } v_i=v_j\). If \(\ell =0\) (i.e., all servers belong to the same equivalence class), then he runs the output algorithm of \(\varPi \) on the answers \((\textsf{ans}_1^{\langle \mu _* \rangle },\ldots ,\textsf{ans}_m^{\langle \mu _* \rangle })\) to the queries for his true input, and outputs the result y along with \(z=\textsf{output}\). If \(\ell \ge 1\), then he outputs \(y=\bot \) and \(z=(G_0,G_1)\), where \(G_0=G_0'\) and \(G_1=G_1'\cup \cdots \cup G_\ell '\).
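A minimal Python sketch of this output computation follows; `first_round[i]` stands for \(\textsf{ans}_i^{\langle \mu _*\rangle }\), `second_round[i]` for the vector \(v_i\) (as a hashable tuple), and `output_alg` for the output algorithm of \(\varPi \). All three are hypothetical stand-ins, and the first-/second-round consistency check mentioned above is omitted.

```python
def cf_output(m, first_round, second_round, output_alg):
    # Group the servers into equivalence classes under v_i = v_j.
    classes = {}
    for i in range(1, m + 1):
        classes.setdefault(second_round[i], []).append(i)
    groups = list(classes.values())

    if len(groups) == 1:
        # All servers agree: reconstruct from the answers for the true input.
        y = output_alg([first_round[i] for i in range(1, m + 1)])
        return y, "output"

    # Otherwise report a non-trivial partition (G_0, G_1) of [m].
    G0 = set(groups[0])
    G1 = set().union(*[set(g) for g in groups[1:]])
    return None, (G0, G1)
```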

Conflict-Finding. Let T be a set of corrupted servers. Since honest servers \(i,j\notin T\) always return the same answer to the same query, we have that \(\textsf{ans}_k^{\langle \mu \rangle }(i)=\textsf{ans}_k^{\langle \mu \rangle }(j)\) for all \(k\in [m]\) and \(\mu \in [M]\setminus \{\mu _*\}\), and hence \(v_i=v_j\). Therefore, the set of honest servers is contained in an equivalence class and it holds that \(\overline{T}\subseteq G_0=\overline{G_1}\) or \(\overline{T}\subseteq G_1=\overline{G_0}\).

Soundness. First, note that the protocol \(\varPi '\) never outputs \(z=\textsf{failure}\). Assume that \(\varPi '\) outputs \(z=\textsf{output}\). Then, all servers belong to the same equivalence class, which implies that \(v_i=v_j\) for any \(i,j\in [m]\). To make the client accept an incorrect result, an adversary has to make at least one corrupted server \(\textsf{S}_i\) submit an incorrect answer to exactly the query \(\textsf{que}_i^{\langle \mu _* \rangle }\) for the client's true input. (If a corrupted server instead submits an incorrect \(\textsf{ans}_i^{\langle \mu \rangle }\) for some \(\mu \ne \mu _*\), this is detected when it is compared with the answer \(\textsf{ans}_i^{\langle \mu \rangle }(j)\) of an honest server \(j\notin T\).) However, the adversary cannot learn which query encodes the client's true input due to the privacy of \(\varPi \). Therefore, her best possible strategy is to guess \(\mu _*\) uniformly at random, which succeeds only with probability 1/M. The union bound implies that the error probability is at most m/M.

Privacy. Since M queries are generated independently, an adversary learns no information on the client’s input x in the first round. The queries revealed in the second round are the ones for a default input \(x_{\textrm{def}}\), which is independent of x, and hence the adversary learns no additional information. The privacy holds even if the z-output z is leaked, since z is determined only by \((v_i)_{i\in [m]}\), which can be simulated from information that the adversary learns up to the second round.    \(\square \)

Next, we show that the error probability of the basic construction can be made negligible by parallel execution. The proof is deferred to the full version.

Theorem 2

Let \(\varPi \) be a 1-round (resp. computationally) passively t-secure m-server protocol for \(\mathcal {F}=(F_\lambda :P_\lambda \times X_\lambda \rightarrow Y_\lambda )_{\lambda \in \mathbb {N}}\). Then there exists a 2-round (resp. computationally) t-conflict-finding m-server protocol \(\varPi _{\textrm{CF}}\) for \(\mathcal {F}\) such that

  • \(\textrm{Comm}(\varPi _{\textrm{CF}})=O(m^2\lambda \cdot \textrm{Comm}(\varPi ))\);

  • \(\textrm{c}\text {-}\textrm{Comp}(\varPi _{\textrm{CF}})=O(m^3\lambda \cdot \textrm{c}\text {-}\textrm{Comp}(\varPi ))\);

  • \(\textrm{s}\text {-}\textrm{Comp}(\varPi _{\textrm{CF}})=O(m^2\lambda \cdot \textrm{s}\text {-}\textrm{Comp}(\varPi ))\).

Finally, by combining Theorems 1 and 2, we obtain our generic construction of an \(O(m^2)\)-round actively t-secure m-server protocol from any 1-round passively t-secure k-server protocol for \(k\le m-t\).

Theorem 3

Suppose that \(m>2t\). Let \(k\le m-t\) and \(\varPi \) be a 1-round (resp. computationally) passively t-secure k-server protocol for \(\mathcal {F}\). Then there exists an \(O(m^2)\)-round (resp. computationally) actively t-secure m-server protocol \(\varPi '\) for \(\mathcal {F}\) such that

  • \(\textrm{Comm}(\varPi ')=O(m^4\lambda \cdot \textrm{Comm}(\varPi ))\);

  • \(\textrm{c}\text {-}\textrm{Comp}(\varPi ')=O(m^5\lambda \cdot \textrm{c}\text {-}\textrm{Comp}(\varPi ))\);

  • \(\textrm{s}\text {-}\textrm{Comp}(\varPi ')=O(m^4\lambda \cdot \textrm{s}\text {-}\textrm{Comp}(\varPi ))\).
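For instance, the communication bound is obtained by applying Theorem 2 to the k-server protocol \(\varPi \), which gives a k-server conflict-finding protocol \(\varPi _{\textrm{CF}}\) with \(\textrm{Comm}(\varPi _{\textrm{CF}})=O(k^2\lambda \cdot \textrm{Comm}(\varPi ))\), and then applying Theorem 1 with \(m\ge t+k\): since \(k\le m\),

$$\begin{aligned} \textrm{Comm}(\varPi ')=O(m^2\cdot \textrm{Comm}(\varPi _{\textrm{CF}}))=O(m^2k^2\lambda \cdot \textrm{Comm}(\varPi ))=O(m^4\lambda \cdot \textrm{Comm}(\varPi )). \end{aligned}$$

The bounds on \(\textrm{c}\text {-}\textrm{Comp}(\varPi ')\) and \(\textrm{s}\text {-}\textrm{Comp}(\varPi ')\) follow in the same way.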

4.5 Instantiations

By applying our compiler in Theorem 3 to the passively secure protocols in Propositions 1 and 2, we obtain actively secure protocols for \(\textsc {Index}_N\).

Corollary 1

Suppose that \(m\ge 3^t+t\). Let \(N\in \textsf{poly}{\left( \lambda \right) }\). Then, there exists an actively t-secure \(O(m^2)\)-round m-server protocol \(\varPi \) for \(\textsc {Index}_N\) such that

  • \(\textrm{Comm}(\varPi )=\exp (O(\sqrt{\log N\log \log N}))\cdot t 3^t m^4 \lambda \);

  • \(\textrm{c}\text {-}\textrm{Comp}(\varPi )=\exp (O(\sqrt{\log N\log \log N}))\cdot t 3^t m^5 \lambda \);

  • \(\textrm{s}\text {-}\textrm{Comp}(\varPi )=N^2\cdot \exp (O(\sqrt{\log N\log \log N}))\cdot 2^t m^4 \lambda \).

In particular, \(\max \{\textrm{Comm}(\varPi ),\textrm{c}\text {-}\textrm{Comp}(\varPi )\}=N^{o(1)}\cdot 2^{O(t)}\lambda \).

Corollary 2

Assume a pseudorandom generator \(G:\{0,1\}^\lambda \rightarrow \{0,1\}^{2(\lambda +1)}\). Suppose that \(m\ge 2^t+t\). Let \(N\in \textsf{poly}{\left( \lambda \right) }\). Then, there exists a computationally actively t-secure \(O(m^2)\)-round m-server protocol \(\varPi \) for \(\textsc {Index}_N\) such that

  • \(\textrm{Comm}(\varPi )=O(\log N\cdot \lambda ^2\cdot t2^tm^4)\);

  • \(\textrm{c}\text {-}\textrm{Comp}(\varPi )\) is \(O(\log N\cdot t2^tm^5)\) invocations of G;

  • \(\textrm{s}\text {-}\textrm{Comp}(\varPi )\) is \(O(N^2\log N\cdot tm^4)\) invocations of G.

In particular, \(\max \{\textrm{Comm}(\varPi ),\textrm{c}\text {-}\textrm{Comp}(\varPi )\}=\log N\cdot 2^{O(t)}\cdot \textsf{poly}{\left( \lambda \right) }\).

By applying Theorem 3 to Proposition 3, we obtain an actively secure protocol for multivariate polynomials.

Corollary 3

Let \(N,D,M\in \textsf{poly}{\left( \lambda \right) }\). Let R be a ring such that for any \(a\in \{1,2,\ldots ,m-1\}\), an element \(a\cdot 1_R\) has an inverse in R. Suppose that

$$\begin{aligned} m>\left( \frac{D}{2}+1\right) t. \end{aligned}$$

Then, there exists an actively t-secure \(O(m^2)\)-round m-server protocol \(\varPi \) for \(\textsc {Poly}_{N,D,M}(R)\) such that

  • \(\textrm{Comm}(\varPi )=O(Nm^4\lambda )\) ring elements;

  • \(\textrm{c}\text {-}\textrm{Comp}(\varPi )=O(Ntm^6\lambda )\) ring operations;

  • \(\textrm{s}\text {-}\textrm{Comp}(\varPi )=O(NMDm^4\lambda )\) ring operations.

In particular, \(\max \{\textrm{Comm}(\varPi ),\textrm{c}\text {-}\textrm{Comp}(\varPi )\}=N\cdot \textsf{poly}{\left( m,\lambda \right) }\).

By applying Theorem 3 to Proposition 4, we can reduce the required number of servers by a factor of d assuming homomorphic encryption for degree-d polynomials.

Corollary 4

Let \(d=O(1)\) and R be a ring such that for any \(a\in \{1,2,\ldots ,\max \{d,m-1\}\}\), an element \(a\cdot 1_R\) has an inverse in R, where \(1_R\) is the multiplicative identity of R. Assume a homomorphic encryption scheme \(\textsf{HE}\) for degree-d polynomials over R. Suppose that

$$\begin{aligned} m>\left( \frac{D}{d+1}+1\right) t. \end{aligned}$$

Let \(M,N\in \textsf{poly}{\left( \lambda \right) }\) and \(D=O(1)\). Then there exists a computationally actively t-secure \(O(m^2)\)-round m-server protocol \(\varPi \) for \(\textsc {Poly}_{N,D,M}(R)\) such that

  • \(\textrm{Comm}(\varPi )=O(Nm^5\lambda \cdot \ell _{\textsf{ct}})\), where \(\ell _{\textsf{ct}}\) is the description length of ciphertexts of \(\textsf{HE}\);

  • \(\textrm{c}\text {-}\textrm{Comp}(\varPi )=O((Nt\cdot \tau _{\textsf{Enc}}+\tau _{\textsf{Dec}})m^6\lambda )\), where \(\tau _{\textsf{Dec}}\) and \(\tau _{\textsf{Enc}}\) are the running time of the decryption and encryption algorithms of \(\textsf{HE}\), respectively;

  • \(\textrm{s}\text {-}\textrm{Comp}(\varPi )=O(M\cdot m^4\lambda \tau _{\textsf{Eval}})\), where \(\tau _{\textsf{Eval}}\) is the running time per operation of the evaluation algorithm of \(\textsf{HE}\).

In particular, \(\max \{\textrm{Comm}(\varPi ),\textrm{c}\text {-}\textrm{Comp}(\varPi )\}=N\cdot \textsf{poly}{\left( m,\lambda \right) }\).

By applying Theorem 3 to Proposition 5, we obtain an actively t-secure protocol for polynomials achieving the minimum number of servers \(2t+1\).

Corollary 5

Suppose that \(m=2t+1\). Assume that the \((\delta ,q)\)-sLPN assumption holds for a constant \(0\le \delta \le 1\) and a sequence \(q=(q(\lambda ))_{\lambda \in \mathbb {N}}\) of prime powers that are computable in polynomial time in \(\lambda \). Let \(L,M,N\in \textsf{poly}{\left( \lambda \right) }\) and \(D=O(\log \lambda /\log \log \lambda )\). Then, there exists a computationally actively t-secure \(O(m^2)\)-round m-server protocol \(\varPi \) for \(\textsc {Poly}_{N,D,M}^L(\mathbb {F}_q)\) such that

  • \(\textrm{Comm}(\varPi )=\widetilde{O}((M^{2/\delta }N+L)(\log q)m^5\lambda ^2)\);

  • \(\textrm{c}\text {-}\textrm{Comp}(\varPi )=\widetilde{O}((M^{2/\delta }N+L)(\log q)m^6\lambda ^2)\);

  • \(\textrm{s}\text {-}\textrm{Comp}(\varPi )=\widetilde{O}(M^{1/\delta +1}L(\log q)m^4\lambda ^2)\).

In particular, \(\max \{\textrm{Comm}(\varPi ),\textrm{c}\text {-}\textrm{Comp}(\varPi )\}=(M^{2/\delta }N+L)\log q\cdot \textsf{poly}{\left( m,\lambda \right) }\).

Finally, by applying Theorem 3 to Proposition 6, we obtain an actively secure protocol for constant-depth circuits.

Corollary 6

Let \(N,M\in \textsf{poly}{\left( \lambda \right) }\) and \(D=O(1)\). Suppose that

$$\begin{aligned} m\ge \left( \frac{(\log M+3)^{D-1}}{2}+1\right) t. \end{aligned}$$

Then, there exists an actively t-secure \(O(m^2)\)-round m-server protocol \(\varPi \) for \(\textsc {Circ}_{N,D,M}\) such that

  • \(\textrm{Comm}(\varPi )=O((\log M)^{D-1}N(\log N)\lambda ^2 m^5)\);

  • \(\textrm{c}\text {-}\textrm{Comp}(\varPi )=O((\log M)^{D-1}N(\log N)\lambda tm^6+(\log M)^2\lambda ^2 m^6)\);

  • \(\textrm{s}\text {-}\textrm{Comp}(\varPi )=O(M(\log M) N m^4\lambda ^2)\).

In particular, \(\max \{\textrm{Comm}(\varPi ),\textrm{c}\text {-}\textrm{Comp}(\varPi )\}=N\cdot \textsf{poly}{\left( m,\lambda \right) }\).

5 Non-interactive Actively Secure Protocols

In this section, we show our compiler from one-round passively t-secure k-server protocols to one-round actively t-secure m-server protocols such that \(m=O(k\log k)+2t\). To this end, we introduce a novel combinatorial object called a locally surjective map family, a variant of perfect hash families with a stronger property. We show a probabilistic construction of such families in Sect. 5.1 and then a generic compiler from passively secure to actively secure protocols in Sect. 5.2.

5.1 Locally Surjective Map Family

We show the formal definition of locally surjective map families.

Definition 4

Let \(m,h,k\in \mathbb {N}\) and \(\mathcal {L}\) be a family of maps from [m] to [k]. We call \(\mathcal {L}\) an \((m,h,k)\)-locally surjective map family if \(|A_H|>|\mathcal {L}|/2\) for any \(H\in \binom{[m]}{h}\), where \(A_H=\{f\in \mathcal {L}:f(H)=[k]\}\).

A locally surjective map family satisfies a stronger property than a nearly perfect hash family \(\mathcal {L}'\) introduced in [11], which only requires that for any \(H\in \binom{[m]}{h}\), there exists at least one map \(f\in \mathcal {L}'\) such that \(f(H)=[k]\).
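For instance, \(\mathcal {L}=\{f_1,f_2,f_3\}\) with \(f_1=(1,2,2)\), \(f_2=(2,1,2)\) and \(f_3=(2,2,1)\) (writing \(f=(f(1),f(2),f(3))\)) is a \((3,2,2)\)-locally surjective map family: for every 2-subset \(H\subseteq [3]\), exactly two of the three maps satisfy \(f(H)=[2]\), so \(|A_H|=2>|\mathcal {L}|/2\). A nearly perfect hash family would only need one such map per H.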

We show a probabilistic construction of an \((m,h,k)\)-locally surjective map family of size O(m) for \(k=O(h/\log h)\). The formal proof is deferred to the full version.

Proposition 8

Let \(m,h,k\in \mathbb {N}\) be such that \(h\ge 15\), \(m\ge 15\) and \(k\le h/(\gamma \ln h)\), where \(\gamma :=1+(\ln 3-\ln \ln 15)/(\ln 15)<1.04\). Then, there exists an \((m,h,k)\)-locally surjective map family \(\mathcal {L}\) such that \(w:=|\mathcal {L}|=14m\).
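The following Python sketch illustrates the probabilistic construction: it samples \(w=14m\) uniformly random maps and spot-checks the condition \(|A_H|>w/2\) on a few random h-subsets H. This is only an illustration under the assumptions of Proposition 8; certifying a genuine family would require checking every \(H\in \binom{[m]}{h}\) (or invoking the proof in the full version).

```python
import random

def sample_family(m, k, w=None):
    """Sample w = 14m uniformly random maps [m] -> [k] (maps stored as lists)."""
    w = 14 * m if w is None else w
    return [[random.randrange(1, k + 1) for _ in range(m)] for _ in range(w)]

def locally_surjective_on(family, H, k):
    """Check |A_H| > |family|/2, where A_H = {f in family : f(H) = [k]}."""
    hits = sum(1 for f in family if {f[i - 1] for i in H} == set(range(1, k + 1)))
    return hits > len(family) / 2

def spot_check(m, h, k, trials=100):
    """Monte-Carlo check of the locally surjective condition on random h-subsets."""
    fam = sample_family(m, k)
    return all(locally_surjective_on(fam, random.sample(range(1, m + 1), h), k)
               for _ in range(trials))
```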

5.2 Compiler from Passively Secure to Actively Secure Protocols

Based on locally surjective map families, we show our construction of one-round actively secure protocols from any one-round passively secure protocol. We give a sketch here and defer the formal proof to the full version.

Theorem 4

Suppose that there exists a 1-round (resp. computationally) passively t-secure k-server protocol \(\varPi =(\textsf{Query},\textsf{Answer},\textsf{Output})\) for \(\mathcal {F}=(F_\lambda :P_\lambda \times X_\lambda \rightarrow Y_\lambda )_{\lambda \in \mathbb {N}}\). If there exists an \((m,m-2t,k)\)-locally surjective map family \(\mathcal {L}\) of size \(w=\textsf{poly}{\left( \lambda \right) }\), there exists a 1-round (resp. computationally) actively t-secure m-server protocol \(\varPi '=(\textsf{Query}',\textsf{Answer}',\textsf{Output}')\) for \(\mathcal {F}\) such that

  • \(\textrm{Comm}(\varPi ')=O(twm\cdot \textrm{Comm}(\varPi ))\);

  • \(\textrm{c}\text {-}\textrm{Comp}(\varPi ')=O(twm\cdot \textrm{c}\text {-}\textrm{Comp}(\varPi ))\);

  • \(\textrm{s}\text {-}\textrm{Comp}(\varPi ')=O(tw\cdot \textrm{s}\text {-}\textrm{Comp}(\varPi ))\).

Proof

(sketch). Let \(\mathcal {L}=\{f_1,\ldots ,f_w\}\) be an \((m,h,k)\)-locally surjective map family, where \(h=m-2t\). For \(u\in [w]\) and \(j\in [k]\), define \(G_{u,j}=f_u^{-1}(j)=\{i\in [m]:f_u(i)=j\}\). Consider the following protocol \(\varPi '\): For all \(u\in [w]\) and \(\ell \in [t+1]\) (in parallel),

  1. The client \(\textsf{C}\) computes k queries \((\textsf{que}_1^{(u,\ell )},\ldots ,\textsf{que}_k^{(u,\ell )})\) of \(\varPi \).

  2. \(\textsf{C}\) sends \(\textsf{que}_{f_u(i)}^{(u,\ell )}\) to each server \(\textsf{S}_i\).

  3. Each \(\textsf{S}_i\) returns an answer \(\textsf{ans}_i^{(u,\ell )}\) as the \(f_u(i)\)-th server would answer to \(\textsf{que}_{f_u(i)}^{(u,\ell )}\) in \(\varPi \).

To obtain an output, \(\textsf{C}\) sets \(S\leftarrow [m]\) and \(L\leftarrow 1\), and does the following (a Python sketch of this output phase appears after the steps):

  1. Check whether, for all \(u\in [w]\) and \(j\in [k]\), the answers \(\textsf{ans}_i^{(u,L)}\) returned by the remaining servers \(\textsf{S}_i\) with \(i\in G_{u,j}\cap S\) are identical with each other.

  2. If so, let \(\alpha _{u,j}\) be the unique answer of the servers in \(G_{u,j}\cap S\) and run the output algorithm of \(\varPi \) on \((\alpha _{u,1},\ldots ,\alpha _{u,k})\) to obtain \(y_u\). Then, output the majority of \(y_1,\ldots ,y_w\).

  3. Otherwise, find a pair \((i_1,i_2)\) of remaining servers that are mapped to the same group but returned different answers, that is, \(f_u(i_1)=f_u(i_2)\) and \(\textsf{ans}_{i_1}^{(u,L)}\ne \textsf{ans}_{i_2}^{(u,L)}\) for some \(u\in [w]\). Note that at least one of them is malicious. Then, update \(S\leftarrow S\setminus \{i_1,i_2\}\) and \(L\leftarrow L+1\), and go back to Step 1.
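A minimal Python sketch of this output phase follows; `answers[(i, u, L)]` stands for \(\textsf{ans}_i^{(u,L)}\), `maps[u]` for the map \(f_u\) (as a dictionary \(i\mapsto f_u(i)\)), and `output_alg` for the output algorithm of \(\varPi \); all are hypothetical stand-ins. Repetitions in which some group has no remaining server are skipped; since more than w/2 of the maps are surjective on \(H\cap S\), the correct value still forms a majority of the remaining \(y_u\).

```python
from collections import Counter

def active_output(m, t, w, k, maps, answers, output_alg):
    S, L = set(range(1, m + 1)), 1
    while True:
        # Step 1: look for two remaining servers in the same group whose answers differ.
        conflict = next(((i1, i2)
                         for u in range(1, w + 1)
                         for i1 in S for i2 in S
                         if i1 < i2 and maps[u][i1] == maps[u][i2]
                         and answers[(i1, u, L)] != answers[(i2, u, L)]), None)
        if conflict is not None:
            # Step 3: at least one of the pair is malicious; eliminate both, next repetition.
            S -= set(conflict)
            L += 1      # at most t iterations occur, so L never exceeds t + 1
            continue
        # Step 2: all represented groups are consistent; output the majority of the y_u.
        ys = []
        for u in range(1, w + 1):
            groups = {maps[u][i]: answers[(i, u, L)] for i in S}   # one answer per group
            if set(groups) == set(range(1, k + 1)):                # all k groups represented
                ys.append(output_alg([groups[j] for j in range(1, k + 1)]))
        return Counter(ys).most_common(1)[0][0]
```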

Privacy. An adversary corrupting a set T of at most t servers can only learn queries received by a set \(f_u(T)\) of servers in \(\varPi \). Since \(|f_u(T)|\le |T|\le t\), the privacy of \(\varPi '\) follows from that of \(\varPi \).

Byzantine-Robustness. An adversary succeeds in letting the client accept an incorrect result only if at least w/2 out of \(y_1,\ldots ,y_w\) are incorrect in some iteration (say, L) in the output phase of \(\textsf{C}\). This implies that for at least w/2 u's, there exists a remaining corrupted server \(i\in T\cap S\) who submits an incorrect answer \(\widetilde{\textsf{ans}}_i^{(u,L)}\ne \textsf{ans}_i^{(u,L)}\). On the other hand, since at most one honest server is eliminated from S in each iteration, it holds that \(|H\cap S|\ge (m-t)-t=m-2t\), where H is the set of all honest servers. Since \(|H\cap S|\ge m-2t\) and \(\mathcal {L}\) is an \((m,m-2t,k)\)-locally surjective map family, more than w/2 of the maps \(f_u\) satisfy \(f_u(H\cap S)=[k]\); hence \(f_u(H\cap S)=[k]\) holds for at least one of the above w/2 u's. In other words, there exists a remaining honest server \(i'\in H\cap S\) such that \(f_u(i')=f_u(i)\), and the answer \(\widetilde{\textsf{ans}}_i^{(u,L)}\) is compared with the correct answer \(\textsf{ans}_{i'}^{(u,L)}\) from the honest server \(i'\). Thus, the client can detect the malicious behavior of the corrupted server i. Therefore, the client either eliminates at least one malicious server in each iteration or outputs the correct result, and obtains the correct result after at most t iterations.    \(\square \)

To obtain a concrete compiler from Theorem 4, we plug in the \((m,h,k)\)-locally surjective map family in Proposition 8 with \(h=m-2t\).

Theorem 5

Suppose that there exists a 1-round (resp. computationally) passively t-secure k-server protocol \(\varPi \) for \(\mathcal {F}\). If

$$\begin{aligned} m\ge 2t+15~\text {and}~\frac{m-2t}{\gamma \ln (m-2t)}\ge k, \end{aligned}$$

where \(1<\gamma <1.04\) is the constant in Proposition 8, then there exists a 1-round (resp. computationally) actively t-secure m-server protocol \(\varPi '\) for \(\mathcal {F}\) such that

  • \(\textrm{Comm}(\varPi ')=O(tm^2\cdot \textrm{Comm}(\varPi ))\);

  • \(\textrm{c}\text {-}\textrm{Comp}(\varPi ')=O(tm^2\cdot \textrm{c}\text {-}\textrm{Comp}(\varPi ))\);

  • \(\textrm{s}\text {-}\textrm{Comp}(\varPi ')=O(tm\cdot \textrm{s}\text {-}\textrm{Comp}(\varPi ))\).
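For instance, for \(t=1\) the condition is already met with \(m=17\) servers for any \(k\le 5\), since \((m-2t)/(\gamma \ln (m-2t))=15/(\gamma \ln 15)>5.3\) for every \(\gamma <1.04\).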

Remark 3

The computational complexity of the construction in Theorem 5 does not include the cost of finding a locally surjective map family \(\mathcal {L}\). We note that the choice of \(\mathcal {L}\) does not affect the security of the protocol. Hence \(\mathcal {L}\) can be constructed before the protocol starts and reused any number of times.

5.3 Instantiations

By applying our compiler in Theorem 5 to the protocols in Propositions 1 and 2, we obtain the following corollaries. The formal proof appears in the full version.

Corollary 7

Suppose that \(m\ge \max \{2t3^t+2t,2t+15\}.\) Let \(N\in \textsf{poly}{\left( \lambda \right) }\). Then, there exists an actively t-secure 1-round m-server protocol \(\varPi \) for \(\textsc {Index}_N\) such that

  • \(\textrm{Comm}(\varPi )=\exp (O(\sqrt{\log N\log \log N}))\cdot t^2 3^t m^2\);

  • \(\textrm{c}\text {-}\textrm{Comp}(\varPi )=\exp (O(\sqrt{\log N\log \log N}))\cdot t^2 3^t m^2\);

  • \(\textrm{s}\text {-}\textrm{Comp}(\varPi )=N^2\cdot \exp (O(\sqrt{\log N\log \log N}))\cdot t 2^t m\).

In particular, \(\max \{\textrm{Comm}(\varPi ),\textrm{c}\text {-}\textrm{Comp}(\varPi )\}=N^{o(1)}\cdot 2^{O(t)}\).

Corollary 8

Assume a pseudorandom generator \(G:\{0,1\}^\lambda \rightarrow \{0,1\}^{2(\lambda +1)}\). Suppose that \(m\ge \max \{t2^{t+1}+2t,2t+15\}.\) Let \(N\in \textsf{poly}{\left( \lambda \right) }\). Then, there exists a computationally actively t-secure 1-round m-server protocol \(\varPi \) for \(\textsc {Index}_N\) such that

  • \(\textrm{Comm}(\varPi )=O(\log N\cdot \lambda \cdot t^2 2^t m^2)\);

  • \(\textrm{c}\text {-}\textrm{Comp}(\varPi )\) is \(O(\log N\cdot t^2 2^t m^2)\) invocations of G;

  • \(\textrm{s}\text {-}\textrm{Comp}(\varPi )\) is \(O(N^2\log N\cdot t 2^t m)\) invocations of G.

In particular, \(\max \{\textrm{Comm}(\varPi ),\textrm{c}\text {-}\textrm{Comp}(\varPi )\}=\log N\cdot 2^{O(t)}\cdot \textsf{poly}{\left( \lambda \right) }\).

Note that it is possible to apply the compiler in Theorem 5 to the passively secure k-server protocols in Propositions 3, 4, 5, and 6. Since \(k>t\), the number of servers of the resulting protocols is \(\varOmega (k\log k)+2t=\varOmega (t\log t)\). On the other hand, these protocols can also be made actively secure by using the standard error correction algorithm [41] or the technique of [38], and one can then obtain actively secure protocols that use a smaller number of servers, O(t). We thus do not show instantiations based on these protocols.