Abstract
We present a new efficient protocol for computing private set union (PSU). Here two semi-honest parties, each holding a dataset of known size (or of a known upper bound), wish to compute the union of their sets without revealing anything else to either party. Our protocol is in the OT hybrid model. Beyond OT extension, it is fully based on symmetric-key primitives. We motivate the PSU primitive by its direct application to network security and other areas.
At the technical core of our PSU construction is the reverse private membership test (RPMT) protocol. In RPMT, the sender with input \(x^*\) interacts with a receiver holding a set X. As a result, the receiver learns (only) the bit indicating whether \(x^* \in X\), while the sender learns nothing about the set X. (Previous similar protocols provide output to the opposite party, hence the term “reverse” private membership.) We believe our RPMT abstraction and constructions may be a building block in other applications as well.
We demonstrate the practicality of our proposed protocol with an implementation. For input sets of size \(2^{20}\) and using a single thread, our protocol requires 238 s to securely compute the set union, regardless of the bit length of the items. Our protocol is amenable to parallelization: with 32 threads instead of 1, it requires only 13.1 s, an improvement by a factor of \(18.25{\times }\).
To the best of our knowledge, ours is the first protocol that reports on large-size experiments, makes code available, and avoids extensive use of computationally expensive public-key operations. (No PSU code is publicly available for prior work, and the only prior symmetric-key-based work reports on small experiments and focuses on the simpler 3-party, 1-corruption setting.) Our work improves the reported PSU state of the art by a factor of up to \(7{,}600{\times }\) for large instances.
Notes
- 1. Of course, \(x\in \{0,1\}^*\) needs to be “hashed down” to an element of the field we are working with. This can be done, e.g., by applying a collision-resistant hash function. For simplicity, we mention this step here but do not formalize it.
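The hashing-down step in this note can be sketched as follows. This is only an illustration under stated assumptions: the field is taken to be one whose elements are represented as \(\sigma\)-bit strings with \(\sigma = 64\), SHA-256 stands in for the collision-resistant hash function, and the name `hash_to_field` is hypothetical. Truncating the digest lowers collision resistance to roughly \(\sigma/2\) bits, so a real instantiation would pick \(\sigma\) accordingly.

```python
import hashlib

SIGMA = 64  # assumed bit length of a field element (not taken from the paper)


def hash_to_field(x: bytes, sigma: int = SIGMA) -> int:
    """Map an arbitrary-length item to a sigma-bit integer.

    Assumption: field elements are encoded as sigma-bit strings, and
    SHA-256 plays the role of the collision-resistant hash function.
    """
    if not 0 < sigma <= 256:
        raise ValueError("sigma must be between 1 and 256 for SHA-256")
    digest = hashlib.sha256(x).digest()
    # Keep the sigma most significant bits of the 256-bit digest.
    return int.from_bytes(digest, "big") >> (256 - sigma)


if __name__ == "__main__":
    print(hex(hash_to_field(b"alice@example.com")))
```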
Acknowledgments
We thank all anonymous reviewers and Brice Minaud for insightful feedback.
Vladimir Kolesnikov was supported in part by Sandia National Laboratories, a multimission laboratory managed and operated by National Technology and Engineering Solutions of Sandia, LLC., a wholly owned subsidiary of Honeywell International, Inc., for the U.S. Department of Energy’s National Nuclear Security Administration under contract DE-NA-0003525. He was also supported in part by the Office of the Director of National Intelligence (ODNI), Intelligence Advanced Research Projects Activity (IARPA), via 2019-1902070008. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies, either expressed or implied, of ODNI, IARPA, or the U.S. Government. The U.S. Government is authorized to reproduce and distribute reprints for governmental purposes notwithstanding any copyright annotation therein.
Mike Rosulek and Ni Trieu were partially supported by NSF awards #1617197, a Google faculty award, and a Visa faculty award.
A RPMT Optimization
In the RPMT protocol, the receiver computes a polynomial P with special output s. The sender computes \(s^* = P(h(x^*)) \oplus q^*\), where \(q^*\) is its OPRF output. Then the parties use PEQT to securely compare s to \(s^*\).
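The computation just described can be illustrated with a toy sketch. Everything concrete here is an assumption made for illustration: the prime field \(\mathbb{Z}_p\) with \(p = 2^{61}-1\) replaces whichever field the protocol uses, HMAC-SHA256 stands in for the OPRF \(F\) (so nothing is oblivious), and the names `h`, `prf`, `lagrange_eval`, `receiver_points`, and `sender_value` are hypothetical. The sketch evaluates the Lagrange form directly rather than sending coefficients.

```python
import hashlib
import hmac

P_MOD = 2**61 - 1  # toy prime modulus (assumption, not the paper's field)
SIGMA = 60         # bit length of s and PRF outputs, so s XOR q < P_MOD


def h(x: bytes) -> int:
    """Hash an item down to an element of Z_p."""
    return int.from_bytes(hashlib.sha256(x).digest(), "big") % P_MOD


def prf(k: bytes, x: bytes) -> int:
    """Stand-in for F(k, x): HMAC-SHA256 truncated to SIGMA bits."""
    mac = hmac.new(k, x, hashlib.sha256).digest()
    return int.from_bytes(mac, "big") >> (256 - SIGMA)


def lagrange_eval(points, z: int) -> int:
    """Evaluate at z the unique polynomial of degree < n through points."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (z - xj) % P_MOD
                den = den * (xi - xj) % P_MOD
        total = (total + yi * num * pow(den, -1, P_MOD)) % P_MOD
    return total


def receiver_points(k: bytes, X, s: int):
    """Receiver side: the points (h(x_i), s XOR F(k, x_i)) that define P."""
    return [(h(x), s ^ prf(k, x)) for x in X]


def sender_value(points, x_star: bytes, q_star: int) -> int:
    """Sender side: s* = P(h(x*)) XOR q*."""
    return lagrange_eval(points, h(x_star)) ^ q_star


if __name__ == "__main__":
    k, s = b"\x01" * 16, 123456789
    pts = receiver_points(k, [b"a", b"b", b"c"], s)
    print(sender_value(pts, b"b", prf(k, b"b")) == s)  # member of X
    print(sender_value(pts, b"z", prf(k, b"z")) == s)  # non-member
```

For a member \(x^* = x_i\), the polynomial returns \(s \oplus q_i\) and the sender's own \(q^* = q_i\) cancels the mask, so \(s^* = s\). For a non-member, \(s^*\) equals \(s\) only with probability about \(1/p\) in this toy setting.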
In the context of PSU, it is not necessary to use PEQT for this step. Instead, the sender can simply send \(s^*\) to the receiver. The logic is as follows. If \(x^* \in X\), the receiver should learn only this fact (and nothing else about \(x^*\)). This is still the case after the optimization because the sender computes the same polynomial output \(s^* = s\) for any such \(x^* \in X\). If \(x^* \not \in X\), the receiver will eventually learn \(x^*\) as part of the PSU output (and can infer that \(x^*\) was contributed by the sender). The PSU simulator will therefore have the value \(x^*\), and it can perfectly simulate the polynomial output \(s^* = P(h(x^*)) \oplus q^*\).
We now formalize the details of this modification. Rather than define a weaker/leaky version of RPMT, we instead introduce a protocol for 1-vs-n PSU. Such a functionality is quite similar to RPMT, which can be thought of as revealing only the cardinality \(|\{x^*\} \cup X|\). Since \(|\{x^*\} \cup X| = |X| + |\{x^*\} \setminus X|\) and the receiver knows \(|X|\), this is equivalent to revealing \(|\{x^*\} \setminus X|\) (either 0 or 1).
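The relationship between the two functionalities can be made concrete with a small sketch of their ideal input/output behavior. The function names `psu_1_vs_n`, `rpmt`, and `membership_bit_from_union_size` are hypothetical, and the code models only what each party learns, not any protocol.

```python
def psu_1_vs_n(x_star, X):
    """Ideal 1-vs-n PSU: returns (sender output, receiver output).

    The sender learns nothing; the receiver learns the union {x*} ∪ X.
    """
    return None, set(X) | {x_star}


def rpmt(x_star, X):
    """Ideal RPMT: the receiver learns only the bit [x* in X]."""
    return None, x_star in X


def membership_bit_from_union_size(union_size: int, n: int) -> bool:
    """Recover the RPMT bit from |{x*} ∪ X| when |X| = n is known.

    |{x*} ∪ X| = n + |{x*} \\ X|, so the union has size n exactly
    when x* is already in X.
    """
    return union_size == n


if __name__ == "__main__":
    X = {1, 2, 3}
    for x_star in (2, 7):
        _, union = psu_1_vs_n(x_star, X)
        bit = membership_bit_from_union_size(len(union), len(X))
        print(x_star, sorted(union), bit, rpmt(x_star, X)[1])
```

The cardinality alone carries exactly the RPMT bit, whereas 1-vs-n PSU additionally reveals \(x^*\) itself when \(x^* \not \in X\).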
The details of the 1-vs-n PSU protocol are given in Fig. 9. Now, using 1-vs-n PSU as a building block instead of RPMT, our full-fledged PSU protocol can be written as in Fig. 10.
The security proof of the full-fledged PSU protocol is essentially the same as that of the pre-optimization protocol. The security of the 1-vs-n PSU protocol is stated below:
Theorem 3
The construction of Fig. 9 securely implements functionality \(\mathcal {F}^{1,n}_{\textsf {psu}}\) in the semi-honest model, given the OPRF primitive defined in Fig. 4.
Proof
We exhibit simulators \(\mathsf {Sim}_{\mathcal {R}}\) and \(\mathsf {Sim}_{\mathcal {S}}\) for simulating corrupt \(\mathcal {R}\) and \(\mathcal {S}\) respectively, and argue the indistinguishability of the produced transcript from the real execution.
Corrupt Sender. \(\mathsf {Sim}_{\mathcal {S}}(x^*)\) simulates the view of corrupt \(\mathcal {S}\), which consists of \(\mathcal {S}\)’s randomness, input, output and received messages. \(\mathsf {Sim}_{\mathcal {S}}\) proceeds as follows. It first chooses \(q'\in _R \{0,1\}^\sigma \), calls OPRF simulator \(\mathsf {Sim}_{S_\mathsf{OPRF}} (x^*, q')\), and appends its output to the view.
\(\mathsf {Sim}_{\mathcal {S}}\) simulates Step 3 as follows. It generates a random \(s' \in _R \{0,1\}^\sigma \) and n random points \((x'_i, q'_i) \in _R \{0,1\}^\star \times \{0,1\}^\sigma \). \(\mathsf {Sim}_{\mathcal {S}}\) then interpolates the polynomial P over the points \(\{(h(x'_i), s' \oplus q'_i)\}_{i \in [n]}\) and appends its coefficients to the generated view.
We argue that the output of \(\mathsf {Sim}_{\mathcal {S}}\) is indistinguishable from the real execution. To this end, we proceed through a sequence of hybrid transcripts \(T_0, T_1, T_2\), where \(T_0\) is the real view of \(\mathcal {S}\) and \(T_2\) is the output of \(\mathsf {Sim}_{\mathcal {S}}\).
- Hybrid 1. Let \(T_1\) be the same as \(T_0\), except that the OPRF execution is replaced as follows. By the OPRF/BaRK-OPRF pseudorandomness guarantee and the indistinguishability of the output of \(\mathsf {Sim}_{S_\mathsf{OPRF}}\), we replace \(F(k,x^*)\) and \(F(k,x_i), \forall i \in [n],\) with \(q'\) and \(q'_i, \forall i \in [n]\), respectively. We note that if \(x^* = x_i\), then \(q'=q'_i\). It is easy to see that \(T_0\) and \(T_1\) are indistinguishable.
- Hybrid 2. Let \(T_2\) be the same as \(T_1\), except that the polynomial is a uniform polynomial of degree \(n-1\). Consider the following two cases:
  - \(x^* \not \in X\): Since all values \(q'_i\) are uniformly random from \(\mathcal {S}\)’s point of view, so are the values \(s \oplus q'_i\).
  - \(x^* = x_i\) (consequently, \(q'=q'_i\)): Since the other values \(q'_j\), \(j \in [n], j \ne i\), are uniformly random from \(\mathcal {S}\)’s point of view, we replace these \(s \oplus q'_j\) with random values. Then s is used only in the expression \(s\,\oplus \,q'_i\). Since s is uniform, \(s\,\oplus \,q'_i\) is also uniformly random from \(\mathcal {S}\)’s point of view, even though the adversary knows \(q'=q'_i\).
  In summary, the polynomial from the real execution can be replaced with a polynomial P over random points. \(T_1\) and \(T_2\) are indistinguishable.
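The masking step used in Hybrid 2 (a uniform s makes \(s \oplus q\) uniform even when q is known) can be checked exhaustively for a toy bit length. The parameter value `SIGMA = 3` and the name `masked_distribution` are assumptions made for illustration only.

```python
from collections import Counter

SIGMA = 3  # toy bit length so that all values can be enumerated


def masked_distribution(q: int) -> Counter:
    """Count how often each value s XOR q occurs as s ranges over all
    SIGMA-bit strings, i.e. the distribution of s XOR q for uniform s."""
    return Counter(s ^ q for s in range(2**SIGMA))


if __name__ == "__main__":
    for q in range(2**SIGMA):
        # For a fixed, known q the map s -> s XOR q is a bijection,
        # so every output value occurs exactly once.
        print(q, sorted(masked_distribution(q).items()))
```

Because XOR with a fixed q permutes the \(\sigma\)-bit strings, knowing q gives the adversary no information about the masked value when s is uniform and used nowhere else.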
Corrupt Receiver. \(\mathsf {Sim}_{\mathcal {R}}(x_1,...,x_n, out)\) simulates \(\mathcal {R}\)’s view, which includes \(\mathcal {R}\)’s randomness, input, output and received messages. \(\mathsf {Sim}_{\mathcal {R}}\) proceeds as follows.
First, if \(out = \{x_1, \ldots , x_n, x^*\}\) for some \(x^*\), then the simulator knows \(\mathcal {S}\)’s input \(x^*\) and can trivially simulate all of \(\mathcal {S}\)’s actions honestly. This case of simulation is clearly perfect.
Otherwise, \(\mathsf {Sim}_{\mathcal {R}}\) chooses a random \(k'\in _R \{0,1\}^\kappa \), calls OPRF simulator \(\mathsf {Sim}_{S_\mathsf{OPRF}} (\bot ,k')\), and appends its output to the view. It simulates a message \(s^*=s\) from \(\mathcal {S}\) in Step 4. Finally, to simulate Step 5, \(\mathsf {Sim}_{\mathcal {R}}\) runs simulator \(\mathsf {Sim}_\mathsf{OT}\) on input \((1,\bot )\) and appends the output of \(\mathsf {Sim}_\mathsf{OT}\) to the view.
The view generated by \(\mathsf {Sim}_{\mathcal {R}}\) is indistinguishable from a real view because of the indistinguishability of the transcripts of the underlying simulators.
© 2019 International Association for Cryptologic Research
Kolesnikov, V., Rosulek, M., Trieu, N., Wang, X. (2019). Scalable Private Set Union from Symmetric-Key Techniques. In: Galbraith, S., Moriai, S. (eds) Advances in Cryptology – ASIACRYPT 2019. ASIACRYPT 2019. Lecture Notes in Computer Science(), vol 11922. Springer, Cham. https://doi.org/10.1007/978-3-030-34621-8_23
DOI: https://doi.org/10.1007/978-3-030-34621-8_23
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-34620-1
Online ISBN: 978-3-030-34621-8
eBook Packages: Computer Science, Computer Science (R0)