Abstract
The Wasserstein distance, known in computer science as the Earth Mover’s Distance (EMD), is a metric between two probability distributions. EMD is often used to compare images and documents, and is central to privacy models such as t-closeness. In this work, we show that, for one-dimensional discrete probability distributions, computing EMD reduces to computing the cardinality of the intersection of two sets. We then use a private matching scheme to build a privacy-preserving protocol for EMD: two parties can compute the EMD between their privately owned documents without revealing them to each other. We demonstrate our proposal by implementing a privacy-preserving reverse image search in which images are kept encrypted at an external server.
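To make the reduction concrete, the following is a minimal sketch of one way a one-dimensional EMD can be expressed through a set-intersection cardinality. It is an illustration under stated assumptions, not necessarily the construction used in the paper: histograms hold non-negative integer counts with equal total mass, bins are one unit apart, and the helper names (`emd_1d`, `cdf_set`, `emd_via_intersection`) are hypothetical. It relies on the known fact that, in one dimension, EMD equals the sum of absolute differences between the cumulative distributions, together with the identity |a - b| = a + b - 2 min(a, b).

```python
from itertools import accumulate


def emd_1d(p, q):
    """Direct 1-D EMD: sum of absolute differences of cumulative counts.

    Assumes integer histograms of equal total mass and unit bin spacing.
    """
    cp, cq = accumulate(p), accumulate(q)
    return sum(abs(a - b) for a, b in zip(cp, cq))


def cdf_set(hist):
    """Encode cumulative counts as a set (assumed encoding, for illustration).

    Bin i with cumulative count c contributes the elements (i, 1) .. (i, c),
    so two encoded sets share exactly min(c_p, c_q) elements at bin i.
    """
    return {(i, k) for i, c in enumerate(accumulate(hist)) for k in range(1, c + 1)}


def emd_via_intersection(p, q):
    """EMD recovered from set sizes and the intersection cardinality only."""
    a, b = cdf_set(p), cdf_set(q)
    return len(a) + len(b) - 2 * len(a & b)


p = [3, 0, 1]
q = [1, 2, 1]
print(emd_1d(p, q), emd_via_intersection(p, q))
```

In this sketch the result is in units of "count times bin width"; dividing by the total mass would give the distance between the normalized distributions. In a private setting, the plain `len(a & b)` would be replaced by a private set-intersection cardinality protocol, which this sketch does not attempt to model.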
Acknowledgements and Disclaimer
We acknowledge support from: European Commission (projects H2020-871042 “SoBigData++” and H2020-101006879 “MobiDataLab”), Government of Catalonia (ICREA Acadèmia Prize to the second author and grant 2017 SGR 705) and Spanish Government (project RTI2018-095094-B-C21). The authors are with the UNESCO Chair in Data Privacy, but their views here are not necessarily shared by UNESCO.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Blanco-Justicia, A., Domingo-Ferrer, J. (2020). Privacy-Preserving Computation of the Earth Mover’s Distance. In: Susilo, W., Deng, R.H., Guo, F., Li, Y., Intan, R. (eds) Information Security. ISC 2020. Lecture Notes in Computer Science, vol 12472. Springer, Cham. https://doi.org/10.1007/978-3-030-62974-8_23
DOI: https://doi.org/10.1007/978-3-030-62974-8_23
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-62973-1
Online ISBN: 978-3-030-62974-8
eBook Packages: Computer Science, Computer Science (R0)