DOI: 10.1145/3151759.3151788
research-article

Secure similarity joins using fully homomorphic encryption

Published: 04 December 2017

Abstract

Similarity joins are important database operations that can identify pairs of roughly similar records. Due to their many applications (e.g., duplicate elimination and plagiarism detection), a number of algorithms have been created to enhance similarity joins, especially in terms of performance. However, in some cases, the privacy of the data being joined also becomes an important aspect to consider, as leaking sensitive information can result in grave consequences for individuals, enterprises and governmental organizations. We propose a protocol for secure execution of similarity joins that is based on fully homomorphic cryptosystems, which are resistant to a number of attacks and provide flexibility to calculate the similarity between encrypted records. We also consider the adaptation of filter techniques to improve the efficiency of the protocol by reducing the number of record pairs that are compared. In addition, we exploit modern hardware to parallelize the solution and evaluate the performance of the proposal using real datasets.
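The abstract does not say which similarity function or filter the protocol uses, so the following is only a plaintext sketch under stated assumptions: records are token sets encoded as 0/1 vectors, similarity is Jaccard, and the filter is a simple length filter. It illustrates why homomorphic addition and multiplication are enough to obtain the overlap of two records, and how a filter cuts the number of pairs that need the expensive comparison. All function names are hypothetical, and no encryption is performed here.

```python
from itertools import product


def to_bits(tokens, vocab):
    """Encode a token set as a 0/1 vector over a fixed, shared vocabulary."""
    return [1 if w in tokens else 0 for w in vocab]


def overlap(a, b):
    """|a ∩ b| computed with additions and multiplications only.

    These are the two operations a fully homomorphic scheme evaluates on
    ciphertexts; here they run on plain integers (assumption: no encryption).
    """
    return sum(x * y for x, y in zip(a, b))


def jaccard_at_least(a, b, num, den):
    """Check Jaccard(a, b) >= num/den without any division.

    o / (|a| + |b| - o) >= num/den  <=>  den * o >= num * (|a| + |b| - o)
    In an encrypted setting the final comparison would need the key holder
    or a comparison circuit; that step is outside this sketch.
    """
    o = overlap(a, b)
    union = sum(a) + sum(b) - o
    return den * o >= num * union


def length_filter(len_a, len_b, num, den):
    """Necessary condition: Jaccard >= num/den implies min_len/max_len >= num/den."""
    lo, hi = sorted((len_a, len_b))
    return den * lo >= num * hi


def similarity_join(left, right, vocab, num, den):
    """Return (matching index pairs, number of full comparisons performed)."""
    lbits = [to_bits(r, vocab) for r in left]
    rbits = [to_bits(r, vocab) for r in right]
    matches, compared = [], 0
    for (i, a), (j, b) in product(enumerate(lbits), enumerate(rbits)):
        if not length_filter(sum(a), sum(b), num, den):
            continue  # pruned: this pair cannot reach the threshold
        compared += 1
        if jaccard_at_least(a, b, num, den):
            matches.append((i, j))
    return matches, compared


vocab = ["a", "b", "c", "d", "e"]
left = [{"a", "b", "c"}, {"e"}]
right = [{"a", "b", "c", "d"}, {"d", "e"}]
result = similarity_join(left, right, vocab, 3, 4)  # threshold 3/4
```

With a threshold of 3/4, only one of the four candidate pairs passes the length filter in this toy input, so only that pair needs the overlap computation.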

Cited By

  • (2022) A Privacy-Preserving Blockchain Platform for a Data Marketplace. Distributed Ledger Technologies: Research and Practice 2:1 (1-16). DOI: 10.1145/3573894. Online publication date: 6-Dec-2022
  • (2022) Privacy-Preserving Top-k Spatio-Textual Similarity Join. 2022 IEEE International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom) (718-726). DOI: 10.1109/TrustCom56396.2022.00102. Online publication date: Dec-2022

    Published In

    iiWAS '17: Proceedings of the 19th International Conference on Information Integration and Web-based Applications & Services
    December 2017
    609 pages
    ISBN:9781450352994
    DOI:10.1145/3151759
    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. fully homomorphic encryption
    2. privacy
    3. security
    4. similarity joins

    Qualifiers

    • Research-article

    Conference

    iiWAS2017
