Abstract
Private record linkage is an active field of research aimed at linking data sets from two or more sources while preserving the privacy of the sensitive content they contain. With computation and communication efficiency as two other important requirements in such a process, much attention has been given to using Bloom filters for fast encoding of data records while maintaining the privacy of the records. A number of techniques that modify a typical Bloom filter have also appeared, addressing the need to harden them against known attacks. However, the field significantly lacks quantitative measures of the privacy level introduced by such techniques. In this work, we motivate and propose the generating-set amplification factor measure to bridge some of this gap. This privacy measure aims to capture the level of uncertainty that a hardening technique introduces between its output and the input used to create a Bloom filter. We provide algorithms to compute the measure and an empirical assessment of state-of-the-art Bloom filter hardening techniques with respect to it. Our assessment shows that current techniques may still retain many of the characteristics of the input, although attacks that exploit them are yet to appear.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Mortl, K., Dewri, R. (2022). Generating-Set Evaluation of Bloom Filter Hardening Techniques in Private Record Linkage. In: Badarla, V.R., Nepal, S., Shyamasundar, R.K. (eds) Information Systems Security. ICISS 2022. Lecture Notes in Computer Science, vol 13784. Springer, Cham. https://doi.org/10.1007/978-3-031-23690-7_3
DOI: https://doi.org/10.1007/978-3-031-23690-7_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-23689-1
Online ISBN: 978-3-031-23690-7
eBook Packages: Computer Science, Computer Science (R0)