
Generating-Set Evaluation of Bloom Filter Hardening Techniques in Private Record Linkage

  • Conference paper
Information Systems Security (ICISS 2022)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13784)


Abstract

Private record linkage is an active field of research that aims to link data sets from two or more sources while preserving the privacy of the sensitive content they contain. With computation and communication efficiency as two other important requirements in such a process, much attention has been given to using Bloom filters for fast encoding of data records while maintaining the privacy of those records. A number of techniques that modify a typical Bloom filter have also appeared, addressing the need to harden Bloom filters against known attacks. However, the field significantly lacks quantitative measures of the privacy level introduced by such techniques. In this work, we motivate and propose the generating-set amplification factor measure to help bridge this gap. This privacy measure aims to capture the level of uncertainty that a hardening technique introduces between its output and the input used to create a Bloom filter. We provide algorithms to compute the measure, along with an empirical assessment of state-of-the-art Bloom filter hardening techniques with respect to it. Our assessment shows that current techniques may still retain many of the characteristics of the input, although attacks that exploit them are yet to appear.
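
For readers unfamiliar with the encoding the abstract refers to, the following is a minimal sketch of a basic (unhardened) Bloom filter encoding of a string field, assuming padded bigrams and a double-hashing scheme. The function names (`qgrams`, `bloom_encode`, `dice`), the parameters `m`, `k`, and `q`, and the choice of SHA-1 and SHA-256 are illustrative assumptions, not details taken from the paper. It does not implement any hardening technique or the proposed measure.

```python
import hashlib


def qgrams(value, q=2):
    """Split a lower-cased, padded string into its set of overlapping q-grams."""
    padded = "_" * (q - 1) + value.lower() + "_" * (q - 1)
    return {padded[i:i + q] for i in range(len(padded) - q + 1)}


def bloom_encode(value, m=64, k=4, q=2):
    """Return the set of bit positions (0..m-1) set to 1 for `value`.

    Assumption: unkeyed hashes are used for brevity; a real deployment
    would typically use keyed hashes shared between the data owners.
    """
    bits = set()
    for gram in qgrams(value, q):
        data = gram.encode("utf-8")
        h1 = int.from_bytes(hashlib.sha1(data).digest(), "big")
        h2 = int.from_bytes(hashlib.sha256(data).digest(), "big")
        # Double hashing: the i-th position is (h1 + i * h2) mod m.
        for i in range(k):
            bits.add((h1 + i * h2) % m)
    return bits


def dice(a, b):
    """Dice coefficient between two sets of bit positions."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


if __name__ == "__main__":
    smith = bloom_encode("smith")
    smyth = bloom_encode("smyth")
    print(sorted(smith))
    print(round(dice(smith, smyth), 3))
```

Because similar strings share q-grams, their filters share bit positions, which is what permits approximate matching on the encodings and also what an attacker may exploit.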



Author information


Corresponding author

Correspondence to Rinku Dewri.


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Mortl, K., Dewri, R. (2022). Generating-Set Evaluation of Bloom Filter Hardening Techniques in Private Record Linkage. In: Badarla, V.R., Nepal, S., Shyamasundar, R.K. (eds) Information Systems Security. ICISS 2022. Lecture Notes in Computer Science, vol 13784. Springer, Cham. https://doi.org/10.1007/978-3-031-23690-7_3


  • DOI: https://doi.org/10.1007/978-3-031-23690-7_3


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-23689-1

  • Online ISBN: 978-3-031-23690-7

  • eBook Packages: Computer Science, Computer Science (R0)
