Scalable Private Blocking Technique for Privacy-Preserving Record Linkage

  • Conference paper
Web Technologies and Applications (APWeb 2016)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9932)

Abstract

Record linkage is the process of matching records from multiple databases that refer to the same entities, and it has become increasingly important in many application areas, including business, government, and health. When data about people is collected in these areas, integrating it across organizations can raise privacy concerns. To prevent privacy breaches, records should ideally be linked privately, so that no information other than the matching result is leaked in the process; this technique is called privacy-preserving record linkage (PPRL). Scalability is one of the main challenges in PPRL, and many private blocking techniques have therefore been developed for it. They aim to reduce the number of record pairs compared in the matching process by removing obvious non-matching pairs without compromising privacy. However, they vary widely in their ability to balance the competing goals of accuracy, efficiency, and security. In this paper, we propose a novel private blocking approach for PPRL based on dynamic k-anonymous blocking and the Paillier cryptosystem. In dynamic k-anonymous blocking, our approach dynamically generates blocks that satisfy k-anonymity, with varying k, together with more accurate values to represent the blocks. We also propose a novel similarity measure that operates on numerical attributes and is combined with the Paillier cryptosystem to measure the similarity of two blocks securely, which provides a strong privacy guarantee that no information is revealed. Experiments conducted on a public dataset of voter registration records validate that our approach scales to large databases and maintains high blocking quality. We compare our method with other techniques and demonstrate gains in security and accuracy.
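The abstract does not spell out the secure similarity protocol, so the following is only a minimal sketch of the property of the Paillier cryptosystem that such a protocol can build on: ciphertexts can be added without decrypting them. The toy primes, the function names (`keygen`, `encrypt`, `decrypt`, `add`, `signed`) and the example values 42 and 37 are illustrative assumptions and are not taken from the paper.

```python
import math
import secrets

def keygen(p: int, q: int):
    """Build a toy Paillier key pair from two primes (far too small for real use)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    g = n + 1                                          # common simple choice of g
    # mu = L(g^lam mod n^2)^-1 mod n, where L(x) = (x - 1) // n
    mu = pow((pow(g, lam, n * n) - 1) // n, -1, n)
    return (n, g), (lam, mu)

def encrypt(pub, m: int) -> int:
    """Encrypt integer m; negative values are encoded modulo n."""
    n, g = pub
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1               # random r in [1, n-1]
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m % n, n2) * pow(r, n, n2)) % n2

def decrypt(pub, priv, c: int) -> int:
    """Recover the plaintext as a value in [0, n)."""
    n, _ = pub
    lam, mu = priv
    return ((pow(c, lam, n * n) - 1) // n * mu) % n

def add(pub, c1: int, c2: int) -> int:
    """Homomorphic addition: multiplying ciphertexts adds the plaintexts."""
    n, _ = pub
    return (c1 * c2) % (n * n)

def signed(pub, m: int) -> int:
    """Map a decrypted value in [0, n) back to a signed integer."""
    n, _ = pub
    return m - n if m > n // 2 else m

# Hypothetical scenario: two parties hold numeric block representatives a and b.
pub, priv = keygen(1009, 1013)          # assumption: toy primes for illustration
a, b = 42, 37
enc_a = encrypt(pub, a)                 # first party encrypts its value
enc_diff = add(pub, enc_a, encrypt(pub, -b))  # second party adds -b under encryption
print(signed(pub, decrypt(pub, priv, enc_diff)))  # 5
```

In this sketch the key holder still learns the difference itself; a protocol with the guarantees claimed in the abstract would need additional steps that the abstract does not describe.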

Acknowledgment

This work is supported by the National Basic Research 973 Program of China under Grant No. 2012CB316201 and the National Natural Science Foundation of China under Grant No. 61472070.

Corresponding author

Correspondence to Shumin Han.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Han, S., Shen, D., Nie, T., Kou, Y., Yu, G. (2016). Scalable Private Blocking Technique for Privacy-Preserving Record Linkage. In: Li, F., Shim, K., Zheng, K., Liu, G. (eds) Web Technologies and Applications. APWeb 2016. Lecture Notes in Computer Science, vol 9932. Springer, Cham. https://doi.org/10.1007/978-3-319-45817-5_16

  • DOI: https://doi.org/10.1007/978-3-319-45817-5_16

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-45816-8

  • Online ISBN: 978-3-319-45817-5

  • eBook Packages: Computer Science, Computer Science (R0)
