Efficient Privacy Preserving Record Linkage at Scale using Apache Spark | IEEE Conference Publication | IEEE Xplore


Abstract:

Soundex has been used for over a century to approximately match records based on their phonetic footprint. In this paper, we examine a series of techniques a practitioner might employ to increase the algorithm’s matching capabilities when using Soundex for privacy-preserving record linkage, along with a protocol based on Apache Spark that is suitable for big data processing. We provide a detailed empirical assessment of the matching quality and time performance of the proposed alternatives, showing that we achieve both precision and recall above 95% for large datasets in a few seconds, without using any privacy-preserving blocking technique.
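
For readers unfamiliar with the phonetic footprint mentioned in the abstract, the following is a minimal sketch of the classic American Soundex encoding. It is an illustration only: it is not taken from the paper, it does not include the enhancements or the Spark-based protocol the authors evaluate, and the function name `soundex` and the handling of non-English letters (treated like vowels) are assumptions made for this example.

```python
def soundex(name: str) -> str:
    """Classic American Soundex: first letter plus three digits (illustrative sketch)."""
    # Consonant groups that share a digit; vowels, H, W and Y have no code.
    codes = {}
    for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                           ("L", "4"), ("MN", "5"), ("R", "6")):
        for ch in letters:
            codes[ch] = digit

    # Keep alphabetic characters only; assumption: anything else is ignored.
    chars = [c for c in name.upper() if c.isalpha()]
    if not chars:
        return ""

    first = chars[0]
    result = [first]
    prev = codes.get(first, "")  # the first letter's code can suppress a following duplicate
    for ch in chars[1:]:
        if ch in "HW":
            continue  # H and W do not separate consonants that share a code
        code = codes.get(ch, "")
        if code and code != prev:
            result.append(code)
        prev = code  # an uncoded letter (vowel, Y) resets prev, so a repeated code is kept

    # Pad with zeros and truncate to the usual four characters.
    return ("".join(result) + "000")[:4]


# Names that sound alike collapse to the same code, which is what makes
# Soundex usable as an approximate matching key.
print(soundex("Robert"), soundex("Rupert"))   # R163 R163
print(soundex("Ashcraft"))                    # A261
```

Two records are treated as candidate matches when their codes are equal, so spelling variants such as "Robert" and "Rupert" are linked while the comparison itself only ever sees the four-character code.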
Date of Conference: 17-20 December 2022
Date Added to IEEE Xplore: 26 January 2023
Conference Location: Osaka, Japan
