Synonyms
Related Concepts
Definition
Data linkage consists of methods for matching duplicate records within or across files using nonunique identifiers such as first name, last name, date of birth, and address, together with other characteristics such as income and sex.
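As a minimal sketch of the idea in this definition, the Python below compares records from two files field by field and declares a link when enough nonunique identifiers agree. The `Person` type, the field list, the `normalize` helper, and the `min_agreements` threshold are all illustrative assumptions, not part of the entry; practical systems typically weight each field and tolerate typographical variation rather than counting exact agreements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    # Hypothetical record layout built from the identifiers named above.
    first_name: str
    last_name: str
    date_of_birth: str  # e.g. "1980-01-02"
    address: str


FIELDS = ("first_name", "last_name", "date_of_birth", "address")


def normalize(value: str) -> str:
    """Lower-case and collapse whitespace so trivial differences do not block a match."""
    return " ".join(value.lower().split())


def agreement_pattern(a: Person, b: Person) -> tuple:
    """Return one boolean per field: True where the two records agree."""
    return tuple(normalize(getattr(a, f)) == normalize(getattr(b, f)) for f in FIELDS)


def link(file_a, file_b, min_agreements=3):
    """Return index pairs (i, j) whose records agree on at least min_agreements fields.

    The threshold is an arbitrary illustrative choice, not a calibrated decision rule.
    """
    pairs = []
    for i, a in enumerate(file_a):
        for j, b in enumerate(file_b):
            if sum(agreement_pattern(a, b)) >= min_agreements:
                pairs.append((i, j))
    return pairs
```

Comparing every pair is quadratic in the file sizes, so this form is only suitable for small illustrations.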
Background
Many analysts prefer microdata to the tables of aggregates found in publications. To meet analytic needs, organizations create public-use microdata that may allow the approximate reproduction of a few analyses from the original, confidential microdata. To maintain privacy, organizations must ensure that the released microdata do not allow the reidentification of individuals with records in the microdata. To reduce reidentification risk, organizations may mask the data with methods such as additive noise or create models from which it is possible to draw synthetic data that meet various analytic restraints. Although rigorous...
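The additive-noise masking mentioned above can be illustrated with a short Python sketch. It perturbs a single numeric variable with zero-mean Gaussian noise whose variance is a chosen fraction of the variable's own variance. The function name, the `noise_fraction` parameter, and the choice of Gaussian noise scaled this way are assumptions made for illustration, not the specific procedure of any cited work.

```python
import random
import statistics


def mask_with_additive_noise(values, noise_fraction=0.1, seed=None):
    """Return a masked copy of values with zero-mean Gaussian noise added.

    The noise variance is noise_fraction times the population variance of
    values (an illustrative choice). A fixed seed makes the output repeatable.
    """
    rng = random.Random(seed)
    noise_sd = (noise_fraction ** 0.5) * statistics.pstdev(values)
    return [v + rng.gauss(0.0, noise_sd) for v in values]
```

Larger values of `noise_fraction` distort the released values more, which tends to lower reidentification risk at the cost of analytic accuracy.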
Recommended Reading
Dempster AP, Laird NM, Rubin DB (1977) Maximum likelihood from incomplete data via the EM algorithm. J R Stat Soc B 39:1–38
Dwork C, McSherry F, Talwar K (2007) Differentially private marginals release with mutual consistency and error independent sample size. UNECE Worksession on Statistical Data Confidentiality, Manchester, UK at http://www.unece.org/stats/documents/2007/12/confidentiality/wp.19.e.pdf
Fellegi IP, Sunter AB (1969) A theory for record linkage. J Am Stat Assoc 64:1183–1210
Kim JJ, Winkler WE (1995) Masking microdata files. American Statistical Association, Proceedings of the section on survey research methods. pp 114–119 (http://www.amstat.org/sections/SRMS/Proceedings/papers/1995_017.pdf, longer report http://www.census.gov/srd/papers/pdf/rr97-3.pdf)
Lambert D (1993) Measures of disclosure risk and harm. J Off Stat 9:313–331 (http://www.jos.nu/Articles/abstract.asp?article=92313)
Winkler WE (1994) Advanced methods for record linkage. American Statistical Association, Proceedings of the section on survey research methods. pp 467–472 (longer version http://www.census.gov/srd/papers/pdf/rr94-5.pdf)
Winkler WE (2008) General discrete-data modeling methods for producing synthetic data with reduced re-identification risk that preserve analytic properties. IAB workshop on confidentiality and disclosure, Nuremberg, Germany, November 20–21, 2008 (downloadable from workshop site, http://fdz.iab.de/en/FDZ_Events/SDC-Workshop.aspx)
© 2011 Springer Science+Business Media, LLC
About this entry
Cite this entry
Winkler, W.E. (2011). Data Linkage. In: van Tilborg, H.C.A., Jajodia, S. (eds) Encyclopedia of Cryptography and Security. Springer, Boston, MA. https://doi.org/10.1007/978-1-4419-5906-5_750
DOI: https://doi.org/10.1007/978-1-4419-5906-5_750
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4419-5905-8
Online ISBN: 978-1-4419-5906-5
eBook Packages: Computer Science; Reference Module Computer Science and Engineering