Identifying individual vulnerability based on public data

Abstract:

Companies and government agencies frequently own data sets containing personal information about clients, survey respondents, or users of a product. Sometimes these organizations are required or wish to release anonymized versions of this information to the public. Prior to releasing these data, they use established privacy preservation methods such as binning, data perturbation, and data suppression to maintain the anonymity of clients, customers, or survey participants. However, existing work has shown that common privacy-preserving measures fail when anonymized data are combined with data from online social networks, social media sites, and data aggregation sites. This paper introduces a methodology for determining the vulnerability of individuals in a pre-released data set to reidentification using public data. As part of this methodology, we propose novel metrics to quantify the amount of information that can be gained from combining pre-released data with publicly available online data. We then investigate how to utilize our metrics to identify individuals in the data set who may be particularly vulnerable to this form of data combination. We demonstrate the effectiveness of our methodology on a real-world data set using public data from both social networking and data aggregation sites.

Published in: 2013 Eleventh Annual Conference on Privacy, Security and Trust

Date of Conference: 10-12 July 2013

Date Added to IEEE Xplore: 12 September 2013

Electronic ISBN: 978-1-4673-5839-2

DOI: 10.1109/PST.2013.6596045

Conference Location: Tarragona, Spain