Identifying individual vulnerability based on public data | IEEE Conference Publication | IEEE Xplore

Abstract:

Companies and government agencies frequently own data sets containing personal information about clients, survey respondents, or users of a product. Sometimes these organizations are required or wish to release anonymized versions of this information to the public. Prior to releasing these data, they use established privacy preservation methods such as binning, data perturbation, and data suppression to maintain the anonymity of clients, customers, or survey participants. However, existing work has shown that common privacy-preserving measures fail when anonymized data are combined with data from online social networks, social media sites, and data aggregation sites. This paper introduces a methodology for determining the vulnerability of individuals in a pre-released data set to reidentification using public data. As part of this methodology, we propose novel metrics to quantify the amount of information that can be gained from combining pre-released data with publicly available online data. We then investigate how to utilize our metrics to identify individuals in the data set who may be particularly vulnerable to this form of data combination. We demonstrate the effectiveness of our methodology on a real-world data set using public data from both social networking and data aggregation sites.
Date of Conference: 10-12 July 2013
Date Added to IEEE Xplore: 12 September 2013
Electronic ISBN: 978-1-4673-5839-2
Conference Location: Tarragona, Spain
