Abstract
We consider the problem of developing a user-centric toolkit for anonymizing medical data that uses ε-differential privacy to measure disclosure risk. Our work will use a randomized algorithm, specifically sketches, to achieve differential privacy. Sketch-based randomization is a form of multiplicative perturbation that has been shown to work effectively on sparse, high-dimensional data. However, a differential privacy model for sketches has yet to be defined. The goal is to study whether this approach yields any improvement over previous results in preserving the privacy of data. How well the anonymized data retains its utility will then be evaluated by the usefulness of the published synthetic data for a number of common statistical learning algorithms.
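As a rough illustration of sketch-based randomization, the following minimal Python sketch projects a record onto a few random ±1 vectors and estimates a dot product from the resulting sketches. It is a generic construction written for illustration, not the method of this paper: the function names `make_sketch` and `estimate_dot`, the sketch length `k`, and the seeding scheme are all assumptions, and the signs are drawn fully independently for simplicity.

```python
import random


def make_sketch(x, k, seed):
    """Project record x onto k random +/-1 vectors (hypothetical helper).

    Each sketch component is sum_i x[i] * r[i], where every r[i] is drawn
    uniformly from {-1, +1}. The seed fixes the random vectors so that two
    records sketched with the same seed can be compared.
    """
    rng = random.Random(seed)
    sketch = []
    for _ in range(k):
        signs = [rng.choice((-1, 1)) for _ in x]
        sketch.append(sum(s * v for s, v in zip(signs, x)))
    return sketch


def estimate_dot(sketch_a, sketch_b):
    """Estimate <a, b> from two sketches built with the same seed and k.

    The average of component-wise products is an unbiased estimate of the
    dot product, because E[r[i] * r[j]] is 1 when i == j and 0 otherwise.
    """
    return sum(a * b for a, b in zip(sketch_a, sketch_b)) / len(sketch_a)


# Usage: a sparse record with a single non-zero attribute.
record = [0, 0, 3, 0]
sketch = make_sketch(record, k=8, seed=1)
squared_norm = estimate_dot(sketch, sketch)
```

A smaller `k` hides more of the original record but makes estimates noisier. This projection alone carries no ε-differential privacy guarantee; defining such a model for sketches is the open question the abstract raises.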
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Lee, J. (2010). On Sketch Based Anonymization That Satisfies Differential Privacy Model. In: Farzindar, A., Kešelj, V. (eds) Advances in Artificial Intelligence. Canadian AI 2010. Lecture Notes in Computer Science, vol 6085. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-13059-5_55
DOI: https://doi.org/10.1007/978-3-642-13059-5_55
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-13058-8
Online ISBN: 978-3-642-13059-5
eBook Packages: Computer Science, Computer Science (R0)