Abstract
This paper introduces ‘guessing anonymity,’ a definition of privacy for noise perturbation methods that captures the difficulty of linking an identity to a sanitized record using publicly available information. Importantly, this definition leads to analytical expressions that bound data privacy as a function of the noise perturbation parameters. Using these bounds, we formulate optimization problems that describe the feasible tradeoffs between data distortion and privacy, without exhaustively searching the noise parameter space. This work addresses an important shortcoming of noise perturbation methods by providing them with an intuitive definition of privacy, analogous to that used in k-anonymity, together with an analytical means of selecting parameters to achieve a desired level of privacy. At the same time, our work retains the appealing aspects of noise perturbation methods that have made them popular both in practice and as a subject of academic research.
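The linkage scenario underlying guessing anonymity can be illustrated with a small sketch. This is not the paper's formulation; it is an illustrative assumption in which numeric quasi-identifiers are perturbed with additive Gaussian noise, and an adversary links each sanitized record back to the originals by Euclidean distance, guessing candidates in order of proximity. The "guess number" of a record is then the rank at which its true original would be guessed; all names and parameters below are assumptions for illustration.

```python
# Sketch: additive Gaussian noise perturbation and an empirical guessing measure.
import numpy as np

rng = np.random.default_rng(0)

# Original records: rows are individuals, columns are numeric quasi-identifiers.
n, d = 50, 3
X = rng.normal(size=(n, d))

# Noise perturbation: larger sigma means more distortion and more privacy.
sigma = 1.0
Y = X + rng.normal(scale=sigma, size=X.shape)

# Linkage attack: for each sanitized record, rank the original records by
# distance. The guess number of record i is the rank of its own original
# among all candidates (1 = identified on the first guess).
dists = np.linalg.norm(Y[:, None, :] - X[None, :, :], axis=2)  # (n, n)
own = dists[np.arange(n), np.arange(n)]                        # true-pair distances
ranks = (dists < own[:, None]).sum(axis=1) + 1

print("mean guess number:", ranks.mean())
```

Sweeping `sigma` in a sketch like this traces the distortion/privacy tradeoff empirically; the paper's contribution is to bound such quantities analytically, so the noise parameters can be chosen without this kind of exhaustive search.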
© 2009 Springer-Verlag Berlin Heidelberg
Rachlin, Y., Probst, K., Ghani, R. (2009). Maximizing Privacy under Data Distortion Constraints in Noise Perturbation Methods. In: Bonchi, F., Ferrari, E., Jiang, W., Malin, B. (eds) Privacy, Security, and Trust in KDD. PInKDD 2008. Lecture Notes in Computer Science, vol 5456. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-01718-6_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-01717-9
Online ISBN: 978-3-642-01718-6