DOI:10.1145/2213836.2213910
tutorial

Differential privacy in data publication and analysis

Published: 20 May 2012

Abstract

Data privacy has been an important research topic in the security, theory, and database communities over the last few decades. However, many existing studies make restrictive assumptions about the adversary's prior knowledge: they preserve individuals' privacy only when the adversary has rather limited background information about the sensitive data, or uses only certain kinds of attacks. Recently, differential privacy has emerged as a new paradigm for privacy protection with very conservative assumptions about the adversary's prior knowledge. Since its proposal, differential privacy has been gaining attention in many fields of computer science, and it is considered among the most promising paradigms for privacy-preserving data publication and analysis. In this tutorial, we will motivate its introduction as a replacement for other paradigms, present the basics of the differential privacy model from a database perspective, describe the state of the art in differential privacy research, explain the limitations and shortcomings of differential privacy, and discuss open problems for future research.

Published In

SIGMOD '12: Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data
May 2012
886 pages
ISBN:9781450312479
DOI:10.1145/2213836

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. data analysis
  2. differential privacy
  3. privacy-preserving data publication
  4. query processing

Qualifiers

  • Tutorial

Conference

SIGMOD/PODS '12

Acceptance Rates

SIGMOD '12 Paper Acceptance Rate 48 of 289 submissions, 17%;
Overall Acceptance Rate 785 of 4,003 submissions, 20%
