Article

The price of privacy and the limits of LP decoding

Authors:

Frank McSherry,

Kunal Talwar

STOC '07: Proceedings of the thirty-ninth annual ACM symposium on Theory of computing

Pages 85 - 94

https://doi.org/10.1145/1250790.1250804

Published: 11 June 2007

Abstract

This work is at the intersection of two lines of research. One line, initiated by Dinur and Nissim, investigates the price, in accuracy, of protecting privacy in a statistical database. The second, growing from an extensive literature on compressed sensing (see in particular the work of Donoho and collaborators [4,7,13,11]) and explicitly connected to error-correcting codes by Candès and Tao ([4]; see also [5,3]), is in the use of linear programming for error correction.

Our principal result is the discovery of a sharp threshold ρ* ≈ 0.239, so that if ρ < ρ* and A is a random m × n encoding matrix of independently chosen standard Gaussians, where m = O(n), then with overwhelming probability over choice of A, for all x ∈ Rⁿ, LP decoding corrects ⌊ρm⌋ arbitrary errors in the encoding Ax, while decoding can be made to fail if the error rate exceeds ρ*. Our bound resolves an open question of Candès, Rudelson, Tao, and Vershynin [3] and (oddly, but explicably) refutes empirical conclusions of Donoho [11] and Candès et al. [3]. By scaling and rounding we can easily transform these results to obtain polynomial-time decodable random linear codes with polynomial-sized alphabets tolerating any ρ < ρ* ≈ 0.239 fraction of arbitrary errors.
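As a rough illustration of what LP decoding means here, the sketch below recovers x from a corrupted encoding y = Ax + e by minimizing the l1 norm of the residual y - Ax', written as a linear program. It is a minimal sketch and not the authors' code: the function names `lp_decode` and `demo`, the sizes n = 20 and m = 80, the 10% error rate, and the use of SciPy's `linprog` with the HiGHS solver are all assumptions chosen for illustration. The error rate is kept well below ρ* because the threshold is an asymptotic statement.

```python
import numpy as np
from scipy.optimize import linprog


def lp_decode(A, y):
    """Return a minimizer of ||y - A x||_1, solved as a linear program."""
    m, n = A.shape
    # Variables are [x (n free entries), t (m nonnegative slacks)].
    # Minimizing sum(t) subject to |y - A x| <= t gives the l1 fit.
    c = np.concatenate([np.zeros(n), np.ones(m)])
    I = np.eye(m)
    A_ub = np.block([[A, -I], [-A, -I]])   # A x - t <= y and -A x - t <= -y
    b_ub = np.concatenate([y, -y])
    bounds = [(None, None)] * n + [(0, None)] * m
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:n]


def demo(n=20, m=80, rho=0.1, seed=0):
    """Encode a random x, corrupt floor(rho * m) entries, and decode.

    Returns the largest coordinate-wise error of the decoded vector.
    All parameter values are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((m, n))        # random Gaussian encoding matrix
    x = rng.standard_normal(n)
    y = A @ x
    k = int(np.floor(rho * m))
    bad = rng.choice(m, size=k, replace=False)
    y[bad] += 100.0 * rng.standard_normal(k)   # large, arbitrary-looking errors
    x_hat = lp_decode(A, y)
    return float(np.max(np.abs(x_hat - x)))


if __name__ == "__main__":
    print("max coordinate error:", demo())
```

Raising `rho` in `demo` toward and past 0.239 is one way to observe decoding start to fail, although at such small dimensions the transition will not be sharp.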

In the context of privacy-preserving datamining, our results say that any privacy mechanism, interactive or non-interactive, providing reasonably accurate answers to a 0.761 fraction of randomly generated weighted subset sum queries, and arbitrary answers on the remaining 0.239 fraction, is blatantly non-private.
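The privacy consequence can be sketched as a reconstruction attack: ask random Gaussian-weighted subset sum queries about a 0/1 database, fit the answers by l1 minimization, and round. The code below is a hedged toy version under assumptions that are not taken from the paper: the names `l1_fit`, `noisy_mechanism`, and `attack`, the database size n = 20, the 160 queries, the 10% fraction of arbitrary answers, and the bounded noise of 0.05 on the remaining answers are all hypothetical choices.

```python
import numpy as np
from scipy.optimize import linprog


def l1_fit(A, y):
    """Return a minimizer of ||y - A x||_1 via a linear program."""
    m, n = A.shape
    c = np.r_[np.zeros(n), np.ones(m)]     # minimize the sum of slacks t
    I = np.eye(m)
    res = linprog(
        c,
        A_ub=np.block([[A, -I], [-A, -I]]),   # encodes |y - A x| <= t
        b_ub=np.r_[y, -y],
        bounds=[(None, None)] * n + [(0, None)] * m,
        method="highs",
    )
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:n]


def noisy_mechanism(A, d, rng, bad_fraction=0.1, noise=0.05):
    """Hypothetical mechanism: small noise on most answers, junk on the rest."""
    m = A.shape[0]
    answers = A @ d + rng.uniform(-noise, noise, size=m)
    k = int(bad_fraction * m)
    bad = rng.choice(m, size=k, replace=False)
    answers[bad] = rng.uniform(-50.0, 50.0, size=k)   # arbitrary answers
    return answers


def attack(n=20, m=160, seed=1):
    """Return the number of database bits the attacker gets wrong."""
    rng = np.random.default_rng(seed)
    d = rng.integers(0, 2, size=n)          # secret 0/1 database
    A = rng.standard_normal((m, n))         # random weighted subset sum queries
    y = noisy_mechanism(A, d, rng)
    guess = (l1_fit(A, y) > 0.5).astype(int)   # round the l1 fit to bits
    return int(np.sum(guess != d))


if __name__ == "__main__":
    print("bits reconstructed incorrectly:", attack())
```

In this toy setting the attacker never sees which answers were arbitrary; the l1 objective simply tolerates them, which is the sense in which accurate answers on most queries are enough to expose the data.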

References

[1]

D. Achlioptas. Database-friendly random projections: Johnson-Lindenstrauss with binary coins. J. Comput. Syst. Sci., 66(4):671--687, 2003.

[2]

A. Blum, C. Dwork, F. McSherry, and K. Nissim. Practical privacy: the SuLQ framework. In C. Li, editor, PODS, pages 128--138. ACM, 2005.

[3]

E.J. Candès, M. Rudelson, T. Tao, and R. Vershynin. Error correction via linear programming. In FOCS, pages 295--308. IEEE Computer Society, 2005.

[4]

E.J. Candès and T. Tao. Decoding by linear programming. IEEE Transactions on Information Theory, 51(12):4203--4215, 2005.

[5]

E.J. Candès and T. Tao. Error correction via linear programming, 2005.

[6]

E.J. Candès and P. Randall. Highly robust error correction by convex programming. Submitted, 2006.

[7]

S. Chen, D. Donoho, and M. Saunders. Atomic decomposition by basis pursuit. SIAM J. Sci Comp, 48(1):33--61, 1999.

[8]

I. Dinur, C. Dwork, and K. Nissim. Privacy in public databases: A foundational approach, 2005. Manuscript.

[9]

I. Dinur and K. Nissim. Revealing information while preserving privacy. In PODS, pages 202--210. ACM, 2003.

[10]

D. Donoho. For most large underdetermined systems of linear equations, the minimal l1-norm near-solution approximates the sparsest near-solution. Communications on Pure and Applied Mathematics, 59(7):907--934, 2006.

[11]

D. Donoho. For most large underdetermined systems of linear equations, the minimal l1-norm solution is also the sparsest solution. Communications on Pure and Applied Mathematics, 59(6):797--829, 2006.

[12]

D. Donoho. High-dimensional centrally symmetric polytopes with neighborliness proportional to dimension. Discrete and Computational Geometry, 35(4):617--652, 2006.

[13]

D. Donoho and X. Huo. Uncertainty principles and ideal atomic decomposition. IEEE Transactions on Information Theory, 48:2845--2862, 2001.

[14]

D. Donoho and I.M. Johnstone. Minimax estimation via wavelet shrinkage. Annals of Statistics, 26(3):879--921, 1998.

[15]

D. Donoho and J. Tanner. Thresholds for the recovery of sparse solutions via l1 minimization. In Proceedings of the Conference on Information Sciences and Systems, 2006.

[16]

D.L. Donoho. Compressed sensing. IEEE Transactions on Information Theory, 52(4):1289--1306, 2006.

[17]

C. Dwork, F. McSherry, K. Nissim, and A. Smith. Calibrating noise to sensitivity in private data analysis. In S. Halevi and T. Rabin, editors, TCC, volume 3876 of Lecture Notes in Computer Science, pages 265--284. Springer, 2006.

[18]

C. Dwork and K. Nissim. Privacy-preserving datamining on vertically partitioned databases. In M.K. Franklin, editor, CRYPTO, volume 3152 of Lecture Notes in Computer Science, pages 528--544. Springer, 2004.

[19]

J. Feldman. Decoding error-correcting codes via linear programming. Master's thesis, Massachusetts Institute of Technology, 2003.

[20]

J. Feldman and D. Karger. Decoding turbo-like codes via linear programming. Journal of Computer and System Sciences, 68(4):733--752, 2004.

[21]

J. Feldman, T. Malkin, C. Stein, R. Servedio, and M. Wainwright. LP decoding corrects a constant fraction of errors (an expanded version). In ISIT, pages 68--. IEEE, 2004.

[22]

J. Feldman and C. Stein. LP decoding achieves capacity. In SODA, pages 460--469. SIAM, 2005.

[23]

L. Goldstein. l¹ bounds in normal approximation. Annals of Probability, 2006. To Appear.

[24]

M. Ledoux. The Concentration of Measure Phenomenon, volume 89 of Mathematical Surveys and Monographs. American Mathematical Society, 2001.

[25]

M. Talagrand. Concentration of measure and isoperimetric inequalities in product space. Publ. Math. I.H.E.S., 81:73--205, 1995.

Published In

STOC '07: Proceedings of the thirty-ninth annual ACM symposium on Theory of computing

June 2007

734 pages

ISBN:9781595936318

DOI:10.1145/1250790

General Chair: David Johnson (AT&T Labs - Research)
Program Chair: Uriel Feige (Microsoft Research and Weizmann Institute)

Copyright © 2007 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Article

Conference

STOC07

STOC07: Symposium on Theory of Computing

June 11 - 13, 2007

San Diego, California, USA

Acceptance Rates

Overall Acceptance Rate 1,469 of 4,586 submissions, 32%
