
Lower Bounds on the Error of Query Sets Under the Differentially-Private Matrix Mechanism

Theory of Computing Systems

Abstract

A common goal of privacy research is to release synthetic data that satisfies a formal privacy guarantee and can be used by an analyst in place of the original data. To achieve reasonable accuracy, a synthetic data set must be tuned to support a specified set of queries accurately, sacrificing fidelity for other queries. This work considers methods for producing synthetic data under differential privacy and investigates what makes a set of queries “easy” or “hard” to answer. We consider this issue in the particular case of answering sets of linear counting queries using the matrix mechanism (Li et al. 2010), a recent differentially-private mechanism that can reduce error by adding complex correlated noise adapted to a specified workload. Our main result is a novel lower bound on the minimum total error required to simultaneously release answers to a set of workload queries when using the matrix mechanism. The bound reveals that the hardness of a query workload is related to the spectral properties of the workload when it is represented in matrix form. Under (ε, δ)-differential privacy, we prove that this bound is tight for many common workloads such as the set of all predicate queries and the set of all k-way marginals. Our empirical study also indicates this bound is close-to-tight on workloads consisting of random interval queries or random marginals.
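To make the link between a workload's spectrum and its hardness concrete, the following is a minimal sketch in Python with NumPy. It represents a small workload as a matrix, computes its singular values, and compares a spectral quantity against the error of one simple strategy. The abstract does not state the bound's formula; the sketch assumes it takes the form (sum of singular values)² / n, scaled by a constant that depends only on ε and δ and is omitted here. The function names and the choice of the all-range workload are hypothetical, chosen for illustration.

```python
import numpy as np

def all_range_workload(n):
    """Rows are indicator vectors of every contiguous range [i, j] over n cells."""
    rows = []
    for i in range(n):
        for j in range(i, n):
            row = np.zeros(n)
            row[i:j + 1] = 1.0
            rows.append(row)
    return np.array(rows)

def spectral_quantity(W):
    """(sum of singular values)^2 / n.

    ASSUMPTION: this is the assumed form of the lower bound, up to a
    privacy-dependent constant; the abstract itself does not give the formula.
    """
    n = W.shape[1]
    singular_values = np.linalg.svd(W, compute_uv=False)
    return singular_values.sum() ** 2 / n

def identity_strategy_error(W):
    """Total squared error, in the same units, when every cell count receives
    independent unit-variance noise and the workload is answered from the
    noisy counts: this equals the squared Frobenius norm of W."""
    return np.linalg.norm(W, "fro") ** 2

W = all_range_workload(4)
print("singular values:", np.linalg.svd(W, compute_uv=False))
print("spectral quantity:", spectral_quantity(W))
print("identity-strategy error:", identity_strategy_error(W))
```

For the identity workload the two quantities coincide, which is one small instance where a bound of this form would be met exactly. For the range workload the spectral quantity sits below the identity-strategy error, leaving room for a better strategy.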


Notes

  1. This can also be done in an interactive (or online) setting, in which the workload queries are not known in advance, but our focus is the non-interactive (or batch) setting.

  2. Although there is no benefit in sensitivity when the workload W is used as the strategy, there are in fact some workloads for which this special instance of the matrix mechanism has lower error than the Gaussian mechanism. This happens because the matrix mechanism is able to combine related query answers from the strategy to form more accurate answers to the workload queries.

  3. The approaches in [20, 30] were originally proposed in the context of ε-differential privacy, but their behavior is similar under (ε, δ)-differential privacy.
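The observation in Note 2 can be checked numerically on a small case. The sketch below, in Python with NumPy, assumes the standard error expression for the matrix mechanism with Gaussian noise: the squared L2 sensitivity of the strategy A times the squared Frobenius norm of W A⁺, up to a constant depending only on ε and δ that is omitted. The function names and the example workload are hypothetical.

```python
import numpy as np

def l2_sensitivity(A):
    """Largest L2 norm of any column of A."""
    return float(np.sqrt((A ** 2).sum(axis=0).max()))

def matrix_mechanism_error(W, A):
    """Scaled total squared error when W is answered from noisy answers to A.

    Assumes A supports W (the row space of A contains the rows of W), so that
    W @ pinv(A) reconstructs the workload answers from the strategy answers.
    """
    reconstruction = W @ np.linalg.pinv(A)
    return l2_sensitivity(A) ** 2 * np.linalg.norm(reconstruction, "fro") ** 2

def gaussian_mechanism_error(W):
    """Scaled total squared error when each workload query gets independent
    noise calibrated to the sensitivity of W."""
    return l2_sensitivity(W) ** 2 * W.shape[0]

# Hypothetical example workload: all contiguous ranges over 4 cells
# (10 queries, but only rank 4).
W = np.array([[1.0 if i <= k <= j else 0.0 for k in range(4)]
              for i in range(4) for j in range(i, 4)])

print("Gaussian mechanism:", gaussian_mechanism_error(W))
print("matrix mechanism, strategy = W:", matrix_mechanism_error(W, W))
```

Both mechanisms use the same sensitivity here, so the gap comes entirely from reconstruction. W W⁺ is a projection whose squared Frobenius norm equals the rank of W, which is smaller than the number of queries whenever the queries are linearly dependent.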

References

  1. Ács, G., Castelluccia, C., Chen, R.: Differentially private histogram publishing through lossy compression. In: ICDM, pp. 1–10 (2012)

  2. Barak, B., Chaudhuri, K., Dwork, C., Kale, S., McSherry, F., Talwar, K.: Privacy, accuracy, and consistency too: A holistic solution to contingency table release. In: PODS (2007)

  3. Ben-Israel, A., Greville, T.: Generalized inverses: Theory and applications, vol. 15. Springer (2003)

  4. Bhaskara, A., Dadush, D., Krishnaswamy, R., Talwar, K.: Unconditional differentially private mechanisms for linear queries. In: STOC, pp. 1269–1284, New York, NY, USA (2012)

  5. Blum, A., Ligett, K., Roth, A.: A learning theory approach to non-interactive database privacy. In: STOC, pp. 609–618 (2008)

  6. Cormode, G., Procopiuc, M., Shen, E., Srivastava, D., Yu, T.: Differentially private spatial decompositions. In: ICDE, pp. 20–31 (2012)

  7. Ding, B., Winslett, M., Han, J., Li, Z.: Differentially private data cubes: optimizing noise sources and consistency. In: SIGMOD, pp. 217–228 (2011)

  8. Dinur, I., Nissim, K.: Revealing information while preserving privacy. In: PODS, pp. 202–210 (2003)

  9. Dwork, C., Kenthapadi, K., McSherry, F., Mironov, I., Naor, M.: Our data, ourselves: Privacy via distributed noise generation. In: EUROCRYPT, pp. 486–503 (2006)

  10. Dwork, C., McSherry, F., Nissim, K., Smith, A.: Calibrating noise to sensitivity in private data analysis. In: TCC, pp. 265–284 (2006)

  11. Dwork, C., Naor, M., Reingold, O., Rothblum, G., Vadhan, S.: On the complexity of differentially private data release: efficient algorithms and hardness results. In: STOC, pp. 381–390 (2009)

  12. Dwork, C., Rothblum, G.N., Vadhan, S.P.: Boosting and differential privacy. In: FOCS, pp. 51–60 (2010)

  13. Fawaz, N., Muthukrishnan, S., Nikolov, A.: Nearly optimal private convolution. In: ESA, pp. 445–456 (2013)

  14. Fulton, W.: Eigenvalues, invariant factors, highest weights, and Schubert calculus. Bulletin of the AMS 37(3), 209–250 (2000)

  15. Gray, R.M.: Toeplitz and circulant matrices: A review. Now Pub (2006)

  16. Gupta, A., Roth, A., Ullman, J.: Iterative constructions and private data release. In: TCC, pp. 339–356 (2012)

  17. Hardt, M., Ligett, K., McSherry, F.: A simple and practical algorithm for differentially private data release. In: NIPS, pp. 2348–2356 (2012)

  18. Hardt, M., Rothblum, G.: A multiplicative weights mechanism for privacy-preserving data analysis. In: FOCS, pp. 61–70 (2010)

  19. Hardt, M., Talwar, K.: On the geometry of differential privacy. In: STOC, pp. 705–714 (2010)

  20. Hay, M., Rastogi, V., Miklau, G., Suciu, D.: Boosting the accuracy of differentially-private histograms through consistency. PVLDB 3(1-2), 1021–1032 (2010)

  21. Kasiviswanathan, S., Rudelson, M., Smith, A., Ullman, J.: The price of privately releasing contingency tables and the spectra of random matrices with correlated rows. In: STOC, pp. 775–784 (2010)

  22. Li, C., Hay, M., Rastogi, V., Miklau, G., McGregor, A.: Optimizing linear counting queries under differential privacy. In: PODS, pp. 123–134 (2010)

  23. Li, C., Miklau, G.: An adaptive mechanism for accurate query answering under differential privacy. PVLDB 5(6), 514–525 (2012)

  24. Li, Y.D., Zhang, Z., Winslett, M., Yang, Y.: Compressive mechanism: utilizing sparse representation in differential privacy. In: WPES, pp. 177–182 (2011)

  25. McSherry, F., Mironov, I.: Differentially Private Recommender Systems: Building Privacy into the Netflix Prize Contenders. In: SIGKDD, pp. 627–636 (2009)

  26. Nikolov, A., Talwar, K., Zhang, L.: The geometry of differential privacy: the sparse and approximate cases. In: Proceedings of the 45th Annual ACM Symposium on Theory of Computing, STOC ’13, pp. 351–360 (2013)

  27. Nissim, K., Raskhodnikova, S., Smith, A.: Smooth sensitivity and sampling in private data analysis. In: STOC, pp. 75–84 (2007)

  28. Rastogi, V., Suciu, D., Hong, S.: The boundary between privacy and utility in data publishing. In: VLDB, pp. 531–542 (2007)

  29. Roth, A., Roughgarden, T.: Interactive privacy via the median mechanism. In: STOC, pp. 765–774 (2010)

  30. Xiao, X., Wang, G., Gehrke, J.: Differential privacy via wavelet transforms. In: ICDE, pp. 225–236 (2010)

  31. Xiao, Y., Xiong, L., Yuan, C.: Differentially private data release through multidimensional partitioning. In: SDM, pp. 150–168 (2010)

  32. Xu, J., Zhang, Z., Xiao, X., Yang, Y., Yu, G., Winslett, M.: Differentially private histogram publication. The VLDB Journal, 1–26 (2013)

  33. Yaroslavtsev, G., Cormode, G., Procopiuc, C. M., Srivastava, D.: Accurate and efficient private release of datacubes and contingency tables. In: ICDE (2013)

  34. Yuan, G., Zhang, Z., Winslett, M., Xiao, X., Yang, Y., Hao, Z.: Low-rank mechanism: Optimizing batch queries under differential privacy. PVLDB 5(11), 1136–1147 (2012)


Acknowledgments

We appreciate the thoughtful comments of the anonymous reviewers. Li was supported by NSF CNS-1012748. Miklau was partially supported by NSF CNS-1012748, NSF CNS-0964094, NSF CNS-1409143, and the European Research Council under the Webdam grant, No. 226513.

Author information

Corresponding author

Correspondence to Chao Li.


About this article


Cite this article

Li, C., Miklau, G. Lower Bounds on the Error of Query Sets Under the Differentially-Private Matrix Mechanism. Theory Comput Syst 57, 1159–1201 (2015). https://doi.org/10.1007/s00224-015-9610-z


  • DOI: https://doi.org/10.1007/s00224-015-9610-z
