Abstract
We define On-Average KL-Privacy and present its properties and connections to differential privacy, generalization, and information-theoretic quantities including max-information and mutual information. The new definition significantly weakens differential privacy while preserving its minimal design features, such as composition over small groups and multiple queries, as well as closure under post-processing. Moreover, we show that On-Average KL-Privacy is equivalent to generalization for a large class of commonly used tools in statistics and machine learning that sample from Gibbs distributions, a class of distributions that arises naturally from the maximum entropy principle. In addition, a byproduct of our analysis yields a lower bound for generalization error in terms of mutual information, which reveals an interesting interplay with known upper bounds that use the same quantity.
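To fix notation for the proofs in the appendix (a hedged sketch; the symbols \(\mathcal{D}\), \(F\), and \(\gamma \) are illustrative, and the body of the paper gives the precise definitions): writing \(Z = (z_1,\dots ,z_n) \sim \mathcal{D}^n\) for the dataset and \([Z_{-i}, z_i']\) for \(Z\) with its \(i\)-th entry replaced by a fresh draw \(z_i' \sim \mathcal{D}\), \(\varepsilon \)-On-Average KL-Privacy requires, roughly,
\[
\mathbb {E}_{Z \sim \mathcal{D}^n,\, z_i' \sim \mathcal{D}}\;\mathrm {KL}\bigl (p_{\mathcal {A}(Z)} \,\big \|\, p_{\mathcal {A}([Z_{-i}, z_i'])}\bigr ) \;\le \; \varepsilon ,
\]
and the max-entropy mechanisms in question sample \(h\) from a Gibbs distribution \(p_{\mathcal {A}(Z)}(h) \propto \exp (-\gamma F(h, Z))\).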
Notes
- 1. We will formally define these quantities.
- 2. These assumptions are only for presentation simplicity; the notion of On-Average KL-Privacy naturally handles mixtures of densities and point masses.
References
Akaike, H.: Likelihood of a model and information criteria. J. Econometrics 16(1), 3–14 (1981)
Altun, Y., Smola, A.J.: Unifying divergence minimization and statistical inference via convex duality. In: Lugosi, G., Simon, H.U. (eds.) COLT 2006. LNCS (LNAI), vol. 4005, pp. 139–153. Springer, Heidelberg (2006)
Anderson, N.: “Anonymized” data really isn’t, and here’s why not (2009). http://arstechnica.com/tech-policy/2009/09/your-secrets-live-online-in-databases-of-ruin/
Barber, R.F., Duchi, J.C.: Privacy and statistical risk: formalisms and minimax bounds. arXiv preprint arXiv:1412.4451 (2014)
Bassily, R., Nissim, K., Smith, A., Steinke, T., Stemmer, U., Ullman, J.: Algorithmic stability for adaptive data analysis. arXiv preprint arXiv:1511.02513 (2015)
Berger, A.L., Pietra, V.J.D., Pietra, S.A.D.: A maximum entropy approach to natural language processing. Comput. Linguist. 22(1), 39–71 (1996)
Bousquet, O., Elisseeff, A.: Stability and generalization. J. Mach. Learn. Res. 2, 499–526 (2002)
Duncan, G.T., Elliot, M., Salazar-González, J.J.: Statistical Confidentiality: Principle and Practice. Springer, New York (2011)
Duncan, G.T., Fienberg, S.E., Krishnan, R., Padman, R., Roehrig, S.F.: Disclosure limitation methods and information loss for tabular data. In: Confidentiality, Disclosure and Data Access: Theory and Practical Applications for Statistical Agencies, pp. 135–166 (2001)
Dwork, C.: Differential privacy. In: Bugliesi, M., Preneel, B., Sassone, V., Wegener, I. (eds.) ICALP 2006. LNCS, vol. 4052, pp. 1–12. Springer, Heidelberg (2006)
Dwork, C., Feldman, V., Hardt, M., Pitassi, T., Reingold, O., Roth, A.: Generalization in adaptive data analysis and holdout reuse. In: Advances in Neural Information Processing Systems (NIPS 2015), pp. 2341–2349 (2015)
Dwork, C., Feldman, V., Hardt, M., Pitassi, T., Reingold, O., Roth, A.: Preserving statistical validity in adaptive data analysis. In: ACM on Symposium on Theory of Computing (STOC 2015), pp. 117–126. ACM (2015)
Dwork, C., Kenthapadi, K., McSherry, F., Mironov, I., Naor, M.: Our data, ourselves: privacy via distributed noise generation. In: Vaudenay, S. (ed.) EUROCRYPT 2006. LNCS, vol. 4004, pp. 486–503. Springer, Heidelberg (2006)
Dwork, C., McSherry, F., Nissim, K., Smith, A.: Calibrating noise to sensitivity in private data analysis. In: Halevi, S., Rabin, T. (eds.) TCC 2006. LNCS, vol. 3876, pp. 265–284. Springer, Heidelberg (2006)
Ebadi, H., Sands, D., Schneider, G.: Differential privacy: now it’s getting personal. In: ACM Symposium on Principles of Programming Languages, pp. 69–81. ACM (2015)
Fienberg, S.E., Rinaldo, A., Yang, X.: Differential privacy and the risk-utility tradeoff for multi-dimensional contingency tables. In: Domingo-Ferrer, J., Magkos, E. (eds.) PSD 2010. LNCS, vol. 6344, pp. 187–199. Springer, Heidelberg (2010)
Hall, R., Rinaldo, A., Wasserman, L.: Random differential privacy. arXiv preprint arXiv:1112.2680 (2011)
Hardt, M., Ullman, J.: Preventing false discovery in interactive data analysis is hard. In: IEEE Symposium on Foundations of Computer Science (FOCS 2014), pp. 454–463. IEEE (2014)
Hundepool, A., Domingo-Ferrer, J., Franconi, L., Giessing, S., Nordholt, E.S., Spicer, K., De Wolf, P.P.: Statistical Disclosure Control. Wiley (2012)
Jaynes, E.T.: Information theory and statistical mechanics. Phys. Rev. 106(4), 620 (1957)
Kearns, M., Ron, D.: Algorithmic stability and sanity-check bounds for leave-one-out cross-validation. Neural Comput. 11(6), 1427–1453 (1999)
Liu, Z., Wang, Y.X., Smola, A.: Fast differentially private matrix factorization. In: ACM Conference on Recommender Systems (RecSys 2015), pp. 171–178. ACM (2015)
McSherry, F., Talwar, K.: Mechanism design via differential privacy. In: IEEE Symposium on Foundations of Computer Science (FOCS 2007), pp. 94–103 (2007)
Mir, D.J.: Information-theoretic foundations of differential privacy. In: Garcia-Alfaro, J., Cuppens, F., Cuppens-Boulahia, N., Miri, A., Tawbi, N. (eds.) FPS 2012. LNCS, vol. 7743, pp. 374–381. Springer, Heidelberg (2013)
Mosteller, F., Tukey, J.W.: Data analysis, including statistics (1968)
Mukherjee, S., Niyogi, P., Poggio, T., Rifkin, R.: Learning theory: stability is sufficient for generalization and necessary and sufficient for consistency of empirical risk minimization. Adv. Comput. Math. 25(1–3), 161–193 (2006)
Narayanan, A., Shmatikov, V.: Robust de-anonymization of large sparse datasets. In: IEEE Symposium on Security and Privacy, pp. 111–125. IEEE, September 2008
Russo, D., Zou, J.: Controlling bias in adaptive data analysis using information theory. In: International Conference on Artificial Intelligence and Statistics (AISTATS 2016) (2016)
Shalev-Shwartz, S., Shamir, O., Srebro, N., Sridharan, K.: Learnability, stability and uniform convergence. J. Mach. Learn. Res. 11, 2635–2670 (2010)
Steinke, T., Ullman, J.: Interactive fingerprinting codes and the hardness of preventing false discovery. arXiv preprint arXiv:1410.1228 (2014)
Tishby, N., Pereira, F.C., Bialek, W.: The information bottleneck method. arXiv preprint arXiv:physics/0004057 (2000)
Uhler, C., Slavković, A., Fienberg, S.E.: Privacy-preserving data sharing for genome-wide association studies. J. Priv. Confidentiality 5(1), 137 (2013)
Van Erven, T., Harremoës, P.: Rényi divergence and Kullback-Leibler divergence. IEEE Trans. Inf. Theor. 60(7), 3797–3820 (2014)
Wang, Y.X., Fienberg, S.E., Smola, A.: Privacy for free: posterior sampling and stochastic gradient Monte Carlo. In: International Conference on Machine Learning (ICML 2015) (2015)
Wang, Y.X., Lei, J., Fienberg, S.E.: Learning with differential privacy: stability, learnability and the sufficiency and necessity of the ERM principle. J. Mach. Learn. Res. (to appear, 2016)
Wang, Y.X., Lei, J., Fienberg, S.E.: On-Average KL-privacy and its equivalence to generalization for max-entropy mechanisms (2016). Preprint: http://www.cs.cmu.edu/~yuxiangw/publications.html
Yau, N.: Lessons from improperly anonymized taxi logs (2014). http://flowingdata.com/2014/06/23/lessons-from-improperly-anonymized-taxi-logs/
Yu, F., Fienberg, S.E., Slavković, A.B., Uhler, C.: Scalable privacy-preserving data sharing methodology for genome-wide association studies. J. Biomed. Inform. 50, 133–141 (2014)
Zhou, S., Lafferty, J., Wasserman, L.: Compressed and privacy-sensitive sparse regression. IEEE Trans. Inf. Theor. 55(2), 846–866 (2009)
A Proofs
We now prove Theorem 4 and Lemma 14. Due to space limits, the proofs of Lemmas 6, 7, and 11 are given in the technical report [36].
Proof of Theorem 4. We start with a ghost-sample argument.
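As a hedged sketch of this step (assuming the Gibbs form \(p_{\mathcal {A}(Z)}(h) = \exp (-\gamma F(h,Z))/K_i\) and \(p_{\mathcal {A}([Z_{-i},z_i'])}(h) = \exp (-\gamma F(h,[Z_{-i},z_i']))/K_i'\), where the potential \(F\) and parameter \(\gamma \) are illustrative rather than the paper's verbatim notation), the KL divergence between the two Gibbs distributions expands as
\[
\mathrm {KL}\bigl (p_{\mathcal {A}(Z)}\,\big \|\,p_{\mathcal {A}([Z_{-i},z_i'])}\bigr )
= \gamma \,\mathbb {E}_{h\sim \mathcal {A}(Z)}\bigl [F(h,[Z_{-i},z_i']) - F(h,Z)\bigr ] + \log \frac{K_i'}{K_i},
\]
so that, in expectation over \(Z\) and the ghost sample \(z_i'\), the first term is the on-average generalization gap and the log-partition terms cancel, as argued next.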
The \(K_i\) and \(K_i'\) are the partition functions of \(p_{\mathcal {A}(Z)}(h)\) and \(p_{\mathcal {A}([Z_{-i},z_i'])}(h)\), respectively. Since \(z_i\) and \(z_i'\) are drawn from the same distribution, \(K_i\) and \(K_i'\) are identically distributed, so \( \mathbb {E}K_i - \mathbb {E}K_i' = 0\) (and likewise for their logarithms). The proof is completed by noting the non-negativity of On-Average KL-Privacy, which allows us to take absolute values without changing the equivalence. \(\square \)
Proof of Lemma 14. Denote \(p(\mathcal {A}(Z)) = p(h|Z)\), so that \(p(h,Z) = p(h|Z)\,p(Z)\). The marginal distribution of \(h\) is therefore \(p(h)=\int _Z p(h,Z)\, dZ = \mathbb {E}_{Z}\, p(\mathcal {A}(Z)) \). By definition,
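The chain of identities here can be sketched as follows (a hedged reconstruction in the notation above, with \(Z'\) denoting an independent copy of \(Z\); the paper's display is the authoritative statement):
\[
I(h;Z) \;=\; \mathbb {E}_{h,Z}\log \frac{p(h|Z)}{p(h)}
\;=\; \mathbb {E}_{Z}\,\mathrm {KL}\bigl (p(h|Z)\,\big \|\,\mathbb {E}_{Z'}\,p(h|Z')\bigr )
\;\le \; \mathbb {E}_{Z,Z'}\,\mathrm {KL}\bigl (p(h|Z)\,\big \|\,p(h|Z')\bigr ).
\]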
The inequality in the last line follows from Jensen’s inequality, since the KL divergence is convex in its second argument. \(\square \)
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Wang, Y.X., Lei, J., Fienberg, S.E. (2016). On-Average KL-Privacy and Its Equivalence to Generalization for Max-Entropy Mechanisms. In: Domingo-Ferrer, J., Pejić-Bach, M. (eds.) Privacy in Statistical Databases. PSD 2016. Lecture Notes in Computer Science, vol. 9867. Springer, Cham. https://doi.org/10.1007/978-3-319-45381-1_10
DOI: https://doi.org/10.1007/978-3-319-45381-1_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-45380-4
Online ISBN: 978-3-319-45381-1
eBook Packages: Computer Science, Computer Science (R0)