Differentially Private Non-parametric Machine Learning as a Service

Dandekar, Ashish; Basu, Debabrota; Bressan, Stéphane

doi:10.1007/978-3-030-27615-7_14

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11706)

Included in the following conference series:

International Conference on Database and Expert Systems Applications

Abstract

Machine learning algorithms create models from training data for the purpose of estimation, prediction and classification. While releasing parametric machine learning models requires the release of the parameters of the model, releasing non-parametric machine learning models requires the release of the training dataset along with the parameters. The release of the training dataset creates a risk of breach of privacy. An alternative to the release of the training dataset is the presentation of the non-parametric model as a service. Still, the non-parametric model as a service may leak information about the training dataset.

We study how to provide differential privacy guarantees for non-parametric models as a service. We show how to apply the perturbation to the model functions of histogram, kernel density estimator, kernel SVM and Gaussian process regression in order to provide \((\epsilon , \delta )\)-differential privacy. We empirically evaluate the trade-off between the privacy guarantee and the error incurred for each of these non-parametric machine learning algorithms on benchmarks and real-world datasets.
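As an illustration of what perturbing a model function can look like, the following minimal sketch adds a Gaussian process sample path to a Gaussian kernel density estimate, in the style of the functional mechanism of Hall, Rinaldo and Wasserman. It is not taken from this chapter: the function names (`gaussian_kernel`, `kde`, `private_kde`), the sensitivity bound \(\sqrt{2}/(n(2\pi h^2)^{d/2})\), the constant \(c(\delta) = \sqrt{2\log(2/\delta)}\), and the choice to evaluate the released function only on a fixed batch of query points are all assumptions made for the example.

```python
import numpy as np

def gaussian_kernel(a, b, h):
    """K(a, b) = exp(-||a - b||^2 / (2 h^2)) for every pair of rows of a and b."""
    sq_dist = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dist / (2.0 * h ** 2))

def kde(train, queries, h):
    """Non-private Gaussian kernel density estimate at the query points."""
    n, d = train.shape
    norm = n * (2.0 * np.pi * h ** 2) ** (d / 2.0)
    return gaussian_kernel(queries, train, h).sum(axis=1) / norm

def private_kde(train, queries, h, eps, delta, rng):
    """KDE plus a scaled Gaussian process sample path (hypothetical helper)."""
    n, d = train.shape
    # Assumed RKHS-norm sensitivity when one training point is replaced.
    sensitivity = np.sqrt(2.0) / (n * (2.0 * np.pi * h ** 2) ** (d / 2.0))
    # Assumed calibration constant for (eps, delta)-differential privacy.
    c_delta = np.sqrt(2.0 * np.log(2.0 / delta))
    # Sample path of a zero-mean Gaussian process with covariance K,
    # evaluated at the query points; the jitter keeps the matrix well conditioned.
    cov = gaussian_kernel(queries, queries, h) + 1e-10 * np.eye(len(queries))
    path = rng.multivariate_normal(np.zeros(len(queries)), cov)
    return kde(train, queries, h) + (sensitivity * c_delta / eps) * path
```

In a service setting the same sample path would have to answer every query consistently. This sketch sidesteps that by drawing the noise once for a fixed batch of query points, so it should be read as a simplified picture rather than a deployable mechanism.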

Our contribution is twofold. First, we show that functional perturbation is not only pragmatic for releasing machine learning models as a service but also yields higher effectiveness than output perturbation mechanisms for specified privacy parameters. Second, we present practical steps to perturb the model functions of the histogram, kernel density estimator, kernel SVM and Gaussian process regression, and evaluate them on a real-world dataset as well as a selection of benchmarks.

Notes

1. Model function refers to the mapping from input to output that is learned by the corresponding machine learning algorithm.
2. https://cloud.google.com/ml-engine/
3. https://azure.microsoft.com/en-us/services/machine-learning-studio/
4. https://aws.amazon.com/machine-learning/
5. https://www.ibm.com/cloud/

Acknowledgement

This project is supported by the National Research Foundation, Singapore Prime Minister’s Office under its Corporate Laboratory@University Scheme between National University of Singapore and Singapore Telecommunications Ltd.

Author information

Authors and Affiliations

School of Computing, National University of Singapore, 13, Computing Drive, Singapore, 117417, Singapore
Ashish Dandekar, Debabrota Basu & Stéphane Bressan

Corresponding author

Correspondence to Ashish Dandekar.

Editor information

Editors and Affiliations

Clausthal University of Technology, Clausthal-Zellerfeld, Germany
Sven Hartmann
Johannes Kepler University of Linz, Linz, Austria
Josef Küng
The University of Texas at Arlington, Arlington, TX, USA
Sharma Chakravarthy
Johannes Kepler University of Linz, Linz, Austria
Gabriele Anderst-Kotsis
Software Competence Center Hagenberg, Hagenberg im Mühlkreis, Austria
A Min Tjoa
Johannes Kepler University of Linz, Linz, Austria
Ismail Khalil

Cite this paper

Dandekar, A., Basu, D., Bressan, S. (2019). Differentially Private Non-parametric Machine Learning as a Service. In: Hartmann, S., Küng, J., Chakravarthy, S., Anderst-Kotsis, G., Tjoa, A., Khalil, I. (eds) Database and Expert Systems Applications. DEXA 2019. Lecture Notes in Computer Science, vol 11706. Springer, Cham. https://doi.org/10.1007/978-3-030-27615-7_14

DOI: https://doi.org/10.1007/978-3-030-27615-7_14
Published: 03 August 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-27614-0
Online ISBN: 978-3-030-27615-7
eBook Packages: Computer Science, Computer Science (R0)
