Abstract
Machine learning algorithms create models from training data for the purpose of estimation, prediction and classification. While releasing parametric machine learning models requires the release of the parameters of the model, releasing non-parametric machine learning models requires the release of the training dataset along with the parameters. The release of the training dataset creates a risk of breach of privacy. An alternative to the release of the training dataset is the presentation of the non-parametric model as a service. Still, the non-parametric model as a service may leak information about the training dataset.
We study how to provide differential privacy guarantees for non-parametric models as a service. We show how to perturb the model functions of the histogram, kernel density estimator, kernel SVM and Gaussian process regression in order to provide \((\epsilon , \delta )\)-differential privacy. We empirically evaluate the trade-off between the privacy guarantee and the error incurred for each of these non-parametric machine learning algorithms on benchmarks and real-world datasets.
Our contribution is twofold. First, we show that functional perturbation is not only pragmatic for releasing machine learning models as a service but also more effective than output perturbation mechanisms for specified privacy parameters. Second, we show practical steps to perturb the model functions of the histogram, kernel density estimator, kernel SVM and Gaussian process regression, and we evaluate them on a real-world dataset as well as a selection of benchmarks.
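As a rough illustration of what an \((\epsilon , \delta )\)-differentially private release of a histogram can look like, the sketch below adds Gaussian noise to bin counts using the classical Gaussian mechanism from the differential privacy literature. It is not the authors' functional perturbation procedure, which may differ in its noise calibration and in what it perturbs. The function names (`gaussian_sigma`, `private_histogram`), the choice of data-independent bin edges, the add-or-remove-one-record neighbouring relation, and the parameter values are assumptions made for this example only.

```python
import math
import numpy as np


def gaussian_sigma(epsilon, delta, l2_sensitivity):
    """Noise scale of the classical Gaussian mechanism.

    Uses sigma = sensitivity * sqrt(2 ln(1.25 / delta)) / epsilon,
    a calibration that is stated for 0 < epsilon < 1.
    """
    if not (0 < epsilon < 1):
        raise ValueError("this calibration assumes 0 < epsilon < 1")
    if not (0 < delta < 1):
        raise ValueError("delta must lie in (0, 1)")
    return l2_sensitivity * math.sqrt(2 * math.log(1.25 / delta)) / epsilon


def private_histogram(data, bin_edges, epsilon, delta, rng=None):
    """Return noisy bin counts and the noise scale that was used.

    Assumption: bin_edges are fixed in advance and do not depend on the data.
    Assumption: neighbouring datasets differ by adding or removing one record,
    so at most one count changes by 1 and the L2 sensitivity is 1.
    """
    rng = np.random.default_rng() if rng is None else rng
    counts, _ = np.histogram(data, bins=bin_edges)
    sigma = gaussian_sigma(epsilon, delta, l2_sensitivity=1.0)
    noisy_counts = counts + rng.normal(0.0, sigma, size=counts.shape)
    return noisy_counts, sigma


# Hypothetical usage on synthetic data.
rng = np.random.default_rng(0)
data = rng.normal(size=1000)
edges = np.linspace(-4.0, 4.0, 17)  # 16 bins chosen without looking at the data
noisy_counts, sigma = private_histogram(data, edges, epsilon=0.5, delta=1e-5, rng=rng)
```

The noisy counts may be negative or non-integer; any clipping or rounding applied afterwards is post-processing and does not weaken the privacy guarantee.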
Notes
- 1. Model function refers to the mapping from input to output that is learned by the corresponding machine learning algorithm.
Acknowledgement
This project is supported by the National Research Foundation, Singapore Prime Minister’s Office under its Corporate Laboratory@University Scheme between National University of Singapore and Singapore Telecommunications Ltd.
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Dandekar, A., Basu, D., Bressan, S. (2019). Differentially Private Non-parametric Machine Learning as a Service. In: Hartmann, S., Küng, J., Chakravarthy, S., Anderst-Kotsis, G., Tjoa, A., Khalil, I. (eds) Database and Expert Systems Applications. DEXA 2019. Lecture Notes in Computer Science, vol 11706. Springer, Cham. https://doi.org/10.1007/978-3-030-27615-7_14
DOI: https://doi.org/10.1007/978-3-030-27615-7_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-27614-0
Online ISBN: 978-3-030-27615-7
eBook Packages: Computer Science (R0)