Differentially Private Non-parametric Machine Learning as a Service

  • Conference paper
Database and Expert Systems Applications (DEXA 2019)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11706)

Abstract

Machine learning algorithms create models from training data for the purpose of estimation, prediction and classification. While releasing parametric machine learning models requires the release of the parameters of the model, releasing non-parametric machine learning models requires the release of the training dataset along with the parameters. The release of the training dataset creates a risk of breach of privacy. An alternative to the release of the training dataset is the presentation of the non-parametric model as a service. Still, the non-parametric model as a service may leak information about the training dataset.
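The difference between the two kinds of release can be illustrated with a minimal sketch. The function names `linear_predict` and `kde_predict` are hypothetical and chosen only for illustration; a linear model and a Gaussian kernel density estimator stand in for the parametric and non-parametric cases.

```python
import numpy as np

def linear_predict(weights, bias, x):
    """Parametric model: a prediction needs only the learned parameters."""
    return float(np.dot(weights, x) + bias)

def kde_predict(train, bandwidth, x):
    """Non-parametric model: Gaussian kernel density estimate at point x.

    `train` has shape (n, d). Every training point appears in the sum, so
    whoever evaluates the model must hold the training dataset as well as
    the bandwidth parameter.
    """
    train = np.asarray(train, dtype=float)
    n, d = train.shape
    sq_dist = np.sum((train - x) ** 2, axis=1)
    norm = n * (2.0 * np.pi * bandwidth ** 2) ** (d / 2.0)
    return float(np.sum(np.exp(-sq_dist / (2.0 * bandwidth ** 2))) / norm)
```

Offering the model as a service means that only the output of a function such as `kde_predict` leaves the server, while `train` stays private.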

We study how to provide differential privacy guarantees for non-parametric models as a service. We show how to apply perturbation to the model functions of the histogram, kernel density estimator, kernel SVM and Gaussian process regression in order to provide \((\epsilon, \delta)\)-differential privacy. We empirically evaluate the trade-off between the privacy guarantee and the error incurred for each of these non-parametric machine learning algorithms on benchmarks and real-world datasets.
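As a rough illustration of what perturbing a model function can look like, the sketch below adds a scaled Gaussian process sample path to a Gaussian kernel density estimate, in the style of the functional mechanism of Hall, Rinaldo and Wasserman (2013). It assumes neighbouring datasets differ by replacing one record, uses the RKHS sensitivity bound \(\sqrt{2} / (n (2\pi h^2)^{d/2})\) and the constant \(c(\delta) = \sqrt{2 \log(2/\delta)}\) from that work, and answers a finite batch of queries with a single draw. The names `gaussian_kernel` and `private_kde` are hypothetical, and the calibration used in the paper itself may differ.

```python
import numpy as np

def gaussian_kernel(a, b, bandwidth):
    """Gram matrix with entries exp(-||a_i - b_j||^2 / (2 h^2))."""
    sq = (np.sum(a ** 2, axis=1)[:, None]
          + np.sum(b ** 2, axis=1)[None, :]
          - 2.0 * a @ b.T)
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * bandwidth ** 2))

def private_kde(train, queries, bandwidth, epsilon, delta, rng):
    """Perturbed KDE values at `queries` (shape (m, d)); sketch only."""
    n, d = train.shape
    norm = n * (2.0 * np.pi * bandwidth ** 2) ** (d / 2.0)
    # Non-private kernel density estimate at the query points.
    density = gaussian_kernel(queries, train, bandwidth).sum(axis=1) / norm
    # Assumed RKHS-norm sensitivity when one record is replaced.
    sensitivity = np.sqrt(2.0) / norm
    c_delta = np.sqrt(2.0 * np.log(2.0 / delta))
    # One Gaussian process sample path restricted to the query points,
    # with the same kernel as covariance; jitter keeps it well conditioned.
    cov = gaussian_kernel(queries, queries, bandwidth)
    cov += 1e-10 * np.eye(len(queries))
    path = rng.multivariate_normal(np.zeros(len(queries)), cov)
    return density + (c_delta * sensitivity / epsilon) * path

rng = np.random.default_rng(0)
train = rng.normal(size=(200, 1))
queries = np.linspace(-2.0, 2.0, 5).reshape(-1, 1)
noisy = private_kde(train, queries, 0.5, 1.0, 1e-5, rng)
```

The noise is correlated across query points because it comes from one sample path. A service built this way would have to answer repeated queries from the same path rather than draw fresh noise each time, which this batch sketch does not handle.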

Our contribution is twofold. First, we show that functional perturbation is not only pragmatic for releasing machine learning models as a service but also yields higher effectiveness than output perturbation mechanisms for specified privacy parameters. Second, we show practical steps to perturb the model functions of the histogram, kernel density estimator, kernel SVM and Gaussian process regression, and we evaluate them on a real-world dataset as well as a selection of benchmarks.

Notes

  1. Model function refers to the mapping from input to output that is learned by the corresponding machine learning algorithm.

  2. https://cloud.google.com/ml-engine/.

  3. https://azure.microsoft.com/en-us/services/machine-learning-studio/.

  4. https://aws.amazon.com/machine-learning/.

  5. https://www.ibm.com/cloud/.

Acknowledgement

This project is supported by the National Research Foundation, Singapore Prime Minister’s Office under its Corporate Laboratory@University Scheme between National University of Singapore and Singapore Telecommunications Ltd.

Author information

Corresponding author

Correspondence to Ashish Dandekar.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Dandekar, A., Basu, D., Bressan, S. (2019). Differentially Private Non-parametric Machine Learning as a Service. In: Hartmann, S., Küng, J., Chakravarthy, S., Anderst-Kotsis, G., Tjoa, A., Khalil, I. (eds) Database and Expert Systems Applications. DEXA 2019. Lecture Notes in Computer Science, vol 11706. Springer, Cham. https://doi.org/10.1007/978-3-030-27615-7_14

  • DOI: https://doi.org/10.1007/978-3-030-27615-7_14

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-27614-0

  • Online ISBN: 978-3-030-27615-7

  • eBook Packages: Computer Science, Computer Science (R0)
