
Privacy-Preserving Distributed Machine Learning Based on Secret Sharing

  • Conference paper
Information and Communications Security (ICICS 2019)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 11999)


Abstract

Machine learning has been widely applied in practice, for example to disease diagnosis and target detection. A good model commonly relies on massive training data collected from different sources, but the collected data might expose sensitive information. To address this problem, researchers have proposed many excellent methods that combine machine learning with privacy-protection technologies, such as secure multiparty computation (MPC), homomorphic encryption (HE), and differential privacy. Meanwhile, other researchers have proposed distributed machine learning, which allows clients to store their data locally while training a model collaboratively. The first kind of method focuses on security, but its performance and accuracy remain to be improved; the second provides higher accuracy and better performance but weaker security: for instance, an adversary can launch membership inference attacks from the plaintext gradient updates.

In this paper, we combine secret sharing with distributed machine learning to achieve reliable performance, accuracy, and a high level of security. We then design, implement, and evaluate a practical system to jointly learn an accurate model under semi-honest security and under servers-only malicious security, respectively. Our experiments show that our protocols also achieve the best overall performance.


Notes

  1. MNIST database, http://yann.lecun.com/exdb/mnist/. Accessed: 2017-09-24.

  2. https://mortendahl.github.io/2017/04/17/private-deep-learning-with-mpc/.

  3. https://github.com/tensorflow/tensorflow/releases/tag/v1.13.1.

  4. We only implement the basic secure aggregation with no dropouts.

References

  1. Shamir, A.: How to share a secret. Commun. ACM 22(11), 612–613 (1979). https://doi.org/10.1145/359168.359176


  2. Lindell, Y., Pinkas, B.: Privacy preserving data mining. In: Bellare, M. (ed.) CRYPTO 2000. LNCS, vol. 1880, pp. 36–54. Springer, Heidelberg (2000). https://doi.org/10.1007/3-540-44598-6_3


  3. Du, W., Atallah, M.J.: Privacy-preserving cooperative scientific computations. In: CSFW. IEEE (2001). 0273. https://doi.org/10.1109/CSFW.2001.930152

  4. Du, W., Han, Y.S., Chen, S.: Privacy-preserving multivariate statistical analysis: linear regression and classification. In: Proceedings of the 2004 SIAM International Conference on Data Mining. Society for Industrial and Applied Mathematics, pp. 222–233 (2004). https://doi.org/10.1137/1.9781611972740.21

  5. Sanil, A.P., Karr, A.F., Lin, X., et al.: Privacy-preserving regression modelling via distributed computation. In: Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 677–682. ACM (2004). https://doi.org/10.1145/1014052.1014139

  6. Jagannathan, G., Wright, R.N.: Privacy-preserving distributed k-means clustering over arbitrarily partitioned data. In: Proceedings of the Eleventh ACM SIGKDD International Conference on Knowledge Discovery in Data Mining, pp. 593–599. ACM (2005). https://doi.org/10.1145/1081870.1081942

  7. Yu, H., Vaidya, J., Jiang, X.: Privacy-preserving SVM classification on vertically partitioned data. In: Ng, W.-K., Kitsuregawa, M., Li, J., Chang, K. (eds.) PAKDD 2006. LNCS (LNAI), vol. 3918, pp. 647–656. Springer, Heidelberg (2006). https://doi.org/10.1007/11731139_74


  8. Slavkovic, A.B., Nardi, Y., Tibbits, M.M.: “Secure” logistic regression of horizontally and vertically partitioned distributed databases. In: Seventh IEEE International Conference on Data Mining Workshops (ICDMW 2007), pp. 723–728. IEEE (2007). https://doi.org/10.1109/ICDMW.2007.114

  9. Bunn, P., Ostrovsky, R.: Secure two-party k-means clustering. In: Proceedings of the 14th ACM Conference on Computer and Communications Security, pp. 486–497. ACM (2007). https://doi.org/10.1145/1315245.1315306

  10. Vaidya, J., Yu, H., Jiang, X.: Privacy-preserving SVM classification. Knowl. Inf. Syst. 14(2), 161–178 (2008). https://doi.org/10.1007/s10115-007-0073-7


  11. Bottou, L.: Large-scale machine learning with stochastic gradient descent. In: Lechevallier, Y., Saporta, G. (eds.) Proceedings of COMPSTAT 2010, pp. 177–186. Physica-Verlag HD, Heidelberg (2010). https://doi.org/10.1007/978-3-7908-2604-3_16


  12. Damgård, I., Pastro, V., Smart, N., Zakarias, S.: Multiparty computation from somewhat homomorphic encryption. In: Safavi-Naini, R., Canetti, R. (eds.) CRYPTO 2012. LNCS, vol. 7417, pp. 643–662. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-32009-5_38


  13. Nikolaenko, V., Ioannidis, S., Weinsberg, U., et al.: Privacy-preserving matrix factorization. In: Proceedings of the 2013 ACM SIGSAC Conference on Computer & Communications Security, pp. 801–812. ACM (2013). https://doi.org/10.1145/2508859.2516751

  14. Wu, S., Teruya, T., Kawamoto, J.: Privacy-preservation for stochastic gradient descent application to secure logistic regression. In: The 27th Annual Conference of the Japanese Society for Artificial Intelligence, vol. 27, pp. 1–4 (2013)


  15. Song, S., Chaudhuri, K., Sarwate, A.D.: Stochastic gradient descent with differentially private updates. In: 2013 IEEE Global Conference on Signal and Information Processing, pp. 245–248. IEEE (2013). https://doi.org/10.1109/GlobalSIP.2013.6736861

  16. Li, M., Andersen, D.G., Park, J.W., et al.: Scaling distributed machine learning with the parameter server. In: 11th USENIX Symposium on Operating Systems Design and Implementation (OSDI 14), pp. 583–598 (2014). https://doi.org/10.1145/2640087.2644155

  17. Shokri, R., Shmatikov, V.: Privacy-preserving deep learning. In: Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security, pp. 1310–1321. ACM (2015). https://doi.org/10.1145/2810103.2813687

  18. Abadi, M., Chu, A., Goodfellow, I., et al.: Deep learning with differential privacy. In: Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, pp. 308–318. ACM (2016). https://doi.org/10.1145/2976749.2978318

  19. Gascón, A., Schoppmann, P., Balle, B., et al.: Secure linear regression on vertically partitioned datasets. IACR Cryptology ePrint Archive 2016, 892 (2016)


  20. Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing Directive 95/46/EC (GDPR). Official J. Eur. Union, L119 (2016)


  21. Gilad-Bachrach, R., Dowlin, N., Laine, K., et al.: CryptoNets: applying neural networks to encrypted data with high throughput and accuracy. In: International Conference on Machine Learning, pp. 201–210 (2016)


  22. Mohassel, P., Zhang, Y.: SecureML: a system for scalable privacy-preserving machine learning. In: 2017 IEEE Symposium on Security and Privacy (SP), pp. 19–38. IEEE (2017). https://doi.org/10.1109/SP.2017.12

  23. Liu, J., Juuti, M., Lu, Y., et al.: Oblivious neural network predictions via MiniONN transformations. In: Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, pp. 619–631. ACM (2017). https://doi.org/10.1145/3133956.3134056

  24. Bonawitz, K., Ivanov, V., Kreuter, B., et al.: Practical secure aggregation for privacy-preserving machine learning. In: Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, pp. 1175–1191. ACM (2017). https://doi.org/10.1145/3133956.3133982

  25. Lin, Y., Han, S., Mao, H., et al.: Deep gradient compression: reducing the communication bandwidth for distributed training. arXiv preprint arXiv:1712.01887 (2017)

  26. Riazi, M.S., Weinert, C., Tkachenko, O., et al.: Chameleon: a hybrid secure computation framework for machine learning applications. In: Proceedings of the 2018 on Asia Conference on Computer and Communications Security, pp. 707–721. ACM (2018). https://doi.org/10.1145/3196494.3196522

  27. Phong, L.T., Aono, Y., Hayashi, T., et al.: Privacy-preserving deep learning via additively homomorphic encryption. IEEE Trans. Inf. Forensics Secur. 13(5), 1333–1345 (2018). https://doi.org/10.1109/TIFS.2017.2787987


  28. Wagh, S., Gupta, D., Chandran, N.: SecureNN: 3-party secure computation for neural network training. Proc. Priv. Enhancing Technol. 1, 24 (2019). https://doi.org/10.2478/popets-2019-0035


  29. Nasr, M., Shokri, R., Houmansadr, A.: Comprehensive privacy analysis of deep learning: stand-alone and federated learning under passive and active white-box inference attacks. arXiv preprint arXiv:1812.00910 (2018)

  30. Yang, Q., Liu, Y., Chen, T., et al.: Federated machine learning: concept and applications. ACM Trans. Intell. Syst. Technol. (TIST) 10(2), 12 (2019). https://doi.org/10.1145/3298981


  31. Juvekar, C., Vaikuntanathan, V., Chandrakasan, A.: GAZELLE: a low latency framework for secure neural network inference. In: 27th USENIX Security Symposium (USENIX Security 18), pp. 1651–1669 (2018)


  32. Centers for Medicare & Medicaid Services. The Health Insurance Portability and Accountability Act of 1996 (HIPAA) (1996). http://www.cms.hhs.gov/hipaa/


Acknowledgements

We are grateful to the anonymous reviewers for their comprehensive comments. We also thank Xiangfu Song and Yiran Liu from Shandong University for helpful discussions on MPC, and Junming Ke from Singapore University of Technology and Design for his help. This work was supported by the Strategic Priority Research Program of the Chinese Academy of Sciences, Grant No. XDC02040400.

Author information


Corresponding author

Correspondence to Xiaojun Chen.


Appendices

A Proof of Correctness

A.1 Lemma 1

Proof

Suppose we have two secrets, \(s_0\) and \(s_1\), and we share both under Shamir's secret sharing scheme using two polynomials

$$\begin{aligned} \begin{aligned} f(x)=a_0 + a_1\cdot x +...+a_{t-1}\cdot x^{t-1}\mod p\\ g(x)=b_0 + b_1\cdot x +...+b_{t-1}\cdot x^{t-1}\mod p \end{aligned} \end{aligned}$$
(12)

where \(f(0)=a_0 = s_0\), \(g(0)=b_0=s_1\) and p is a large prime.

In order to compute the shares, we can evaluate f(x) and g(x) at n different points \(f(x_0),f(x_1),...,f(x_{n-1})\) and \(g(x_0),g(x_1),...,g(x_{n-1})\) respectively.

We now turn to computing the shares of \(s_0 + s_1\). We define a new polynomial

$$\begin{aligned} \begin{aligned} h(x) = (a_0+b_0) + (a_1+b_1)\cdot x +... +(a_{t-1}+b_{t-1})\cdot x^{t-1}\mod p \end{aligned} \end{aligned}$$
(13)

Obviously, h(x) is a polynomial of degree \(t-1\) with t coefficients and \(h(0) = s_0+ s_1\). On the one hand, the values \(h(x_i)\) are shares of \(s_0+s_1\); on the other hand, we can confirm

$$\begin{aligned} \begin{aligned} h(x_i)&= (a_0+b_0) + (a_1+b_1)\cdot x_i +...+ (a_{t-1}+b_{t-1})\cdot x_i^{t-1} \\ {}&=(a_0 + a_1\cdot x_i\,+...+\,a_{t-1}\cdot x_i^{t-1}) + (b_0 + b_1\cdot x_i\,+... +\,b_{t-1}\cdot x_i^{t-1})\\ {}&= f(x_i) + g(x_i) \mod p,\ 0\le i\le n-1 \end{aligned} \end{aligned}$$
(14)

Therefore, the shares of \(s_0+s_1\) can be computed by adding the corresponding shares of \(s_0\) and \(s_1\).
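This additive homomorphism is easy to check in code. The following is a minimal sketch, not the paper's implementation; the prime, the threshold \(t=3\), and the \(n=5\) parties are illustrative choices.

```python
import random

P = 2_147_483_647  # a large prime p (illustrative choice)

def share(secret, t, n):
    """Split `secret` with a random degree-(t-1) polynomial; return n points."""
    coeffs = [secret] + [random.randrange(P) for _ in range(t - 1)]
    return [(x, sum(c * pow(x, k, P) for k, c in enumerate(coeffs)) % P)
            for x in range(1, n + 1)]

def reconstruct(shares):
    """Lagrange interpolation at x = 0 recovers the secret from t shares."""
    secret = 0
    for xi, yi in shares:
        num = den = 1
        for xj, _ in shares:
            if xj != xi:
                num = num * (-xj) % P
                den = den * (xi - xj) % P
        secret = (secret + yi * num * pow(den, -1, P)) % P
    return secret

s0, s1 = 123, 456
f_shares = share(s0, t=3, n=5)
g_shares = share(s1, t=3, n=5)
# Each party adds its two shares locally; no interaction is needed.
h_shares = [(x, (a + b) % P) for (x, a), (_, b) in zip(f_shares, g_shares)]
assert reconstruct(h_shares[:3]) == (s0 + s1) % P
```

Any \(t\) of the summed shares reconstruct \(s_0+s_1\), in agreement with Eq. (14).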

A.2 Lemma 2

Proof

Suppose we have \(x_0,x_1,...,x_{n-1}\) and a secret key \(\alpha \). We can then compute the MAC of each \(x_i\)

$$\begin{aligned} \begin{aligned} \varvec{\delta }(x_i) = \alpha \cdot x_i\mod p,\ 0\le i\le n-1 \end{aligned} \end{aligned}$$
(15)

We can also compute the MAC of the sum of the \(x_i\):

$$\begin{aligned} \begin{aligned} \varvec{\delta }(\sum \limits _{i=0}^{n-1} x_i) = \alpha \cdot (\sum \limits _{i=0}^{n-1} x_i)\mod p \end{aligned} \end{aligned}$$
(16)

Then it is easy to confirm

$$\begin{aligned} \begin{aligned} \varvec{\delta }(\sum \limits _{i=0}^{n-1} x_i)&= (\alpha \cdot x_0)+(\alpha \cdot x_1)+...+(\alpha \cdot x_{n-1}) \mod p\\&= \varvec{\delta }(x_0) +\varvec{\delta }(x_1) + ... + \varvec{\delta }(x_{n-1}) \mod p\\&= \sum \limits _{i=0}^{n-1} \varvec{\delta }(x_i) \mod p \end{aligned} \end{aligned}$$
(17)

For a more concrete proof, please refer to [12].
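The linearity in Eq. (17) is equally direct to verify numerically. Here is a minimal sketch; the prime modulus and the key value are arbitrary illustrative choices, not parameters from the paper.

```python
P = 2_147_483_647   # prime modulus p (illustrative)
alpha = 918273645   # secret MAC key (illustrative)

xs = [10, 20, 30, 40]
macs = [alpha * x % P for x in xs]

# The MAC of the sum equals the sum of the MACs mod p, so an
# aggregated result can be checked without re-MACing each value.
assert alpha * sum(xs) % P == sum(macs) % P
```

This is what lets the parties verify an aggregate of shared values while the key \(\alpha\) itself stays secret.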

B Accuracy and Performance for Linear Regression and MLP

B.1 Linear Regression

See Fig. 6.

Fig. 6. Experimental results of linear regression: (a) accuracy, (b) performance.

B.2 MLP

See Fig. 7.

Fig. 7. Experimental results of MLP: (a) accuracy, (b) performance.


Copyright information

© 2020 Springer Nature Switzerland AG

About this paper


Cite this paper

Dong, Y., Chen, X., Shen, L., Wang, D. (2020). Privacy-Preserving Distributed Machine Learning Based on Secret Sharing. In: Zhou, J., Luo, X., Shen, Q., Xu, Z. (eds) Information and Communications Security. ICICS 2019. Lecture Notes in Computer Science, vol. 11999. Springer, Cham. https://doi.org/10.1007/978-3-030-41579-2_40

Download citation

  • DOI: https://doi.org/10.1007/978-3-030-41579-2_40

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-41578-5

  • Online ISBN: 978-3-030-41579-2

  • eBook Packages: Computer Science, Computer Science (R0)
