ABSTRACT
Most secure multi-party computation (MPC) machine learning methods can afford only the simple gradient descent (sGD) optimizer and cannot benefit from recent progress in adaptive GD optimizers (e.g., Adagrad, Adam, and their variants), whose updates involve square-root and reciprocal operations that are hard to compute in MPC. To mitigate this issue, we introduce InvertSqrt, an efficient MPC protocol for computing 1/√x. We then implement the Adam adaptive GD optimizer on top of InvertSqrt and use it for training on different datasets. The training costs compare favorably to those of sGD, indicating that adaptive GD optimizers have become practical in MPC.
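The abstract does not reproduce the protocol itself, but the arithmetic it must realize is standard. The sketch below (plaintext NumPy, not the paper's code) shows Newton's iteration for 1/√x, whose loop body uses only additions and multiplications and is therefore the natural numeric core for an MPC inverse-square-root protocol, together with an Adam step rearranged so that this is the only nonlinear operation. The initial-guess rule and the ε-under-the-root variant of Adam are illustrative assumptions, not the paper's construction.

```python
import numpy as np

def invert_sqrt(x, iters=10):
    """Compute 1/sqrt(x) by Newton's iteration.

    Plaintext sketch of the numeric core that MPC inverse-square-root
    protocols are typically built on: after the initial guess, every
    iteration uses only additions and multiplications, which are cheap
    on secret-shared values. The initial guess below reads the bit
    length of x in the clear; this is a simplifying assumption, since
    a real protocol must derive the guess obliviously.
    """
    # Initial guess y0 = 2^(-floor(log2 x)/2); the ratio of y0 to the
    # true 1/sqrt(x) then lies in [1, sqrt(2)), well inside Newton's
    # convergence region.
    y = 2.0 ** (-0.5 * np.floor(np.log2(x)))
    for _ in range(iters):
        # y <- y * (3 - x * y^2) / 2 converges quadratically to 1/sqrt(x).
        y = 0.5 * y * (3.0 - x * y * y)
    return y

def adam_step(theta, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update arranged so the only nonlinear step is invert_sqrt.

    Folding eps under the square root, i.e. using 1/sqrt(v_hat + eps)
    instead of 1/(sqrt(v_hat) + eps), is a common MPC-friendly variant
    of Kingma and Ba's original rule; it avoids a separate reciprocal.
    """
    m = b1 * m + (1.0 - b1) * grad          # first-moment estimate
    v = b2 * v + (1.0 - b2) * grad * grad   # second-moment estimate
    # Bias corrections divide by public constants (the step count t is
    # public), so they add no MPC cost.
    m_hat = m / (1.0 - b1 ** t)
    v_hat = v / (1.0 - b2 ** t)
    theta = theta - lr * m_hat * invert_sqrt(v_hat + eps)
    return theta, m, v
```

In an actual MPC deployment the same update would run on secret-shared tensors (e.g., in a framework like CrypTen), with invert_sqrt realized by a dedicated protocol such as the paper's InvertSqrt rather than by plaintext floating-point arithmetic.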