ABSTRACT
Most current distributed machine learning systems try to scale up model training with a data-parallel architecture that divides the computation for different samples among workers. We study distributed machine learning from a different motivation, in which the information about the same samples, e.g., users and objects, is owned by several parties that wish to collaborate but do not want to share their raw data with each other.
We propose an asynchronous stochastic gradient descent (SGD) algorithm for this feature-distributed machine learning (FDML) problem, which jointly learns from distributed features with a theoretical convergence guarantee under bounded asynchrony. Our algorithm does not require sharing the original features or even the local model parameters between parties, thus preserving data locality. The system can also easily incorporate differential privacy mechanisms to achieve a higher level of privacy. We implement the FDML system in a parameter server architecture and compare it with fully centralized learning (which violates data locality) and with learning based only on local features, through extensive experiments on both the public dataset a9a and a large dataset of 5,000,000 records and 8,700 decentralized features from three collaborating Tencent apps: Tencent MyApp, Tencent QQ Browser, and Tencent Mobile Safeguard. Experimental results demonstrate that the proposed FDML system significantly enhances app recommendation in Tencent MyApp by leveraging user and item features from the other apps, while preserving the locality and privacy of features within each individual app to a high degree.
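The feature-distributed setting described above can be illustrated with a minimal sketch. This is not the paper's algorithm: the actual FDML system is asynchronous, runs on a parameter server, and supports differential privacy, none of which is modeled here. The two-party split, the linear composite logistic model, and all names below are illustrative assumptions. The key property shown is that each party updates only its own sub-model from its own feature block; parties exchange per-sample scalar scores and the shared prediction residual, never raw features or weights.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: 200 samples with 6 features, vertically split across 2 parties.
n = 200
X = rng.normal(size=(n, 6))
true_w = np.array([1.5, -2.0, 0.5, 1.0, -1.0, 2.0])
y = (X @ true_w > 0).astype(float)

parts = [X[:, :3], X[:, 3:]]     # each party's private feature block
ws = [np.zeros(3), np.zeros(3)]  # each party's private sub-model
lr = 0.1

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for epoch in range(200):
    # Each party computes only a per-sample local score from its own block.
    local_scores = [Xk @ wk for Xk, wk in zip(parts, ws)]
    agg = sum(local_scores)       # aggregator sums scalar scores per sample
    residual = sigmoid(agg) - y   # shared back to every party
    for k in range(2):
        # Gradient of the logistic loss w.r.t. party k's weights is
        # computable from its local features and the shared residual alone.
        grad = parts[k].T @ residual / n
        ws[k] -= lr * grad

pred = sigmoid(sum(Xk @ wk for Xk, wk in zip(parts, ws))) > 0.5
acc = float(np.mean(pred == y))
print(acc)
```

This works because, for a composite model of the form sigma(sum of local scores), the gradient with respect to each party's weights factors into its local feature block times the shared residual, so joint training needs only the aggregated score, not the features themselves.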