Abstract
The widespread application of machine learning has raised many privacy problems. In multi-party machine learning in particular, private data is often exposed during the aggregation, transmission, and communication phases, which leads to private data leakage. Existing works use secure multi-party computation (SMPC) or secret-sharing technology to preserve privacy in multi-party machine learning, but these techniques bring enormous cost and feasibility drawbacks. How a dataset is partitioned is one of the most critical factors affecting the performance of machine learning. With vertically partitioned data, each participant holds only part of the feature information, and the training process is complicated. How to complete multi-party training on vertically partitioned datasets efficiently and securely is therefore an urgent problem. Moreover, training logistic regression models efficiently is a direction worth pursuing. In this paper, we propose a protocol that completes logistic regression modeling on vertically partitioned data by asynchronous gradient sharing. At the same time, we use an efficient homomorphic encryption method to protect private data. Experiments show that our protocol reduces training time with only a small impact on the output results, with a speedup that can exceed 10x, while ensuring the security of the vertically partitioned dataset.
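To make the vertically partitioned setting concrete, the following is a minimal plaintext sketch, not the protocol proposed in the paper. It assumes two hypothetical parties, A and B, that hold disjoint feature columns for the same samples, and it omits both the homomorphic encryption and the asynchronous scheduling that the abstract describes. All function names and the toy data are illustrative assumptions. It only shows why sharing per-sample partial scores is enough for each party to compute the gradient for its own weights.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def partial_scores(X_part, w_part):
    """Per-sample dot product using only one party's feature columns."""
    return [sum(w * x for w, x in zip(w_part, row)) for row in X_part]

def local_gradient(X_part, residuals):
    """Mean log-loss gradient for one party's weights, given shared residuals."""
    n, d = len(X_part), len(X_part[0])
    return [sum(residuals[i] * X_part[i][j] for i in range(n)) / n
            for j in range(d)]

def vertical_gradients(X_a, X_b, y, w_a, w_b):
    # Assumption: partial scores are exchanged in plaintext here. The paper
    # protects this exchange with homomorphic encryption instead.
    z = [sa + sb for sa, sb in zip(partial_scores(X_a, w_a),
                                   partial_scores(X_b, w_b))]
    residuals = [sigmoid(zi) - yi for zi, yi in zip(z, y)]
    return local_gradient(X_a, residuals), local_gradient(X_b, residuals)

def centralized_gradient(X, y, w):
    """Reference gradient when a single party holds every feature."""
    residuals = [sigmoid(s) - yi for s, yi in zip(partial_scores(X, w), y)]
    return local_gradient(X, residuals)

# Toy data (hypothetical): party A holds two features, party B holds one.
X_a = [[1.0, 2.0], [0.5, -1.0], [-1.5, 0.3], [2.0, 1.0]]
X_b = [[0.2], [1.1], [-0.7], [0.4]]
y = [1, 0, 0, 1]
w_a = [0.1, -0.2]
w_b = [0.3]

g_a, g_b = vertical_gradients(X_a, X_b, y, w_a, w_b)

# Concatenating the two parties' gradients should match the centralized one.
X_full = [row_a + row_b for row_a, row_b in zip(X_a, X_b)]
g_full = centralized_gradient(X_full, y, w_a + w_b)
```

In this sketch the per-party gradients `g_a` and `g_b`, concatenated, agree with `g_full` up to floating-point error, which is the property that lets each party update its own weights without seeing the other party's features.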
Acknowledgements
This work is supported by the National Natural Science Foundation of China under Grants No. 61772229 and No. 62072208.
Additional information
This article belongs to the Topical Collection: Special Issue on Privacy-Preserving Computing
Guest Editors: Kaiping Xue, Zhe Liu, Haojin Zhu, Miao Pan and David S.L. Wei
Cite this article
Wei, Q., Li, Q., Zhou, Z. et al. Privacy-preserving two-parties logistic regression on vertically partitioned data using asynchronous gradient sharing. Peer-to-Peer Netw. Appl. 14, 1379–1387 (2021). https://doi.org/10.1007/s12083-020-01017-x
DOI: https://doi.org/10.1007/s12083-020-01017-x