research-article

Improving Availability of Vertical Federated Learning: Relaxing Inference on Non-overlapping Data

Authors:
Zhenghang Ren, Hong Kong University of Science and Technology, China (ORCID: 0000-0002-8779-4768)
Liu Yang, Hong Kong University of Science and Technology, China (ORCID: 0000-0002-4393-1791)
Kai Chen, Hong Kong University of Science and Technology, China (ORCID: 0000-0003-2587-6028)

ACM Transactions on Intelligent Systems and Technology, Volume 13, Issue 4, Article No. 58, pp. 1–20. https://doi.org/10.1145/3501817

Published: 28 June 2022

Abstract

Vertical Federated Learning (VFL) enables multiple parties to collaboratively train a machine learning model over vertically distributed datasets without leaking private data. However, current VFL solutions share a limitation: the trained models cannot perform inference on non-overlapping samples. This limitation seriously damages the availability of VFL models because, in practice, overlapping samples may make up only a small portion of each party's data, so a large share of inference tasks will fail. In this article, we propose a novel VFL framework that enables federated inference on non-overlapping data. Our framework regards the distributed features as privileged information, which is available during training but unavailable during inference. We distill the knowledge of these privileged features and transfer it to each party's local model, which processes only local features. Furthermore, we adopt Oblivious Transfer (OT) to preserve data ID privacy during training and inference. Empirically, we evaluate the model on real-world datasets collected from Criteo and Taobao. We also provide a security analysis of the proposed framework.
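
The distillation idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the article's actual objective, so everything here is an assumption: a generic privileged-feature distillation loss for binary classification, in which a "teacher" (the federated model that sees all parties' features during training) supplies soft targets to a "student" (a party's local model that sees only local features). The names `distillation_loss`, `lam`, and `temperature`, and the toy logits, are hypothetical.

```python
import numpy as np


def sigmoid(z):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-z))


def binary_cross_entropy(target, prob, eps=1e-12):
    """Mean binary cross-entropy between targets in [0, 1] and probabilities."""
    prob = np.clip(prob, eps, 1.0 - eps)
    return float(np.mean(-(target * np.log(prob) + (1.0 - target) * np.log(1.0 - prob))))


def distillation_loss(student_logits, teacher_logits, labels, lam=0.5, temperature=1.0):
    """Hypothetical objective: mix the hard-label loss with a soft-label loss.

    student_logits: outputs of the local model (local features only).
    teacher_logits: outputs of the model trained with privileged features.
    lam:            weight of the distillation term (assumed, not from the article).
    """
    hard = binary_cross_entropy(labels, sigmoid(student_logits))
    soft_targets = sigmoid(teacher_logits / temperature)
    soft = binary_cross_entropy(soft_targets, sigmoid(student_logits / temperature))
    return (1.0 - lam) * hard + lam * soft


# Toy values for illustration only.
labels = np.array([1.0, 0.0, 1.0, 0.0])
teacher_logits = np.array([2.0, -1.5, 0.5, -3.0])
student_logits = np.array([1.0, -0.5, 0.2, -1.0])
loss = distillation_loss(student_logits, teacher_logits, labels)
print(f"combined loss: {loss:.4f}")
```

Under this sketch, only the student is needed at inference time, which is what would allow predictions on samples for which the other parties hold no features. The Oblivious Transfer step that protects data IDs is not modeled here.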

Index Terms

1. Computing methodologies
  1. Machine learning
    1. Learning paradigms
2. Security and privacy
  1. Security services
    1. Privacy-preserving protocols

Published in
ACM Transactions on Intelligent Systems and Technology Volume 13, Issue 4
August 2022
364 pages
ISSN:2157-6904
EISSN:2157-6912
DOI:10.1145/3522732
Editor:
Huan Liu
Arizona State University, USA
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 28 June 2022
- Online AM: 12 May 2022
- Accepted: 1 November 2021
- Revised: 1 August 2021
- Received: 1 April 2021
Author Tags
Availability
privacy
vertical federated learning