Improving Availability of Vertical Federated Learning: Relaxing Inference on Non-overlapping Data

Published: 28 June 2022
Abstract

Vertical Federated Learning (VFL) enables multiple parties to collaboratively train a machine learning model over vertically partitioned datasets without leaking private data. However, current VFL solutions share a limitation: the trained models cannot perform inference on non-overlapping samples. This limitation seriously reduces the availability of VFL models because, in practice, overlapping samples may make up only a small portion of each party’s data, so a large share of inference tasks will fail. In this article, we propose a novel VFL framework that enables federated inference on non-overlapping data. Our framework treats the distributed features as privileged information that is available during training but unavailable during inference. We distill the knowledge in these privileged features and transfer it to each party’s local model, which processes only local features. Furthermore, we adopt Oblivious Transfer (OT) to preserve data ID privacy during training and inference. Empirically, we evaluate the model on real-world datasets collected from Criteo and Taobao. We also provide a security analysis of the proposed framework.
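
The distillation idea in the abstract can be illustrated with a minimal sketch. It assumes a binary classification task and uses a common generalized-distillation objective: a "teacher" that sees the privileged (federated) features during training produces soft predictions, and a "student" that sees only local features is trained on a weighted mix of the true labels and those soft predictions. The function names, the `alpha` weight, and the `temperature` parameter are illustrative assumptions; the abstract does not specify the authors' exact loss.

```python
import numpy as np


def sigmoid(z):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-np.asarray(z, dtype=float)))


def binary_cross_entropy(p, target, eps=1e-12):
    """Mean binary cross-entropy between probabilities p and (soft or hard) targets."""
    p = np.clip(p, eps, 1.0 - eps)
    target = np.asarray(target, dtype=float)
    return float(np.mean(-(target * np.log(p) + (1.0 - target) * np.log(1.0 - p))))


def distillation_loss(student_logits, teacher_logits, labels,
                      alpha=0.5, temperature=1.0):
    """Hypothetical student objective (assumed form, not the paper's stated loss).

    student_logits: outputs of the local model (local features only).
    teacher_logits: outputs of the model trained with privileged features.
    alpha:          weight of the teacher's soft targets (0 = labels only).
    temperature:    softens both teacher and student predictions.
    """
    student_logits = np.asarray(student_logits, dtype=float)
    teacher_logits = np.asarray(teacher_logits, dtype=float)

    # Supervised term: fit the ground-truth labels.
    hard_loss = binary_cross_entropy(sigmoid(student_logits), labels)

    # Distillation term: match the teacher's softened predictions.
    soft_targets = sigmoid(teacher_logits / temperature)
    soft_student = sigmoid(student_logits / temperature)
    soft_loss = binary_cross_entropy(soft_student, soft_targets)

    return (1.0 - alpha) * hard_loss + alpha * soft_loss
```

At inference time only the student is evaluated, so under this sketch a party could score a sample without the other parties' features.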

      • Published in

        ACM Transactions on Intelligent Systems and Technology, Volume 13, Issue 4
        August 2022
        364 pages
        ISSN: 2157-6904
        EISSN: 2157-6912
        DOI: 10.1145/3522732
        • Editor: Huan Liu

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 28 June 2022
        • Online AM: 12 May 2022
        • Accepted: 1 November 2021
        • Revised: 1 August 2021
        • Received: 1 April 2021

        Qualifiers

        • research-article
        • Refereed
