New Approaches to Federated XGBoost Learning for Privacy-Preserving Data Analysis

  • Conference paper
Neural Information Processing (ICONIP 2020)

Abstract

In this paper, we propose a new privacy-preserving machine learning algorithm called Federated-Learning XGBoost (FL-XGBoost), which introduces a federated learning scheme into XGBoost, a state-of-the-art gradient boosting decision tree model. FL-XGBoost allows different entities to jointly train a model for a sensitive task without revealing their own data. By exchanging decision tree models, it significantly reduces the number of communications between entities. In our experiments, we compare the performance of FL-XGBoost with FATE, a different federated learning approach to XGBoost. The experimental results show that the proposed method achieves high prediction accuracy with less communication, even as the number of entities increases.
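The idea of exchanging tree models rather than data can be illustrated with a small sketch. This is not the FL-XGBoost algorithm from the paper, whose details are not given in the abstract. It assumes a simple round-robin scheme in which each party fits one depth-1 regression tree (a stump) to the residuals of its own data and hands the growing ensemble to the next party. Plain squared-error boosting stands in for XGBoost, no encryption or differential privacy is applied, and all names (`Stump`, `fit_stump`, `federated_boost`) are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

Row = Tuple[Sequence[float], float]  # (feature vector, target)


@dataclass(frozen=True)
class Stump:
    """Depth-1 regression tree; the only object that leaves a party."""
    feature: int
    threshold: float
    left: float   # output when x[feature] <= threshold
    right: float  # output when x[feature] > threshold

    def predict(self, x: Sequence[float]) -> float:
        return self.left if x[self.feature] <= self.threshold else self.right


def predict(model: List[Stump], x: Sequence[float], lr: float = 0.5) -> float:
    """Ensemble prediction: shrunken sum of all stump outputs."""
    return sum(lr * s.predict(x) for s in model)


def fit_stump(rows: List[Row], residuals: List[float]) -> Stump:
    """Exhaustive split search minimising squared error on local residuals."""
    best, best_err = None, float("inf")
    for f in range(len(rows[0][0])):
        for t in sorted({x[f] for x, _ in rows}):
            lo = [r for (x, _), r in zip(rows, residuals) if x[f] <= t]
            hi = [r for (x, _), r in zip(rows, residuals) if x[f] > t]
            lo_mean = sum(lo) / len(lo) if lo else 0.0
            hi_mean = sum(hi) / len(hi) if hi else 0.0
            err = (sum((r - lo_mean) ** 2 for r in lo)
                   + sum((r - hi_mean) ** 2 for r in hi))
            if err < best_err:
                best, best_err = Stump(f, t, lo_mean, hi_mean), err
    return best


def federated_boost(parties: List[List[Row]], rounds: int, lr: float = 0.5):
    """Round-robin boosting: raw rows stay local, only stumps are passed on.

    Returns the shared ensemble and the number of model hand-offs.
    """
    model: List[Stump] = []
    messages = 0
    for _ in range(rounds):
        for rows in parties:
            # Residuals are computed locally against the shared ensemble.
            residuals = [y - predict(model, x, lr) for x, y in rows]
            model.append(fit_stump(rows, residuals))
            messages += 1  # one hand-off of the updated model
    return model, messages
```

In this sketch the communication cost is one model hand-off per party per round, independent of how many rows each party holds. A party can also pull the shared prediction away from the targets in regions where it has no data of its own, which is one reason real federated boosting schemes need more care than this illustration.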

Acknowledgement

We would like to thank Associate Professor Toshiaki Omori and the members of the National Institute of Information and Communications Technology (NICT) for their helpful advice and support in writing this paper. This research has been accomplished through the project “Social Implementation of Privacy-Preserving Data Analytics” (JPMJCR19F6) in the JST CREST research area “Development and Integration of Artificial Intelligence Technologies for Innovation Acceleration”.

Author information

Corresponding author

Correspondence to Seiichi Ozawa.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Yamamoto, F., Wang, L., Ozawa, S. (2020). New Approaches to Federated XGBoost Learning for Privacy-Preserving Data Analysis. In: Yang, H., Pasupa, K., Leung, A.CS., Kwok, J.T., Chan, J.H., King, I. (eds) Neural Information Processing. ICONIP 2020. Lecture Notes in Computer Science, vol 12533. Springer, Cham. https://doi.org/10.1007/978-3-030-63833-7_47

  • DOI: https://doi.org/10.1007/978-3-030-63833-7_47

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-63832-0

  • Online ISBN: 978-3-030-63833-7

  • eBook Packages: Computer Science, Computer Science (R0)
