DOI: 10.1145/3543507.3583542
Research Article

AgrEvader: Poisoning Membership Inference against Byzantine-robust Federated Learning

Published: 30 April 2023

ABSTRACT

The Poisoning Membership Inference Attack (PMIA) is a newly emerging privacy attack that poses a significant threat to federated learning (FL). An adversary performs data poisoning (i.e., adversarial manipulation of training examples) and extracts membership information by exploiting the changes in loss that the poisoning induces. The PMIA thus significantly exacerbates the traditional poisoning attack, which focuses primarily on model corruption. However, this topic has not yet been investigated in a comprehensive, systematic study. In this work, we conduct a benchmark evaluation to assess the performance of PMIA against Byzantine-robust FL, a setting specifically designed to mitigate poisoning attacks. We find that all existing coordinate-wise averaging mechanisms fail to defend against the PMIA, whereas the detect-then-drop strategy proves effective in most cases, implying that the poison injection is memorized and its effect rarely dissipates. Inspired by this observation, we propose AgrEvader, a PMIA that maximizes the adversarial impact on the victim samples while evading detection by Byzantine-robust mechanisms. AgrEvader significantly outperforms existing PMIAs. For instance, it achieves a high attack accuracy of between 72.78% (on CIFAR-10) and 97.80% (on Texas100), an average accuracy increase of 13.89% over the strongest PMIA reported in the literature. We evaluated AgrEvader on five datasets across different domains and against a comprehensive list of threat models, including black-box, gray-box and white-box models in both targeted and non-targeted scenarios. AgrEvader demonstrated consistently high accuracy across all settings tested. The code is available at: https://github.com/PrivSecML/AgrEvader.
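To make the two defense families mentioned in the abstract concrete, the following is a minimal sketch in Python with NumPy, not taken from the paper or its repository, contrasting a coordinate-wise aggregation rule (median, as one representative of coordinate-wise averaging) with a detect-then-drop rule (a Krum-style selection). The client counts, synthetic updates and function names are illustrative assumptions.

# Sketch (assumed, not the authors' code): two Byzantine-robust aggregation styles.
import numpy as np

def coordinate_wise_median(updates: np.ndarray) -> np.ndarray:
    """Coordinate-wise rule: aggregate client updates (n_clients x n_params) per dimension."""
    return np.median(updates, axis=0)

def krum_select(updates: np.ndarray, n_byzantine: int) -> np.ndarray:
    """Detect-then-drop rule: keep the single update closest to its neighbours (Krum-style)."""
    n = len(updates)
    # Pairwise squared distances between client updates.
    dists = np.sum((updates[:, None, :] - updates[None, :, :]) ** 2, axis=-1)
    scores = []
    for i in range(n):
        # Sum of distances to the n - n_byzantine - 2 nearest other updates.
        nearest = np.sort(np.delete(dists[i], i))[: n - n_byzantine - 2]
        scores.append(nearest.sum())
    return updates[int(np.argmin(scores))]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    honest = rng.normal(0.0, 0.1, size=(8, 5))    # benign updates clustered near 0
    poisoned = rng.normal(3.0, 0.1, size=(2, 5))  # adversarial updates far away
    updates = np.vstack([honest, poisoned])
    print("coordinate-wise median:", coordinate_wise_median(updates))
    print("Krum-selected update:  ", krum_select(updates, n_byzantine=2))

The sketch only shows the structural difference between the two families: coordinate-wise rules aggregate every client's values per dimension, while detect-then-drop rules score whole updates and keep only the most central one, discarding suspected outliers entirely. AgrEvader, per the abstract, is designed to retain adversarial impact on victim samples while remaining inside the region such detect-then-drop rules accept.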


Published in

WWW '23: Proceedings of the ACM Web Conference 2023
April 2023, 4293 pages
ISBN: 978-1-4503-9416-1
DOI: 10.1145/3543507
Copyright © 2023 ACM


Publisher

Association for Computing Machinery, New York, NY, United States

Publication History

Published: 30 April 2023
