ABSTRACT
The Poisoning Membership Inference Attack (PMIA) is a newly emerging privacy attack that poses a significant threat to federated learning (FL). The adversary conducts data poisoning (i.e., adversarial manipulation of training examples) and extracts membership information by exploiting the loss changes that the poisoning induces. PMIA thus significantly escalates the traditional poisoning attack, which primarily focuses on model corruption. However, this topic has so far lacked a comprehensive, systematic study. In this work, we conduct a benchmark evaluation of PMIA in the Byzantine-robust FL setting, which is specifically designed to mitigate poisoning attacks. We find that all existing coordinate-wise averaging mechanisms fail to defend against PMIA, whereas the detect-then-drop strategy proves effective in most cases, implying that an injected poison is memorized and its effect rarely dissipates. Inspired by this observation, we propose AgrEvader, a PMIA that maximizes the adversarial impact on the victim samples while circumventing detection by Byzantine-robust mechanisms. AgrEvader significantly outperforms existing PMIAs: it achieves attack accuracy ranging from 72.78% (on CIFAR-10) to 97.80% (on Texas100), an average increase of 13.89% over the strongest PMIA reported in the literature. We evaluated AgrEvader on five datasets across different domains, against a comprehensive list of threat models covering black-box, gray-box, and white-box knowledge in both targeted and non-targeted scenarios, and it achieved consistently high accuracy across all settings tested. The code is available at: https://github.com/PrivSecML/AgrEvader.
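To make the mechanism concrete, below is a minimal, self-contained NumPy sketch, not the authors' implementation: it substitutes a toy logistic-regression FL task for a deep model and a norm-based detect-then-drop filter for a real Byzantine-robust aggregator, and every name and constant (`simulate`, `grad`, the 1.5x-median threshold) is a hypothetical simplification of ours. It illustrates the two ideas summarized above: a malicious update that maximizes loss on a victim record while staying within the benign update profile to evade detection, and a membership decision read off the victim's loss after training.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_clients, rounds, lr = 20, 10, 40, 0.5

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30, 30)))

def loss(w, x, y):
    # Per-example binary cross-entropy.
    p = sigmoid(x @ w)
    return -(y * np.log(p + 1e-12) + (1 - y) * np.log(1 - p + 1e-12))

def grad(w, X, Y):
    # Mean logistic-regression gradient over a batch.
    return X.T @ (sigmoid(X @ w) - Y) / len(Y)

def simulate(victim_is_member):
    # Synthetic task: one shard of 30 labeled points per benign client.
    w_star = rng.normal(size=d)
    Xs = [rng.normal(size=(30, d)) for _ in range(n_clients)]
    Ys = [(X @ w_star > 0).astype(float) for X in Xs]
    xv = rng.normal(size=d)
    yv = float(xv @ w_star > 0)              # the victim record
    if victim_is_member:                     # member: it sits in client 0's data
        Xs[0] = np.vstack([Xs[0], xv])
        Ys[0] = np.append(Ys[0], yv)
    w = np.zeros(d)
    for _ in range(rounds):
        ups = [-lr * grad(w, X, Y) for X, Y in zip(Xs, Ys)]  # benign updates
        # AgrEvader-style malicious update: gradient *ascent* on the victim
        # record maximizes its loss, but the update is rescaled to the median
        # benign norm so a detect-then-drop filter does not flag it.
        atk = lr * grad(w, xv[None, :], np.array([yv]))
        cap = np.median([np.linalg.norm(u) for u in ups])
        atk *= min(1.0, cap / (np.linalg.norm(atk) + 1e-12))
        ups.append(atk)
        # Server: crude detect-then-drop (drop norm outliers), then average.
        norms = np.array([np.linalg.norm(u) for u in ups])
        kept = [u for u, n in zip(ups, norms) if n <= 1.5 * np.median(norms)]
        w += np.mean(kept, axis=0)
    return loss(w, xv, yv)

print("victim is member     -> final loss:", simulate(True))
print("victim is non-member -> final loss:", simulate(False))
```

The membership signal is the gap, not the absolute values: a member's owner keeps re-fitting the record, counteracting the poison, so its final loss stays comparatively low, while a non-member's loss remains inflated. Thresholding the final loss (or the loss change across rounds) then yields the membership guess; the scale and constants in this toy are illustrative only.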