ABSTRACT
The popularity of machine learning models has dramatically increased across a large variety of applications that affect people's daily lives, including product recommendations, healthcare predictions, and other critical applications. This wide availability has at the same time raised questions about the trustworthiness, security, and privacy implications of using these systems. While novel technologies and methodologies have been emerging to protect the privacy and security of AI systems, there are still open challenges that the research community needs to address. Over the past years, my research has focused on the creation of defenses to protect the machine learning pipeline and the design of privacy-aware methodologies that enable the training of accurate machine learning models without transmitting the data to a central place.

In this talk, I will focus on data privacy, covering a game-changing paradigm known as federated learning [4], which to some extent addresses privacy concerns and regulations that prevent the free transmission and sharing of information. Federated learning enables multiple participants who own private data to collaboratively train a single machine learning model while keeping their training data local. This is in sharp contrast to traditional machine learning, where all data needs to be gathered in a central place. Some argue that federated learning is a privacy-by-design technology, given that it does not require data to be transmitted to a central location. However, privacy risks remain in some scenarios: novel inference attacks that exploit the federated learning process have been demonstrated in the literature, prompting a variety of defenses that aim to reduce these risks. I will present some of these attacks along with several cryptographic and differential privacy techniques to deter them, including [5,7,8].
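The federated training loop described above, in the spirit of federated averaging [4], can be sketched as follows. This is a minimal illustration, not an implementation from the talk: the linear least-squares model, learning rate, round count, and two-client setup are all assumptions chosen to keep the example self-contained.

```python
import numpy as np

def local_update(weights, X, y, lr=0.1, epochs=5):
    """One client's local training: a few gradient-descent steps on a
    linear least-squares model (a stand-in for any local learner)."""
    w = weights.copy()
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

def fed_avg_round(global_w, client_data):
    """One federated round: each client trains locally on its private
    data, then the server averages the returned weights, weighted by
    local dataset size. Raw data never leaves the clients; only model
    weights are exchanged."""
    updates, sizes = [], []
    for X, y in client_data:
        updates.append(local_update(global_w, X, y))
        sizes.append(len(y))
    return np.average(updates, axis=0, weights=np.array(sizes, dtype=float))

# Two clients holding private slices of the same underlying task y = 2x
rng = np.random.default_rng(0)
clients = []
for n in (40, 60):
    X = rng.normal(size=(n, 1))
    y = 2.0 * X[:, 0]
    clients.append((X, y))

w = np.zeros(1)
for _ in range(20):
    w = fed_avg_round(w, clients)
# After a few rounds, the global model recovers the shared coefficient
# even though neither client ever revealed its data to the server.
```

The weighted average mirrors the standard choice of weighting each client by its number of local examples, so clients with more data pull the global model proportionally harder.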
The plethora of defenses is particularly interesting given their diverse threat models and the divergent privacy requirements they address; in this talk I will demystify them. I will also discuss challenges related to manipulation attacks [6,9] and machine learning fairness [1] in the context of federated learning. Finally, I will touch upon transparency issues, how to enable accountability for regulated industries [2,3], and vertical federated learning [7]. Overall, the talk will walk through the security and privacy challenges and solutions in federated learning systems.
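As a concrete illustration of the differential-privacy style of defense discussed in the abstract, a common building block is to clip each client update to a fixed L2 norm and add Gaussian noise calibrated to that bound before sharing it. The sketch below is illustrative only; the function name and the `clip_norm` and `noise_multiplier` values are hypothetical, not taken from any of the cited papers.

```python
import numpy as np

def dp_sanitize(update, clip_norm=1.0, noise_multiplier=1.1, rng=None):
    """Clip a client update to L2 norm <= clip_norm, then add Gaussian
    noise whose scale is proportional to the clipping bound. Clipping
    limits any one client's influence; the noise masks what remains."""
    if rng is None:
        rng = np.random.default_rng()
    norm = np.linalg.norm(update)
    clipped = update * min(1.0, clip_norm / max(norm, 1e-12))
    noise = rng.normal(scale=noise_multiplier * clip_norm,
                       size=update.shape)
    return clipped + noise

# A raw update with L2 norm 5 is scaled down to norm 1, then noised.
raw = np.array([3.0, 4.0])
sanitized = dp_sanitize(raw, clip_norm=1.0)
```

Because the noise scale depends only on `clip_norm` and not on the raw update, the released value's sensitivity to any single client is bounded, which is what makes a formal differential-privacy accounting possible.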
- [1] Abay, Annie, Yi Zhou, Nathalie Baracaldo, Shashank Rajamoni, Ebube Chuba, and Heiko Ludwig. "Mitigating bias in federated learning." arXiv preprint arXiv:2012.02447 (2020).
- [2] Baracaldo, Nathalie, Ali Anwar, Mark Purcell, Ambrish Rawat, Mathieu Sinn, Bashar Altakrouri, Dian Balta et al. "Towards an Accountable and Reproducible Federated Learning: A FactSheets Approach." arXiv preprint arXiv:2202.12443 (2022).
- [3] Balta, Dian, Mahdi Sellami, Peter Kuhn, Ulrich Schöpp, Matthias Buchinger, Nathalie Baracaldo, Ali Anwar et al. "Accountable Federated Machine Learning in Government: Engineering and Management Insights." In International Conference on Electronic Participation, pp. 125--138. Springer, Cham, 2021.
- [4] McMahan, Brendan, Eider Moore, Daniel Ramage, Seth Hampson, and Blaise Aguera y Arcas. "Communication-efficient learning of deep networks from decentralized data." In Artificial Intelligence and Statistics, pp. 1273--1282. PMLR, 2017.
- [5] Truex, Stacey, Nathalie Baracaldo, Ali Anwar, Thomas Steinke, Heiko Ludwig, Rui Zhang, and Yi Zhou. "A hybrid approach to privacy-preserving federated learning." In Proceedings of the 12th ACM Workshop on Artificial Intelligence and Security, pp. 1--11. 2019.
- [6] Varma, Kamala, Yi Zhou, Nathalie Baracaldo, and Ali Anwar. "LEGATO: A LayerwisE Gradient AggregaTiOn Algorithm for Mitigating Byzantine Attacks in Federated Learning." In 2021 IEEE 14th International Conference on Cloud Computing (CLOUD), pp. 272--277. IEEE, 2021.
- [7] Xu, Runhua, Nathalie Baracaldo, Yi Zhou, Ali Anwar, James Joshi, and Heiko Ludwig. "FedV: Privacy-Preserving Federated Learning over Vertically Partitioned Data." In Proceedings of the 14th ACM Workshop on Artificial Intelligence and Security, pp. 181--192. 2021.
- [8] Xu, Runhua, Nathalie Baracaldo, Yi Zhou, Ali Anwar, and Heiko Ludwig. "HybridAlpha: An efficient approach for privacy-preserving federated learning." In Proceedings of the 12th ACM Workshop on Artificial Intelligence and Security, pp. 13--23. 2019.
- [9] Zawad, Syed, Ahsan Ali, Pin-Yu Chen, Ali Anwar, Yi Zhou, Nathalie Baracaldo, Yuan Tian, and Feng Yan. "Curse or redemption? How data heterogeneity affects the robustness of federated learning." arXiv preprint arXiv:2102.00655 (2021).
Index Terms
- Keynote Talk - Federated Learning: The Hype, State-of-the-Art and Open Challenges