ABSTRACT
Data privacy and security are important societal topics that rightfully garner much attention. To make people the owners of their personal data, the Bubl platform will provide its users with a secure personal data vault. The platform focuses on medical and healthcare data because of their sensitive nature. Insights gathered from users' data could be valuable, but the high privacy and security standards required by Bubl rule out standard Machine Learning (ML) techniques, which require all training data to be centralized. This problem can be addressed with techniques from the research field of Privacy-Preserving Machine Learning (PPML). We therefore develop a system to facilitate PPML within the Bubl platform, using a technique called Federated Learning (FL). Our FL implementation focuses on minimizing Random-Access Memory (RAM) usage to respect the small computational budgets of the data vaults. Two challenges arise: the data are not Independent and Identically Distributed (non-IID), and each patient vault contains very few data samples. The latter is the main focus of this research, as it is underdeveloped in the FL literature. Final results are still being acquired and are expected in the coming months. For now, we discuss only preliminary results on how the number of clients and the distribution of non-IID data affect ML performance.
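To make the idea of training without centralizing data concrete, the following is a minimal sketch of federated averaging on a toy one-parameter model. It is not the Bubl implementation: the names `local_update`, `federated_round`, and `vaults`, the learning rate, and the toy data are all assumptions chosen for illustration. Each vault trains on its own few samples, and only the resulting model weight is sent to the server.

```python
# Hypothetical sketch of federated averaging; not the Bubl implementation.
# Model: y = w * x with a single weight w, trained with squared error.

def local_update(w, samples, lr=0.01, epochs=5):
    """Run a few gradient-descent steps on one vault's own samples."""
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in samples) / len(samples)
        w -= lr * grad
    return w


def federated_round(w_global, vaults):
    """One round: each vault trains locally, the server averages the weights.

    The average is weighted by the number of samples in each vault.
    Only weights leave a vault, never the raw (x, y) samples.
    """
    total = sum(len(samples) for samples in vaults)
    updates = [(local_update(w_global, samples), len(samples)) for samples in vaults]
    return sum(w * n for w, n in updates) / total


# Toy vaults (assumed data) with very few samples each, all drawn from y = 3x.
# The x ranges differ per vault to mimic non-IID data.
vaults = [
    [(1.0, 3.0), (2.0, 6.0)],
    [(0.5, 1.5)],
    [(3.0, 9.0), (4.0, 12.0), (5.0, 15.0)],
]

w = 0.0
for _ in range(50):
    w = federated_round(w, vaults)
print(round(w, 3))
```

In this toy setting the vaults agree on the underlying relation, so the averaged weight approaches 3 despite each vault seeing only one to three samples. With real non-IID data the local optima can differ between vaults, which is one reason the challenges named in the abstract matter.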
Index Terms
- Federated Learning in the Bubl Platform to Enhance the Privacy of Personal Patient Data