Abstract
Federated Learning (FL) is quickly becoming a go-to distributed training paradigm in which users jointly train a global model without physically sharing their data. Users indirectly contribute to, and directly benefit from, a much larger aggregate data corpus used to train the global model. However, the literature on successful applications of FL in real-world problem settings is somewhat sparse. In this paper, we describe our experience applying an FL-based solution to the Named Entity Recognition (NER) task for an adverse event detection application in the context of mass-scale vaccination programs. We present a comprehensive empirical analysis of various dimensions of the benefits gained with FL-based training. Furthermore, we investigate the effects of tighter Differential Privacy (DP) constraints in highly sensitive settings where federation users must enforce DP to ensure strict privacy guarantees. We show that DP can severely cripple the global model's prediction accuracy, thus disincentivizing users from participating in the federation. In response, we demonstrate how recent innovations in personalization methods can help recover a significant part of the lost accuracy.
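To make the trade-off between federation and DP concrete, the following minimal sketch shows one round of federated averaging in which each user clips its model update and adds Gaussian noise before sharing it. This is an illustration under stated assumptions, not the training procedure used in the paper: the function names (`clip_and_noise`, `federated_round`), the parameters (`clip_norm`, `noise_multiplier`), and the client updates are all hypothetical, and no formal privacy accounting is performed.

```python
import math
import random


def clip_and_noise(update, clip_norm, noise_multiplier, rng):
    """Clip an update to L2 norm `clip_norm`, then add Gaussian noise.

    Hypothetical helper: the noise standard deviation is
    noise_multiplier * clip_norm, a common convention, assumed here.
    """
    norm = math.sqrt(sum(x * x for x in update))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    sigma = noise_multiplier * clip_norm
    return [x * scale + rng.gauss(0.0, sigma) for x in update]


def federated_round(global_weights, client_updates, clip_norm, noise_multiplier, rng):
    """Average the privatized client updates and apply them to the global model."""
    private = [
        clip_and_noise(u, clip_norm, noise_multiplier, rng) for u in client_updates
    ]
    n = len(private)
    mean_update = [sum(column) / n for column in zip(*private)]
    return [w + d for w, d in zip(global_weights, mean_update)]


rng = random.Random(0)
global_weights = [0.0, 0.0]
# Made-up updates from three users; the third one exceeds the clipping norm.
client_updates = [[0.3, -0.1], [0.5, 0.2], [4.0, 3.0]]

# Same round without noise and with noise, to compare the resulting models.
no_noise = federated_round(global_weights, client_updates, 1.0, 0.0, rng)
with_noise = federated_round(global_weights, client_updates, 1.0, 1.0, rng)
print("without noise:", no_noise)
print("with noise:   ", with_noise)
```

With only a handful of users, the noise added per update is comparable in size to the averaged signal, which gives an intuition for why strict DP can degrade the global model's accuracy.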
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Kanani, P., Marathe, V.J., Peterson, D., Harpaz, R., Bright, S. (2021). Private Cross-Silo Federated Learning for Extracting Vaccine Adverse Event Mentions. In: Kamp, M., et al. Machine Learning and Principles and Practice of Knowledge Discovery in Databases. ECML PKDD 2021. Communications in Computer and Information Science, vol 1525. Springer, Cham. https://doi.org/10.1007/978-3-030-93733-1_37
DOI: https://doi.org/10.1007/978-3-030-93733-1_37
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-93732-4
Online ISBN: 978-3-030-93733-1
eBook Packages: Computer Science, Computer Science (R0)