Private Cross-Silo Federated Learning for Extracting Vaccine Adverse Event Mentions

  • Conference paper
Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD 2021)

Abstract

Federated Learning (FL) is quickly becoming a go-to distributed training paradigm for users to jointly train a global model without physically sharing their data. Users can indirectly contribute to, and directly benefit from, a much larger aggregate data corpus used to train the global model. However, literature on successful application of FL in real-world problem settings is somewhat sparse. In this paper, we describe our experience applying an FL-based solution to the Named Entity Recognition (NER) task for an adverse event detection application in the context of mass-scale vaccination programs. We present a comprehensive empirical analysis of various dimensions of benefits gained with FL-based training. Furthermore, we investigate the effects of tighter Differential Privacy (DP) constraints in highly sensitive settings where federation users must enforce DP to ensure strict privacy guarantees. We show that DP can severely cripple the global model’s prediction accuracy, thus disincentivizing users from participating in the federation. In response, we demonstrate how recent innovations in personalization methods can help significantly recover the lost accuracy.

Author information

Corresponding author

Correspondence to Pallika Kanani.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Kanani, P., Marathe, V.J., Peterson, D., Harpaz, R., Bright, S. (2021). Private Cross-Silo Federated Learning for Extracting Vaccine Adverse Event Mentions. In: Kamp, M., et al. Machine Learning and Principles and Practice of Knowledge Discovery in Databases. ECML PKDD 2021. Communications in Computer and Information Science, vol 1525. Springer, Cham. https://doi.org/10.1007/978-3-030-93733-1_37

  • DOI: https://doi.org/10.1007/978-3-030-93733-1_37

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-93732-4

  • Online ISBN: 978-3-030-93733-1

  • eBook Packages: Computer Science (R0)