Federated Learning of Oligonucleotide Drug Molecule Thermodynamics with Differentially Private ADMM-Based SVM

  • Conference paper

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1525)

Abstract

A crucial step in assuring drug safety is predicting off-target binding. For oligonucleotide drugs, this requires learning the relevant thermodynamics from data that is often large-scale and distributed across different organisations. The learning process respects data privacy if it is distributed and private, with limited and private communication between local nodes. We propose an ADMM-based SVM with differential privacy for this purpose. We show empirically that this approach achieves accuracy comparable to the non-private one, i.e. \({\sim }86\%\), while yielding tight empirical privacy guarantees even after convergence.
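To make the idea of a distributed, noise-perturbed SVM concrete, the following is a minimal sketch of consensus ADMM for a linear SVM in which each node adds Gaussian noise to the message it shares. It is not the authors' algorithm: the inexact subgradient local step, the star-shaped aggregation, the toy data, and all names and parameters (`local_update`, `private_admm_svm`, `rho`, `lam`, `sigma`) are assumptions made for illustration. The noise scale `sigma` is a free parameter here and is not calibrated to any privacy budget.

```python
import numpy as np

def local_update(X, y, z, u, rho, steps=25, lr=0.1):
    """Inexact local step (assumption): a few subgradient steps on
    mean hinge loss + (rho/2) * ||w - z + u||^2."""
    w = z.copy()
    for _ in range(steps):
        active = y * (X @ w) < 1.0  # points inside or violating the margin
        grad_hinge = -(y[active][:, None] * X[active]).sum(axis=0) / len(y)
        w = w - lr * (grad_hinge + rho * (w - z + u))
    return w

def private_admm_svm(parts, rho=1.0, lam=0.1, sigma=0.05, iters=30, seed=0):
    """Consensus ADMM over M nodes; each node shares w_i + u_i plus noise."""
    rng = np.random.default_rng(seed)
    M, d = len(parts), parts[0][0].shape[1]
    z = np.zeros(d)        # consensus weight vector
    U = np.zeros((M, d))   # scaled dual variables, one row per node
    for _ in range(iters):
        W = np.zeros((M, d))
        msgs = np.zeros((M, d))
        for i, (X, y) in enumerate(parts):
            W[i] = local_update(X, y, z, U[i], rho)
            # Only this noisy message leaves the node (uncalibrated noise).
            msgs[i] = W[i] + U[i] + rng.normal(0.0, sigma, size=d)
        # Closed-form z-update for the regulariser (lam/2) * ||z||^2.
        z = (M * rho / (lam + M * rho)) * msgs.mean(axis=0)
        U += W - z         # dual update uses each node's clean local w_i
    return z

# Toy, well-separated data (hypothetical), split across four nodes.
data_rng = np.random.default_rng(1)
n = 200
y = np.where(data_rng.random(n) < 0.5, 1.0, -1.0)
X = y[:, None] * 2.0 + data_rng.normal(0.0, 0.5, size=(n, 2))
parts = [(X[i::4], y[i::4]) for i in range(4)]

w = private_admm_svm(parts)
acc = float(np.mean(np.sign(X @ w) == y))
print("consensus weights:", w, "training accuracy:", acc)
```

Raw data never leaves a node in this sketch; only the perturbed vectors are exchanged. A larger `sigma` hides more about each local model but makes the consensus noisier.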

Notes

  1. Here, M is the number of nodes across which the data is distributed and \(M \le N\).

  2. E.g. the 3-gram “GCG” has a larger weight than “ATA” due to its higher binding affinity.

  3. An increase in the privacy level \(\epsilon \) indicates a decrease in differential privacy, i.e. a weaker privacy guarantee.
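
The last two notes can be illustrated with a short sketch. It counts the 3-grams of a sequence, combines them with per-3-gram weights, and shows how the noise scale of the standard Laplace mechanism (sensitivity divided by \(\epsilon\)) shrinks as \(\epsilon\) grows. The weight values, the example sequence, and the function names are hypothetical and are not taken from the paper.

```python
from collections import Counter

def kmer_counts(seq, k=3):
    """Count all overlapping k-grams in a nucleotide sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def weighted_score(seq, weights, k=3, default=1.0):
    """Sum of k-gram counts times their weights (weights are hypothetical)."""
    counts = kmer_counts(seq, k)
    return sum(c * weights.get(kmer, default) for kmer, c in counts.items())

def laplace_scale(sensitivity, epsilon):
    """Noise scale of the Laplace mechanism: sensitivity / epsilon."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return sensitivity / epsilon

# Hypothetical weights: "GCG" counts more than "ATA".
weights = {"GCG": 2.0, "ATA": 0.5}
counts = kmer_counts("GCGCGATA")
score = weighted_score("GCGCGATA", weights)

# Larger epsilon -> smaller noise scale -> weaker privacy.
strong = laplace_scale(1.0, 0.5)
weak = laplace_scale(1.0, 2.0)
print(dict(counts), score, strong, weak)
```

For the sequence above, “GCG” occurs twice and every other 3-gram once, so the weighted score is dominated by the “GCG” term.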

Acknowledgments

SSF Strategic Mobility Grant “Drug Discovery for Antisense Oligos” (A.S.), Swedish National Supercomputer Centre (A.S. & S.T.).

Author information

Corresponding author

Correspondence to Shirin Tavara.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Tavara, S., Schliep, A., Basu, D. (2021). Federated Learning of Oligonucleotide Drug Molecule Thermodynamics with Differentially Private ADMM-Based SVM. In: Kamp, M., et al. Machine Learning and Principles and Practice of Knowledge Discovery in Databases. ECML PKDD 2021. Communications in Computer and Information Science, vol 1525. Springer, Cham. https://doi.org/10.1007/978-3-030-93733-1_34

  • DOI: https://doi.org/10.1007/978-3-030-93733-1_34

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-93732-4

  • Online ISBN: 978-3-030-93733-1

  • eBook Packages: Computer Science, Computer Science (R0)
