Scalable and Fault Tolerant Platform for Distributed Learning on Private Medical Data

  • Conference paper
Machine Learning in Medical Imaging (MLMI 2017)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 10541)

Abstract

Medical image data is naturally distributed among clinical institutions. This partitioning, combined with security and privacy restrictions on medical data, limits the use of machine learning algorithms in clinical applications, especially at small and newly established institutions. We present InsuLearn: an intuitive and robust open-source platform (code available at: https://github.com/DistributedML/InsuLearn) designed to facilitate distributed learning (classification and regression) on medical image data while preserving data security and privacy. InsuLearn is built on ensemble learning, in which statistical models are developed independently at each institution and combined at secure coordinator nodes. InsuLearn protocols are designed so that the liveness of the system is guaranteed as institutions join and leave the network. Coordination is implemented as a cluster of replicated state machines, making it tolerant to individual node failures. We demonstrate that InsuLearn successfully integrates accurate models for horizontally partitioned data while preserving privacy.
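
The ensemble idea described in the abstract can be illustrated with a minimal Python sketch. This is not the InsuLearn implementation: the nearest-centroid local model, the majority-vote aggregation rule, the function names, and the toy data are all assumptions made for illustration, since the abstract does not specify them. The sketch only shows the general pattern in which each institution trains on its own data and shares a model, never the data, with a coordinator.

```python
from collections import Counter
from statistics import mean


def train_local_model(samples):
    """Train a nearest-centroid classifier on one institution's private data.

    samples: list of (feature_vector, label) pairs that never leave the site.
    Returns a predict function; only this model is shared, not the data.
    """
    by_label = {}
    for features, label in samples:
        by_label.setdefault(label, []).append(features)
    # One centroid per class: the per-dimension mean of that class's samples.
    centroids = {
        label: [mean(dim) for dim in zip(*rows)]
        for label, rows in by_label.items()
    }

    def predict(x):
        def sq_dist(c):
            return sum((a - b) ** 2 for a, b in zip(x, c))
        return min(centroids, key=lambda label: sq_dist(centroids[label]))

    return predict


def coordinator_predict(models, x):
    """Combine local models by majority vote (an assumed aggregation rule)."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]


# Made-up toy data for three hypothetical institutions (horizontally
# partitioned: same features at every site, different patients).
site_a = [([0.0, 0.1], "benign"), ([1.0, 0.9], "malignant")]
site_b = [([0.2, 0.0], "benign"), ([0.9, 1.1], "malignant")]
site_c = [([0.1, 0.2], "benign"), ([1.2, 1.0], "malignant")]

models = [train_local_model(s) for s in (site_a, site_b, site_c)]
print(coordinator_predict(models, [0.05, 0.05]))
print(coordinator_predict(models, [1.0, 1.0]))
```

Because the coordinator only ever calls the shared predict functions, an institution joining or leaving in this sketch simply adds or removes one entry from the list of models. Fault tolerance of the coordinator itself (the replicated state machines mentioned in the abstract) is outside the scope of this example.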

This work is supported in part by the Institute for Computing, Information and Cognitive Systems (ICICS) at UBC.

Notes

  1. Open-source code available at: https://github.com/DistributedML/InsuLearn.

  2. In fact, \(h_i\) knows neither the size of \(H\) nor the nodes in \(H\).

Author information

Corresponding author

Correspondence to Alborz Amir-Khalili.

1 Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 137 KB)

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Amir-Khalili, A., Kianzad, S., Abugharbieh, R., Beschastnikh, I. (2017). Scalable and Fault Tolerant Platform for Distributed Learning on Private Medical Data. In: Wang, Q., Shi, Y., Suk, H.-I., Suzuki, K. (eds) Machine Learning in Medical Imaging. MLMI 2017. Lecture Notes in Computer Science, vol 10541. Springer, Cham. https://doi.org/10.1007/978-3-319-67389-9_21

  • DOI: https://doi.org/10.1007/978-3-319-67389-9_21

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-67388-2

  • Online ISBN: 978-3-319-67389-9

  • eBook Packages: Computer Science, Computer Science (R0)
