Secure analysis of distributed chemical databases without data integration

Journal of Computer-Aided Molecular Design

Summary

We present a method for performing statistically valid linear regressions on the union of distributed chemical databases while preserving the confidentiality of each database. The method employs secure multi-party computation to share the local sufficient statistics needed to compute least squares estimators of regression coefficients, error variances and other quantities of interest. We illustrate the method with an example involving four companies' rather different databases.
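The idea in the summary can be illustrated with a minimal sketch. It assumes that the shared sufficient statistics are each party's X'X, X'y, y'y and row count, and that they are pooled with a ring-based secure summation in which the first party adds a random mask that it later removes. The summary does not spell out the protocol, so this choice, the fixed-point encoding, and all names below (`local_stats`, `secure_sum`, `pooled_regression`, `SCALE`, `MODULUS`) are hypothetical. The parties are simulated in a single process and are assumed to follow the protocol honestly.

```python
import random

SCALE = 10**6      # assumed fixed-point precision for encoding real numbers
MODULUS = 2**100   # assumed agreed modulus, far larger than any true sum


def local_stats(X, y):
    """Flatten one party's X'X, X'y, y'y and n into a single list."""
    p = len(X[0])
    xtx = [sum(row[i] * row[j] for row in X) for i in range(p) for j in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    yty = sum(yi * yi for yi in y)
    return xtx + xty + [yty, float(len(y))]


def secure_sum(vectors):
    """Simulate secure summation: only masked running totals are passed on."""
    length = len(vectors[0])
    mask = [random.randrange(MODULUS) for _ in range(length)]  # party 1's secret
    running = list(mask)
    for vec in vectors:  # each party adds its encoded values to the running total
        running = [(r + round(v * SCALE)) % MODULUS for r, v in zip(running, vec)]
    total = [(r - m) % MODULUS for r, m in zip(running, mask)]  # party 1 unmasks
    # Map residues back to signed fixed-point numbers.
    return [(t - MODULUS if t > MODULUS // 2 else t) / SCALE for t in total]


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def pooled_regression(parties):
    """Least squares on the union of all parties' data, using only pooled sums."""
    p = len(parties[0][0][0])
    s = secure_sum([local_stats(X, y) for X, y in parties])
    xtx = [s[i * p:(i + 1) * p] for i in range(p)]
    xty = s[p * p:p * p + p]
    yty, n = s[-2], s[-1]
    beta = solve(xtx, xty)
    rss = yty - sum(b * v for b, v in zip(beta, xty))  # y'y - beta'X'y
    return beta, rss / (n - p)                         # coefficients, error variance


# Toy data (made up): three parties whose rows all satisfy y = 1 + 2x.
parties = [
    ([[1, 0], [1, 1]], [1, 3]),
    ([[1, 2], [1, 3]], [5, 7]),
    ([[1, 4]], [9]),
]
beta, sigma2 = pooled_regression(parties)
print(beta, sigma2)
```

Because the least squares estimators depend on the data only through these sums, the result matches a regression on the combined rows, yet no party's rows or local statistics are passed on unmasked. A real deployment would also need a secure channel between parties and a cryptographically strong random source for the mask.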

References

  1. Goldwasser, S., Multi-Party Computations: Past and Present. In Proceedings of the 16th Annual ACM Symposium on Principles of Distributed Computing, ACM Press, New York, 1997, pp. 1–6

  2. Yao, A.C., Protocols for secure computations. In Proceedings of the 23rd Annual IEEE Symposium on Foundations of Computer Science, ACM Press, New York, 1982, pp. 160–164

  3. Karr, A.F., Lin, X., Reiter, J.P. and Sanil, A.P., J. Comput. Graph. Stat., (2004b). To appear. Available on-line at www.niss.org/dgii/technicalreports.html

  4. Karr, A.F., Lin, X., Reiter, J.P. and Sanil, A.P., Secure analysis of distributed databases. ASA/SIAM Series on Statistics and Applied Probability. SIAM, Philadelphia, 2005a. To appear. Available on-line at www.niss.org/dgii/technicalreports.html

  5. Reiter J.P., (2003). Stat. Comput. 13:371

  6. Huuskonen J., (2000). J. Chem. Inf. Comput. Sci. 40:773

  7. Wang R., Gao Y., Lai L., (2000). Perspect Drug Discov Design 19:47

  8. Liu K., Feng J., Young S.S., (2005). J. Chem. Inf. Model. 45(2):515

  9. SAS Institute, Inc. JMP, the Statistical Discovery Software, 2005. Information available on-line at www.jmp.com

  10. Willenborg L.C.R.J., de Waal T., (2001), Elements of Statistical Disclosure Control. Springer-Verlag, New York

  11. Powell M.J.D., (1964). Comput. J. 7:152

Acknowledgements

This research was supported by NSF Grant EIA-0131884 to the National Institute of Statistical Sciences (NISS) and by the HighQ Foundation. Any opinions, findings, and conclusions or recommendations expressed in this publication are those of the authors and do not necessarily reflect the views of the National Science Foundation. The data and structures used in this paper are available at www.niss.org/PowerMV.

Author information

Corresponding author

Correspondence to Alan F. Karr.

About this article

Cite this article

Karr, A.F., Feng, J., Lin, X. et al. Secure analysis of distributed chemical databases without data integration. J Comput Aided Mol Des 19, 739–747 (2005). https://doi.org/10.1007/s10822-005-9011-5

  • DOI: https://doi.org/10.1007/s10822-005-9011-5
