
Privacy preserving linear regression modeling of distributed databases

  • Original Paper
Optimization Letters

Abstract

Statistical analysis is an important tool in the data mining field. Little work has investigated how statistical analysis can be performed when datasets are distributed among a number of data owners. For confidentiality or other proprietary reasons, data owners are reluctant to share their data with others, yet they wish to perform statistical analysis cooperatively. We address the tradeoff between privacy and global statistical analysis such as linear regression, and present a privacy-preserving linear regression model based on a fully homomorphic encryption scheme.
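To make the setting concrete, the following is a minimal sketch of why linear regression lends itself to cooperative computation. It is not the protocol from the paper, whose details are not given in the abstract. It assumes the records are split by rows among the owners (horizontal partitioning) and that there is a single predictor. Under those assumptions the least-squares fit depends only on a few sums, so each owner can contribute local sums instead of raw records. In a privacy-preserving variant those sums would be added under encryption; here they are added in plaintext purely for illustration. The names `LocalStats`, `local_stats`, `combine`, and `fit` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class LocalStats:
    """Sums one data owner would contribute (assumed layout, for illustration)."""
    n: int
    sx: float
    sy: float
    sxx: float
    sxy: float


def local_stats(xs, ys):
    """Compute the sums over one owner's private records."""
    return LocalStats(
        n=len(xs),
        sx=sum(xs),
        sy=sum(ys),
        sxx=sum(x * x for x in xs),
        sxy=sum(x * y for x, y in zip(xs, ys)),
    )


def combine(stats):
    """Add the owners' sums. A private protocol would do this step on
    ciphertexts; this sketch performs no encryption at all."""
    total = LocalStats(0, 0.0, 0.0, 0.0, 0.0)
    for s in stats:
        total = LocalStats(
            total.n + s.n,
            total.sx + s.sx,
            total.sy + s.sy,
            total.sxx + s.sxx,
            total.sxy + s.sxy,
        )
    return total


def fit(total):
    """Ordinary least squares for y = slope * x + intercept from pooled sums."""
    denom = total.n * total.sxx - total.sx ** 2
    if denom == 0:
        raise ValueError("x values are constant; slope is undefined")
    slope = (total.n * total.sxy - total.sx * total.sy) / denom
    intercept = (total.sy - slope * total.sx) / total.n
    return slope, intercept


# Two owners, each holding part of a dataset that follows y = 2x + 1.
owners = [
    ([1.0, 2.0], [3.0, 5.0]),
    ([3.0, 4.0], [7.0, 9.0]),
]
slope, intercept = fit(combine(local_stats(xs, ys) for xs, ys in owners))
```

The fit obtained from the pooled sums is identical to the one a single party would compute on the concatenated data, which is why protecting the summation step is enough in this simplified setting.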



Author information

Corresponding author

Correspondence to Weiwei Fang.

About this article

Cite this article

Fang, W., Zhou, C. & Yang, B. Privacy preserving linear regression modeling of distributed databases. Optim Lett 7, 807–818 (2013). https://doi.org/10.1007/s11590-012-0482-8

  • DOI: https://doi.org/10.1007/s11590-012-0482-8
