Secure Computation of Pearson Correlation Coefficients for High-Quality Data Analytics

  • Conference paper
Database Systems for Advanced Applications (DASFAA 2018)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10829)

Abstract

In this paper, we present a secure method of computing Pearson correlation coefficients that preserves both data privacy and data quality in a distributed computing environment. In general data analytics and mining processes, individual data owners must provide their original data to third parties. In many cases, however, the original data contain sensitive information, and the data owners do not want to disclose them in their original form. We therefore address the problem of secure multiparty computation of Pearson correlation coefficients. We first propose an advanced solution that exploits the secure scalar product. We then present an approximate solution that adopts a lower-dimensional transformation. Finally, we show empirically that the proposed solutions are practical in terms of execution time and data quality.
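
To illustrate why a secure scalar product is a natural building block here, the following minimal Python sketch relies on a standard identity: the Pearson correlation coefficient of two vectors equals the scalar product of their centered, unit-length versions. It is not the protocol proposed in the paper. The function names are hypothetical, and `scalar_product` is a plain dot product standing in for whatever secure scalar product protocol the two parties would actually run.

```python
import math


def normalize(values):
    """Center a vector and scale it to unit length.

    Each data owner can do this locally, without seeing the other party's data.
    """
    n = len(values)
    mean = sum(values) / n
    centered = [v - mean for v in values]
    norm = math.sqrt(sum(c * c for c in centered))
    if norm == 0.0:
        raise ValueError("correlation is undefined for a constant vector")
    return [c / norm for c in centered]


def scalar_product(a, b):
    """Plain dot product (assumption: placeholder for a secure protocol)."""
    return sum(x * y for x, y in zip(a, b))


def pearson_via_scalar_product(x, y):
    """Pearson correlation as the scalar product of normalized vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return scalar_product(normalize(x), normalize(y))


if __name__ == "__main__":
    x = [1, 2, 3, 4, 5]
    y = [2, 1, 4, 3, 5]
    print(pearson_via_scalar_product(x, y))
```

In this sketch the only step that touches both parties' data is the single call to `scalar_product`, which is the step a secure scalar product protocol would replace.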

Acknowledgment

This work was supported by Institute for Information & communications Technology Promotion (IITP) grant funded by the Korea government (MSIT) (No. 2016-0-00179, Development of an Intelligent Sampling and Filtering Techniques for Purifying Data Streams).

Corresponding author

Correspondence to Yang-Sae Moon.

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this paper

Cite this paper

Hong, SK., Gil, MS., Moon, YS. (2018). Secure Computation of Pearson Correlation Coefficients for High-Quality Data Analytics. In: Liu, C., Zou, L., Li, J. (eds) Database Systems for Advanced Applications. DASFAA 2018. Lecture Notes in Computer Science, vol 10829. Springer, Cham. https://doi.org/10.1007/978-3-319-91455-8_8

  • DOI: https://doi.org/10.1007/978-3-319-91455-8_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-91454-1

  • Online ISBN: 978-3-319-91455-8

  • eBook Packages: Computer Science, Computer Science (R0)
