Homomorphically encrypted k-means on cloud-hosted servers with low client-side load

Computing

Abstract

The significance of data analytics is acknowledged in many scientific and business domains. However, the processing power and memory capacity it requires are prohibitive for data owners who rely on their own proprietary platforms. An obvious solution is to outsource data analytics to cloud storage and cloud computing providers, but this raises privacy and security issues, given that the data can be valuable and/or personal. The aim of this paper is to develop a server-side k-means algorithm over encrypted data using homomorphic encryption, in order to overcome both the data owner's lack of resources and the security concerns. Current solutions based on homomorphic encryption impose a heavy load on the data owner; this work addresses that limitation. More specifically, we present a framework for implementing a homomorphic version of k-means, we discuss the capabilities of the current state-of-the-art homomorphic encryption schemes, and we propose a novel approach to server-side computation of k-means under a new adversary model tailored to modern settings. We instantiate our framework in two versions that differ in how operations are assigned, each coming in three flavors of operation implementation. All alternatives are evaluated thoroughly using both real experiments and analytic cost models.
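To make the idea of computing on encrypted data concrete, the following is a minimal sketch of additive homomorphism using a toy Paillier cryptosystem. It is an illustration only and not the scheme or framework evaluated in the paper: Paillier supports additions but not the homomorphic multiplications that the notes refer to, the primes are far too small to be secure, and all function names (`keygen`, `encrypt`, `decrypt`, `encrypted_sum`) are hypothetical. It shows how a server could add up one coordinate of the points in a cluster, as needed for a centroid update, without seeing the plaintext values.

```python
import math
import random

# Toy primes, chosen for readability only. They offer no security.
P, Q = 1789, 1861


def keygen(p=P, q=Q):
    """Return (n, private_key) for a toy Paillier instance with g = n + 1."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # modular inverse, requires Python 3.8+
    return n, (lam, mu)


def encrypt(n, m):
    """Data owner: encrypt an integer m with 0 <= m < n."""
    n2 = n * n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    # With g = n + 1, g^m mod n^2 equals 1 + m*n mod n^2.
    return (1 + m * n) % n2 * pow(r, n, n2) % n2


def decrypt(n, priv, c):
    """Data owner: recover the plaintext from ciphertext c."""
    lam, mu = priv
    u = pow(c, lam, n * n)
    return (u - 1) // n * mu % n


def encrypted_sum(n, ciphertexts):
    """Server: multiplying ciphertexts adds the underlying plaintexts."""
    total = 1  # 1 is a trivial encryption of 0
    for c in ciphertexts:
        total = total * c % (n * n)
    return total
```

In this sketch the data owner would call `encrypt` on each coordinate, the server would call `encrypted_sum` on the ciphertexts assigned to a cluster, and only the key holder could call `decrypt` on the result. The sum must stay below `n`, otherwise it wraps around modulo `n`.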

Notes

  1. One could argue that this is the same as providing the trusted server with Alice’s private key. We prefer to provide the trusted server with an equivalent key in order to emphasize the fact that Alice and the trusted server are different entities.

  2. The term i is an index and not a power. In this paper, we always use parentheses, e.g. \((t)^2\), to denote a power of a number.

  3. It is straightforward to consider flavors where k-means terminates when the centroids have converged as well.

  4. As we have already discussed, there are two kinds of SM transformations. The first uses \(SM_N\) and is executed by Bob for noise reduction after a homomorphic multiplication, while the second is executed by the trusted server using SM for key-switching purposes. However, there is no difference between them from the perspective of theoretical performance analysis, so in the summary tables we refer to them together.
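The stopping rule mentioned in note 3 can be illustrated with a plaintext sketch of Lloyd's k-means that supports two termination flavors. This is not the paper's encrypted protocol: it assumes the alternative to centroid convergence is a fixed iteration budget (the preview does not state which rule the paper uses), and the names `kmeans`, `sq_dist`, `max_iters`, and `tol` are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, centroids, max_iters=100, tol=None):
    """Plaintext Lloyd's k-means.

    tol=None  -> run exactly max_iters iterations (fixed budget).
    tol=float -> also stop as soon as no centroid moves more than tol.
    Returns (centroids, iterations_executed).
    """
    centroids = [tuple(c) for c in centroids]
    iters = 0
    for iters in range(1, max_iters + 1):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            j = min(range(len(centroids)),
                    key=lambda i: sq_dist(p, centroids[i]))
            clusters[j].append(p)
        # Update step: mean of each cluster; empty clusters keep their centroid.
        new_centroids = [
            tuple(sum(col) / len(members) for col in zip(*members))
            if members else old
            for members, old in zip(clusters, centroids)
        ]
        moved = max(sq_dist(a, b) for a, b in zip(centroids, new_centroids))
        centroids = new_centroids
        # Convergence flavor: compare squared movement with tol squared.
        if tol is not None and moved <= tol ** 2:
            break
    return centroids, iters
```

With a fixed budget the number of iterations is known in advance and independent of the data, whereas the convergence test requires comparing old and new centroids in every round. In an encrypted setting that comparison would have to be carried out by a party able to evaluate it, which is a design consideration rather than something this sketch addresses.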

Acknowledgements

We would like to thank Dr. K. Draziotis for all his help and advice on cryptographic issues.

Author information

Corresponding author

Correspondence to Anastasios Gounaris.

Cite this article

Sakellariou, G., Gounaris, A. Homomorphically encrypted k-means on cloud-hosted servers with low client-side load. Computing 101, 1813–1836 (2019). https://doi.org/10.1007/s00607-019-00711-w
