Abstract
The significance of data analytics has been acknowledged in many scientific and business domains. However, the required processing power and memory capacity are a prohibitive factor for performing data analytics on proprietary platforms. An obvious solution is to outsource data analytics to cloud storage and cloud computing providers, but this raises privacy and security issues, given that the data can be valuable and/or personal. The aim of this paper is to develop a server-side k-means algorithm over encrypted data using homomorphic encryption, in order to overcome both the data owner's lack of resources and the security concerns. Current solutions based on homomorphic encryption impose a heavy load on the data owner; this work addresses that limitation. More specifically, we present a framework for implementing a homomorphic version of k-means, discuss the capabilities of the current state-of-the-art homomorphic encryption schemes, and propose a novel approach to server-side computation of k-means that assumes a new adversary model tailored to modern settings. We instantiate our framework in two versions that differ in operation assignment, each coming in three flavors of operation implementation. All alternatives are evaluated thoroughly using both real experiments and analytic cost models.
Notes
One could argue that this is the same as providing the trusted server with Alice’s private key. We prefer to provide the trusted server with an equivalent key in order to emphasize the fact that Alice and the trusted server are different entities.
The term i is an index and not a power. In this paper, we always use parentheses, e.g. \((t)^2\), to denote a power of a number.
It is straightforward to consider flavors where k-means terminates when the centroids have converged as well.
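The difference between the two termination styles mentioned in this note can be illustrated with a minimal plaintext sketch. It is not the encrypted protocol of the paper: the function `kmeans` and its parameters `max_iters`, `tol` and `seed` are hypothetical names chosen for illustration, and the sketch assumes that the alternative to convergence-based termination is a fixed iteration budget.

```python
import random

def kmeans(points, k, max_iters=10, tol=None, seed=0):
    """Plaintext Lloyd-style k-means (illustrative only, no encryption).

    tol=None  -> run exactly max_iters iterations (fixed-iteration flavor).
    tol=float -> additionally stop once no centroid moves more than tol
                 (convergence flavor).
    Returns (centroids, iterations_run).
    """
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for it in range(1, max_iters + 1):
        # Assignment step: nearest centroid by squared Euclidean distance.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            clusters[nearest].append(p)
        # Update step: mean of each cluster; an empty cluster keeps its centroid.
        new_centroids = [
            [sum(col) / len(cl) for col in zip(*cl)] if cl else centroids[j]
            for j, cl in enumerate(clusters)
        ]
        # Largest Euclidean displacement of any centroid in this iteration.
        shift = max(
            sum((a - b) ** 2 for a, b in zip(old, new)) ** 0.5
            for old, new in zip(centroids, new_centroids)
        )
        centroids = new_centroids
        if tol is not None and shift <= tol:
            return centroids, it
    return centroids, max_iters
```

Calling `kmeans(points, k, max_iters=10)` always performs ten iterations, whereas `kmeans(points, k, max_iters=10, tol=1e-9)` may return earlier. In an encrypted setting the convergence test itself would require a comparison on ciphertexts or an interaction with a key holder, which this plaintext sketch does not model.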
As we have already discussed, there are two kinds of SM transformations. The first uses \(SM_N\) and is executed by Bob for noise reduction after a homomorphic multiplication, while the second is executed by the trusted server using the SM for key-switching purposes. However, there is no difference between them from the perspective of theoretical performance analysis, so in the summary tables we refer to them together.
Acknowledgements
We would like to thank Dr. K. Draziotis for all his help and advice on cryptographic issues.
Cite this article
Sakellariou, G., Gounaris, A. Homomorphically encrypted k-means on cloud-hosted servers with low client-side load. Computing 101, 1813–1836 (2019). https://doi.org/10.1007/s00607-019-00711-w
DOI: https://doi.org/10.1007/s00607-019-00711-w