ABSTRACT
Outsourcing data analytics to a service provider with powerful platforms and advanced analytics skills is attractive for an organization. However, the organization (the data owner) may have concerns about the privacy of its data. In this paper, we present a method in which the data owner encrypts its data with a homomorphic encryption scheme and the service provider performs k-means clustering directly over the encrypted data. Because the ciphertexts produced by homomorphic encryption do not preserve the order of distances between data objects and cluster centers, we propose an approach that enables the service provider to compare encrypted distances using trapdoor information provided by the data owner. Extensive experimental evaluation validates the efficiency of our method.
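To illustrate why encrypted distances cannot simply be compared, the following minimal sketch uses textbook Paillier encryption with toy primes. This is an assumption made for illustration only: the abstract does not name the homomorphic scheme used in the paper, and the function names (`keygen`, `encrypt`, `decrypt`, `add`) and parameters are hypothetical. The sketch shows that ciphertexts can be added without decryption, which is the kind of operation needed to accumulate sums for cluster centers, while the same plaintext maps to many different ciphertexts.

```python
import math
import random


def keygen(p=1009, q=1013):
    """Toy Paillier key generation (tiny primes, NOT secure; illustration only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # modular inverse, valid because g = n + 1 (Python 3.8+)
    return n, (lam, mu)


def encrypt(n, m, r=None):
    """Encrypt m under public key n; r is the random blinding factor."""
    n2 = n * n
    if r is None:
        r = random.randrange(1, n)
        while math.gcd(r, n) != 1:
            r = random.randrange(1, n)
    return (pow(n + 1, m % n, n2) * pow(r, n, n2)) % n2


def decrypt(n, priv, c):
    """Recover the plaintext from ciphertext c with the private key."""
    lam, mu = priv
    x = pow(c, lam, n * n)
    return ((x - 1) // n * mu) % n


def add(n, c1, c2):
    """Homomorphic addition: the product of ciphertexts encrypts the sum."""
    return (c1 * c2) % (n * n)


if __name__ == "__main__":
    n, priv = keygen()
    c_a = encrypt(n, 25)  # e.g. a squared distance to one center
    c_b = encrypt(n, 17)  # e.g. a squared distance to another center
    # The sum is computed on ciphertexts only, then decrypted by the key holder.
    print(decrypt(n, priv, add(n, c_a, c_b)))
    # Two encryptions of the same value differ, so ciphertext order is meaningless.
    print(encrypt(n, 25, r=2) != encrypt(n, 25, r=3))
```

Because encryption is randomized, the numeric order of two ciphertexts says nothing about the order of the underlying distances. In this sketch only the key holder could decide which distance is smaller, which is the gap that the trapdoor-based comparison described in the abstract is meant to close.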
Index Terms
- Privacy of outsourced k-means clustering