Abstract
We present a k-means clustering algorithm built on a new privacy-preserving concept, user-centric privacy preservation. In this framework, users can take part in data mining with their private information while keeping it in their own local storage. After the computation, they obtain only the mining result, without disclosing private information to others. Conventional privacy-preserving data mining has typically assumed that two parties join the protocol. Our framework instead assumes that a large number of parties join, so not only scalability but also asynchronism and fault tolerance are important. With this in mind, we propose a k-means algorithm that combines a decentralized cryptographic protocol with a gossip-based protocol. The computational complexity is O(log n) with respect to the number of parties n, and experimental results show that the protocol scales even to one million parties.
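The abstract does not detail the gossip component. As a rough illustration of how gossip-based aggregation lets many parties agree on an average in a number of rounds that grows logarithmically with n, the sketch below implements plain push-sum averaging in Python. It is not the authors' protocol: it omits all cryptography and all k-means logic, it runs in synchronous rounds for simplicity, and the function name `push_sum_average` and its parameters are hypothetical.

```python
import random


def push_sum_average(values, rounds=50, seed=0):
    """Estimate the mean of `values` at every party via push-sum gossip.

    Illustrative only: no encryption, synchronous rounds, and a
    simulated network (a list index stands in for each party).
    """
    rng = random.Random(seed)
    n = len(values)
    sums = [float(v) for v in values]  # running sum held by each party
    weights = [1.0] * n                # running weight held by each party

    for _ in range(rounds):
        inbox_s = [0.0] * n
        inbox_w = [0.0] * n
        for i in range(n):
            # Each party keeps half of its (sum, weight) pair and
            # sends the other half to a uniformly random party.
            half_s, half_w = sums[i] / 2.0, weights[i] / 2.0
            j = rng.randrange(n)
            inbox_s[i] += half_s
            inbox_w[i] += half_w
            inbox_s[j] += half_s
            inbox_w[j] += half_w
        sums, weights = inbox_s, inbox_w

    # Each party's local estimate is the ratio of its sum to its weight.
    return [s / w for s, w in zip(sums, weights)]


if __name__ == "__main__":
    estimates = push_sum_average(list(range(100)), rounds=100, seed=1)
    print(min(estimates), max(estimates))
```

In a clustering setting, an average of this kind could stand in for a centroid update computed across parties, but how the paper combines it with encryption is not stated in the abstract.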
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Sakuma, J., Kobayashi, S. (2008). Large-Scale k-Means Clustering with User-Centric Privacy Preservation. In: Washio, T., Suzuki, E., Ting, K.M., Inokuchi, A. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2008. Lecture Notes in Computer Science, vol 5012. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-68125-0_29
DOI: https://doi.org/10.1007/978-3-540-68125-0_29
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-68124-3
Online ISBN: 978-3-540-68125-0
eBook Packages: Computer Science, Computer Science (R0)