Large-Scale k-Means Clustering with User-Centric Privacy Preservation

  • Conference paper
Advances in Knowledge Discovery and Data Mining (PAKDD 2008)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5012)

Abstract

We present k-means clustering under a new privacy-preserving concept, user-centric privacy preservation. In this framework, users can conduct data mining on their private information while keeping it in their own local storage. After the computation, they obtain only the mining result, without disclosing private information to others. Conventional privacy-preserving data mining has assumed that two parties join the protocol. In our framework, we assume that a large number of parties join the protocol; therefore, not only scalability but also asynchronism and fault tolerance are important. Considering this, we propose a k-means algorithm combined with a decentralized cryptographic protocol and a gossip-based protocol. The computational complexity is O(log n) with respect to the number of parties n, and experimental results show that our protocol is scalable even with one million parties.
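
To make the gossip-based aggregation idea concrete, the following is a minimal sketch of a single k-means centroid update in which every global sum and count is obtained by pairwise gossip averaging rather than by a central server. It is an illustration under stated assumptions, not the protocol of the paper: the cryptographic layer is omitted entirely (values are exchanged in the clear), each party is assumed to hold one 1-D point, and the names gossip_average and gossip_centroid_update are hypothetical.

```python
import random


def gossip_average(values, rounds, seed=0):
    """Pairwise gossip averaging (plaintext, no privacy protection).

    In each round every party picks a random peer and both replace their
    local value with the mean of the two. The total sum is conserved, so
    every local value drifts toward the global average.
    """
    rng = random.Random(seed)
    state = list(values)
    n = len(state)
    for _ in range(rounds):
        for i in range(n):
            j = rng.randrange(n)
            if i == j:
                continue  # a party cannot gossip with itself
            mean = (state[i] + state[j]) / 2.0
            state[i] = mean
            state[j] = mean
    return state


def gossip_centroid_update(points, centroids, rounds=60, seed=0):
    """One k-means update as seen by party 0 (hypothetical helper).

    Each party assigns its own point to the nearest centroid locally, then
    the per-cluster averages of "point * indicator" and "indicator" are
    estimated by gossip. Their ratio is the new centroid.
    """
    k = len(centroids)
    # Local step: every party finds the nearest centroid for its own point.
    labels = [min(range(k), key=lambda c: abs(x - centroids[c])) for x in points]
    updated = []
    for c in range(k):
        weighted = [x if lab == c else 0.0 for x, lab in zip(points, labels)]
        indicator = [1.0 if lab == c else 0.0 for lab in labels]
        avg_sum = gossip_average(weighted, rounds, seed)
        avg_cnt = gossip_average(indicator, rounds, seed)
        if avg_cnt[0] > 0.0:
            updated.append(avg_sum[0] / avg_cnt[0])
        else:
            updated.append(centroids[c])  # empty cluster: keep old centroid
    return updated


if __name__ == "__main__":
    pts = [0.0, 1.0, 2.0, 10.0, 11.0, 12.0]
    print(gossip_centroid_update(pts, [0.0, 10.0]))
```

In this sketch no party ever needs a global view of the data: each one only exchanges running averages with randomly chosen peers. In a privacy-preserving setting those exchanged values would additionally have to be protected, which is the role the abstract assigns to the decentralized cryptographic protocol.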

Editor information

Takashi Washio, Einoshin Suzuki, Kai Ming Ting, Akihiro Inokuchi

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Sakuma, J., Kobayashi, S. (2008). Large-Scale k-Means Clustering with User-Centric Privacy Preservation. In: Washio, T., Suzuki, E., Ting, K.M., Inokuchi, A. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2008. Lecture Notes in Computer Science, vol 5012. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-68125-0_29

  • DOI: https://doi.org/10.1007/978-3-540-68125-0_29

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-68124-3

  • Online ISBN: 978-3-540-68125-0

  • eBook Packages: Computer Science, Computer Science (R0)
