Abstract
Privacy-preserving k-means clustering assumes that at least two parties take part in a secure interactive computation. However, existing schemes do not consider data standardization, which is an important step before clustering data held in different databases. In this paper, we point out that, without data standardization, problems arise in many data mining applications. We also provide a solution for secure data standardization in privacy-preserving k-means clustering.
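As a rough illustration of what standardization requires when the data is split between parties, the sketch below computes a pooled z-score from per-party aggregates (count, sum, sum of squares). It is not the protocol proposed in the paper: it uses no cryptography, and the function names and the income values are hypothetical. It only shows which global quantities (mean and standard deviation) the parties would need to obtain jointly.

```python
import math


def local_aggregates(values):
    """Per-party summary of one attribute: count, sum, sum of squares."""
    return len(values), sum(values), sum(v * v for v in values)


def pooled_mean_std(aggregates):
    """Combine per-party aggregates into the global mean and population std."""
    n = sum(a[0] for a in aggregates)
    total = sum(a[1] for a in aggregates)
    total_sq = sum(a[2] for a in aggregates)
    mean = total / n
    # Clamp at zero to guard against tiny negative values from rounding.
    variance = max(total_sq / n - mean * mean, 0.0)
    return mean, math.sqrt(variance)


def standardize(values, mean, std):
    """Z-score each value using the pooled mean and standard deviation."""
    return [(v - mean) / std for v in values]


# Hypothetical attribute held by two parties (assumed data, not from the paper).
party_a_income = [30000.0, 32000.0]
party_b_income = [58000.0, 60000.0]

mean, std = pooled_mean_std(
    [local_aggregates(party_a_income), local_aggregates(party_b_income)]
)

# Each party can standardize its own records once mean and std are known.
z_a = standardize(party_a_income, mean, std)
z_b = standardize(party_b_income, mean, std)
```

In this plain version the aggregates are exchanged in the clear, which leaks information about each party's data. A privacy-preserving variant would have to compute the same mean and standard deviation without revealing the individual aggregates.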
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Su, C., Zhan, J., Sakurai, K. (2009). Importance of Data Standardization in Privacy-Preserving K-Means Clustering. In: Chen, L., Liu, C., Liu, Q., Deng, K. (eds) Database Systems for Advanced Applications. DASFAA 2009. Lecture Notes in Computer Science, vol 5667. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04205-8_23
DOI: https://doi.org/10.1007/978-3-642-04205-8_23
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04204-1
Online ISBN: 978-3-642-04205-8
eBook Packages: Computer Science, Computer Science (R0)