Two-Party Privacy-Preserving Agglomerative Document Clustering

Su, Chunhua; Zhou, Jianying; Bao, Feng; Takagi, Tsuyoshi; Sakurai, Kouichi

doi:10.1007/978-3-540-72163-5_16


Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 4464)

Abstract

Document clustering is a powerful data mining technique for analyzing and structuring large collections of text or hypertext documents. Many organizations and companies want to share documents on similar themes for joint benefit. Without attention to privacy, however, such sharing risks leaking sensitive information. In this paper, we propose a cryptography-based framework for privacy-preserving document clustering in a distributed environment: two parties, each holding private documents, want to collaboratively execute agglomerative document clustering without disclosing the contents of those documents.
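
For readers unfamiliar with the underlying technique, the following is a minimal plaintext sketch of agglomerative document clustering. It is not the protocol proposed in the paper, which is not visible in this preview: here all documents are pooled in the clear, which is exactly what the two-party setting is meant to avoid. The function names (`tf_vector`, `cosine`, `agglomerative`), the whitespace tokenizer, and the choice of cosine similarity over summed term-frequency vectors as the merge criterion are illustrative assumptions.

```python
import math
from collections import Counter


def tf_vector(text):
    """Bag-of-words term-frequency vector (assumed whitespace tokenizer)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def agglomerative(docs, k):
    """Repeatedly merge the two most similar clusters until k remain.

    Each cluster is a list of document indices. A cluster is represented
    by the sum of its members' term vectors (an assumed linkage choice).
    """
    clusters = [[i] for i in range(len(docs))]
    vectors = [tf_vector(d) for d in docs]
    while len(clusters) > k:
        best = None  # (similarity, i, j) of the best pair seen so far
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                s = cosine(vectors[i], vectors[j])
                if best is None or s > best[0]:
                    best = (s, i, j)
        _, i, j = best
        # Merge cluster j into cluster i, then drop j.
        clusters[i] = clusters[i] + clusters[j]
        vectors[i] = vectors[i] + vectors[j]
        del clusters[j]
        del vectors[j]
    return clusters


if __name__ == "__main__":
    # Hypothetical toy corpus; in the two-party setting the first two
    # documents might belong to one party and the last two to the other.
    docs = [
        "secure key encryption",
        "encryption key exchange",
        "cluster text documents",
        "text documents cluster analysis",
    ]
    print(agglomerative(docs, 2))
```

A privacy-preserving variant would have to compute the pairwise similarities and the merge decisions without either party seeing the other's term vectors; the sketch only shows the computation that such a protocol would need to emulate.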

Author information

Authors and Affiliations

Department of Computer Science and Communication Engineering, Kyushu University, Japan
Chunhua Su & Kouichi Sakurai
Systems and Security Department (SSD), Institute for Infocomm Research, Singapore
Jianying Zhou & Feng Bao
School of Systems Information Science, Future University-Hakodate, Japan
Tsuyoshi Takagi

Editor information

Ed Dawson, Duncan S. Wong

About this paper

Cite this paper

Su, C., Zhou, J., Bao, F., Takagi, T., Sakurai, K. (2007). Two-Party Privacy-Preserving Agglomerative Document Clustering. In: Dawson, E., Wong, D.S. (eds) Information Security Practice and Experience. ISPEC 2007. Lecture Notes in Computer Science, vol 4464. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-72163-5_16

DOI: https://doi.org/10.1007/978-3-540-72163-5_16
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-72159-8
Online ISBN: 978-3-540-72163-5
eBook Packages: Computer Science, Computer Science (R0)
