Abstract
Information shared by governments, companies, and individuals has created substantial opportunities for knowledge-based decision making. However, the main challenge in collaborative data analysis is preserving the privacy of sensitive data. In this study, we propose a general framework that can serve as a secure tool for constructing any agglomerative hierarchical clustering algorithm over partitioned data. We assume that data is distributed between two (or more) parties, either horizontally or vertically, such that for mutual benefit the participating parties are interested in obtaining the cluster structure over the whole data but, for privacy reasons, are not willing to share their original datasets. To this end, we propose general algorithms based on secure scalar product and secure Hamming distance to securely compute the criteria needed to form the clusters. Our proposed approach covers the private construction of all possible agglomerative hierarchical clustering algorithms on distributed datasets, including both numerical and categorical data.
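To illustrate why a scalar product and a Hamming distance are enough to obtain inter-record distances over partitioned data, the following is a minimal plaintext sketch, not the protocol of the paper. All function names are hypothetical, and `scalar_product` is an insecure stand-in for whatever secure scalar product protocol the parties would actually run. It relies only on the identity ||x - y||^2 = ||x||^2 + ||y||^2 - 2(x · y) for horizontally partitioned numerical data and on attribute-wise additivity for vertically partitioned data.

```python
# Plaintext illustration only: no cryptography is performed here.
# Every name below is a hypothetical helper, not an API from the paper.

def scalar_product(x, y):
    """Insecure stand-in for a secure scalar product protocol."""
    return sum(a * b for a, b in zip(x, y))


def hamming_distance(x, y):
    """Number of positions where two categorical records differ
    (insecure stand-in for a secure Hamming distance protocol)."""
    return sum(1 for a, b in zip(x, y) if a != b)


def squared_distance_horizontal(x_alice, y_bob):
    """Horizontal partitioning: each party holds whole records.

    The squared norms are computed locally by each owner; only the
    cross term requires a joint (in practice, secure) computation.
    """
    norm_x = scalar_product(x_alice, x_alice)  # local to the first party
    norm_y = scalar_product(y_bob, y_bob)      # local to the second party
    cross = scalar_product(x_alice, y_bob)     # would be the secure step
    return norm_x + norm_y - 2 * cross


def squared_distance_vertical(xa, ya, xb, yb):
    """Vertical partitioning: each party holds some attributes of every record.

    Each party computes a partial squared distance over its own attributes;
    the total is the sum of the partial values (which, in a private setting,
    would be combined without revealing the individual summands).
    """
    part_a = sum((p - q) ** 2 for p, q in zip(xa, ya))  # first party's attributes
    part_b = sum((p - q) ** 2 for p, q in zip(xb, yb))  # second party's attributes
    return part_a + part_b


if __name__ == "__main__":
    print(squared_distance_horizontal([1, 2, 3], [4, 6, 3]))
    print(squared_distance_vertical([1, 2], [4, 6], [3], [3]))
    print(hamming_distance(["a", "b", "c"], ["a", "x", "c"]))
```

Once such pairwise distances are available, an agglomerative scheme repeatedly merges the closest pair of clusters according to its linkage criterion; how the comparisons and updates are kept private is the subject of the paper itself and is not reproduced in this sketch.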
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Sheikhalishahi, M., Hamidi, M., Martinelli, F. (2019). Privacy Preserving Collaborative Agglomerative Hierarchical Clustering Construction. In: Mori, P., Furnell, S., Camp, O. (eds) Information Systems Security and Privacy. ICISSP 2018. Communications in Computer and Information Science, vol 977. Springer, Cham. https://doi.org/10.1007/978-3-030-25109-3_14
DOI: https://doi.org/10.1007/978-3-030-25109-3_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-25108-6
Online ISBN: 978-3-030-25109-3
eBook Packages: Computer Science, Computer Science (R0)