Abstract
Despite being one of the most common approaches in unsupervised data analysis, a very small literature exists in applying formal methods to address data mining problems. This paper applies an abstract representation of a hierarchical categorical clustering algorithm (CCTree) to solve the problem of privacy-aware data clustering in distributed agents. The proposed methodology is based on rewriting systems, and automatically generates a global structure of the clusters. We prove that the proposed approach improves the time complexity. Moreover a metric is provided to measure the privacy gain after revealing the CCTree result. Furthermore, we discuss under what condition the CCTree clustering in distributed framework produces the comparable result to the centralized one.
This research has been partially supported by the EU Funded Projects H2020 C3IISP, GA #700294, H2020 NeCS, GA #675320, EIT Digital MCloudDaaS and partially by the Natural Sciences and Engineering Research Council of Canada (NSERC).
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
References
Berkhin, P.: A survey of clustering data mining techniques. In: Kogan, J., Nicholas, C., Teboulle, M. (eds.) Grouping Multidimensional Data, pp. 25–71. Springer, Heidelberg (2006)
Clifton, C., Kantarcioglu, M., Vaidya, J., Lin, X., Zhu, M.Y.: Tools for privacy preserving distributed data mining. SIGKDD Explor. Newsl. 4(2), 28–34 (2002)
Dershowitz, N., Jouannaud, J.: Rewrite systems. In: van Leeuwen, J. (ed.) Handbook of Theoretical Computer Science, vol. b, pp. 243–320. MIT Press, Cambridge (1990)
Fung, B.C.M., Wang, K., Chen, R., Yu, P.S.: Privacy-preserving data publishing: a survey of recent developments. ACM Comput. Surv. 42(4), 14:1–14:53 (2010)
Kantarcioǧlu, M., Jin, J., Clifton, C.: When do data mining results violate privacy? In: Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2004, pp. 599–604. ACM, New York (2004)
Kriegel, H.P., Kroger, P., Pryakhin, A., Schubert, M.: Effective and efficient distributed model-based clustering. In: Fifth IEEE International Conference on Data Mining (2005)
Kullback, S., Leibler, R.A.: On information and sufficiency. Ann. Math. Statist. 22, 79–86 (1951)
Lindell, Y., Pinkas, B.: Secure multiparty computation for privacy-preserving data mining (2008)
Martinelli, F., Saracino, A., Sheikhalishahi, M.: Modeling privacy aware information sharing systems: a formal and general approach. In: 15th IEEE International Conference on Trust, Security and Privacy in Computing and Communications (2016)
Oliveira, S.R.M., Zaïane, O.R.: Achieving privacy preservation when sharing data for clustering. In: Jonker, W., Petković, M. (eds.) SDM 2004. LNCS, vol. 3178, pp. 67–82. Springer, Heidelberg (2004). doi:10.1007/978-3-540-30073-1_6
Sheikhalishahi, M., Mejri, M., Tawbi, N.: Clustering spam emails into campaigns. In: Library, S.D. (ed.) 1st Conference on Information Systems Security and Privacy (2015)
Sheikhalishahi, M., Saracino, A., Mejri, M., Tawbi, N., Martinelli, F.: Fast and effective clustering of spam emails based on structural similarity. In: Garcia-Alfaro, J., Kranakis, E., Bonfante, G. (eds.) FPS 2015. LNCS, vol. 9482, pp. 195–211. Springer, Heidelberg (2016). doi:10.1007/978-3-319-30303-1_12
Sheikhalishahi, M., Mejri, M., Tawbi, N.: On the abstraction of a categorical clustering algorithm. In: Perner, P. (ed.) MLDM 2016. LNCS (LNAI), pp. 659–675. Springer, Heidelberg (2016). doi:10.1007/978-3-319-41920-6_51
Zhan, Z.J.: Privacy-preserving collaborative data mining. Doctoral Dissertation (2006)
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Sheikhalishahi, M., Mejri, M., Tawbi, N., Martinelli, F. (2017). Privacy-Aware Data Sharing in a Tree-Based Categorical Clustering Algorithm. In: Cuppens, F., Wang, L., Cuppens-Boulahia, N., Tawbi, N., Garcia-Alfaro, J. (eds) Foundations and Practice of Security. FPS 2016. Lecture Notes in Computer Science(), vol 10128. Springer, Cham. https://doi.org/10.1007/978-3-319-51966-1_11
Download citation
DOI: https://doi.org/10.1007/978-3-319-51966-1_11
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-51965-4
Online ISBN: 978-3-319-51966-1
eBook Packages: Computer ScienceComputer Science (R0)