Abstract
Privacy preservation is realized by transforming data into k-anonymous (k-anonymization) and l-diverse (l-diversification) versions while minimizing information loss. Frequency l-diversity is possibly the most practical instance of the generic l-diversity principle for privacy preservation. In this paper, we propose an algorithm for frequency l-diversification. Our primary objective is to minimize information loss. Most studies in privacy preservation have focused on k-anonymization. While simple principles of l-diversification algorithms can be obtained by adapting k-anonymization algorithms it is not straightforward for some other principles. Our algorithm, called Bucket Clustering, adapts k-member Clustering. However, in order to guarantee termination we use hashing and buckets as in the Anatomy algorithm. In order to minimize information loss we choose tuples that minimize information loss during the creation of clusters. We empirically show that our algorithm achieves low information loss with acceptable efficiency.
Preview
Unable to display preview. Download preview PDF.
Similar content being viewed by others
References
Byun, J.-W., Kamra, A., Bertino, E., Li, N.: Efficient k-Anonymity Using Clustering Technique. In: CERIAS Tech Report 2006-10, Center for Education and Research in Information Assurance and Security, Purdue University (2006)
Bayardo, R.J., Agrawal, R.: Data Privacy through Optimal k-Anonymization. In: 21st International Conference on Data Engineering (ICDE) (2005)
Xiao, X., Tao, Y.: Anatomy: Simple and Effective Privacy Preservation. In: Very Large Data Bases (VLDB) Conference, pp. 139–150 (2006)
Sweeney, L.: k-Anonymity: A Model for Protecting Privacy. International Journal on Uncertainty, Fuzziness and Knowledge-based Systems 10, 557–570 (2002)
Samarati, P., Sweeney, L.: Protecting Privacy when Disclosing Information: k-Anonymity and its Enforcement through Generalization and Suppression. In: Technical Report SRI-CSL-98-04, SRI Computer Science Laboratory (1998)
Iyengar, V.: Transforming Data to Satisfy Privacy Constraints. In: SIGKDD, pp. 279–288 (2002)
LeFevre, K., DeWitt, D. J., Ramakrishnan, R.: Mondrian Multidimensional k-Anonymity. In: 22nd International Conference on Data Engineering (ICDE) (2006)
Machanavajjhala, A., Kifer, D., Gehrke, J., Venkitasubramaniam, M.: l-Diversity: Privacy beyond k-Anonymity. In: IEEE 22nd International Conference on Data Engineering (ICDE 2006) (2006)
Wong, R. C.-W., Li, J., Fu, A. W.-C., Wang, K.: (alpha, k)-Anonymity: An Enhanced k-Anonymity Model for Privacy Preserving Data Publishing. In: 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD) (2006)
Li, N., Li, T., Venkatasubramanian, S.: t-Closeness: Privacy Beyond k-Anonymity and l-Diversity. In: IEEE 23rd International Conference on Data Engineering (ICDE), 106–115 (2007)
Ghinita, G., Karras, P., Kalnis, P., Mamoulis, N.: Fast Data Anonymization with Low Information Loss. In: Very Large Data Bases (VLDB) Conference. ACM Press, New York (2007)
Blake, C., Merz, C.: UCI Repository of Machine Learning Databases (1998)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zare-Mirakabad, MR., Jantan, A., Bressan, S. (2009). Clustering-Based Frequency l-Diversity Anonymization. In: Park, J.H., Chen, HH., Atiquzzaman, M., Lee, C., Kim, Th., Yeo, SS. (eds) Advances in Information Security and Assurance. ISA 2009. Lecture Notes in Computer Science, vol 5576. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-02617-1_17
Download citation
DOI: https://doi.org/10.1007/978-3-642-02617-1_17
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-02616-4
Online ISBN: 978-3-642-02617-1
eBook Packages: Computer ScienceComputer Science (R0)