Efficient systematic clustering method for k-anonymization

Kabir, Md. Enamul; Wang, Hua; Bertino, Elisa

doi:10.1007/s00236-010-0131-6

Original Article
Published: 12 January 2011

Volume 48, pages 51–66, (2011)

Acta Informatica

Md. Enamul Kabir¹,
Hua Wang¹ &
Elisa Bertino²

Abstract

This paper presents a clustering-based k-anonymization technique that minimizes information loss while assuring data quality. (Clustering partitions records into clusters such that records within a cluster are similar to each other, while records in different clusters are as distinct as possible from one another.) Privacy preservation of individuals has drawn considerable interest in data mining research. The k-anonymity model proposed by Samarati and Sweeney is a practical approach to data privacy preservation and has been studied extensively over the last few years. Anonymization methods based on generalization or suppression are able to protect private information, but they lose valuable information. The challenge is how to minimize the information loss during the anonymization process. We refer to this challenge as the systematic clustering problem for k-anonymization, which is analysed in this paper. The proposed technique groups similar data together and then anonymizes each group individually. The structure of the systematic clustering problem is defined and investigated through its paradigm and properties. An algorithm for the problem is developed, and its time complexity is shown to be $O(\frac{n^{2}}{k})$, where n is the total number of records about individuals whose privacy is of concern. Experimental results show that our method achieves a reasonable advantage with respect to both information loss and execution time. Finally, the algorithm is shown to be usable for incremental datasets.
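The general idea of grouping similar records and then anonymizing each group can be illustrated with a minimal sketch. This is not the systematic clustering algorithm of the paper, which is not given in the abstract. It is a simple greedy nearest-neighbour grouping under assumptions made here for illustration: all quasi-identifiers are numeric, similarity is Euclidean distance, and generalization replaces each value by the (min, max) range of its cluster. The names greedy_k_clusters and generalize are hypothetical.

```python
from typing import List, Sequence, Tuple

Record = Tuple[float, ...]          # numeric quasi-identifiers only (assumption)
Range = Tuple[float, float]


def distance(a: Record, b: Record) -> float:
    """Euclidean distance between two records (illustrative similarity measure)."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def greedy_k_clusters(records: Sequence[Record], k: int) -> List[List[int]]:
    """Group record indices into clusters of at least k members each."""
    if k < 1 or len(records) < k:
        raise ValueError("need k >= 1 and at least k records")
    remaining = list(range(len(records)))
    clusters: List[List[int]] = []
    while len(remaining) >= k:
        seed = remaining.pop(0)
        # Order the unassigned records by closeness to the seed.
        remaining.sort(key=lambda i: distance(records[seed], records[i]))
        clusters.append([seed] + remaining[: k - 1])
        remaining = remaining[k - 1:]
    # Fewer than k records are left: attach each to the cluster with the nearest seed.
    for i in remaining:
        nearest = min(clusters, key=lambda c: distance(records[c[0]], records[i]))
        nearest.append(i)
    return clusters


def generalize(records: Sequence[Record],
               clusters: List[List[int]]) -> List[Tuple[Range, ...]]:
    """Replace every record by the per-attribute (min, max) range of its cluster."""
    out: List[Tuple[Range, ...]] = [()] * len(records)
    for cluster in clusters:
        dims = len(records[cluster[0]])
        ranges = tuple(
            (min(records[i][d] for i in cluster), max(records[i][d] for i in cluster))
            for d in range(dims)
        )
        for i in cluster:
            out[i] = ranges
    return out
```

After generalization, every record shares its released values with at least k - 1 others, which is the k-anonymity requirement. The width of the ranges is one rough indicator of information loss, so tighter clusters lose less information.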

References

Bayardo, R.J., Agrawal, R.: Data privacy through optimal k-anonymization. In: International Conference on Data Engineering (2005)
Byun, J.W., Bertino, E.: Micro-views, or on how to protect privacy while enhancing data usability: concepts and challenges. SIGMOD 35(1), 9–13 (2006)
Byun, J.W., Kamra, A., Bertino, E., Li, N.: Efficient k-anonymization using clustering techniques. In: International Conference on Database Systems for Advanced Applications (DASFAA) (2007)
Byun, J.W., Sohn, Y., Bertino, E., Li, N.: Secure anonymization for incremental datasets. In: 3rd VLDB Workshop on Secure Data Management (SDM) (2006)
Chiu, C.-C., Tsai, C.-Y.: A k-anonymity clustering method for effective data privacy preservation. In: Third International Conference on Advanced Data Mining and Applications (ADMA) (2007)
Ciriani, V., di Vimercati, S.D.C., Foresti, S., Samarati, P.: k-anonymous data mining: a survey. In: Aggarwal, C.C., Yu, P.S. (eds) Privacy-Preserving Data Mining: Models and Algorithms, pp. 103–134. Kluwer Academic Publishers, Boston (2008)
Fung, B.C.M., Wang, K., Yu, P.S.: Top-down specialization for information and privacy preservation. In: International Conference on Data Engineering (2005)
Gonzalez, T.F.: Clustering to minimize the maximum intercluster distance. Theor. Comput. Sci. 38, 293–306 (1985)
Hettich, C.B.S., Merz, C.: UCI repository of machine learning databases (1998)
Iyengar, V.S.: Transforming data to satisfy privacy constraints. In: SIGKDD (2002)
LeFevre, K., DeWitt, D., Ramakrishnan, R.: Incognito: efficient full-domain k-anonymity. In: ACM International Conference on Management of Data (2005)
LeFevre, K., DeWitt, D., Ramakrishnan, R.: Mondrian multidimensional k-anonymity. In: International Conference on Data Engineering (2006)
Li, N., Li, T.: t-closeness: privacy beyond k-anonymity and l-diversity. In: ICDE (2007)
Lin, J.L., Wei, M.C.: An efficient clustering method for k-anonymization. In: Proceedings of the 2008 International Workshop on Privacy and Anonymity in Information Society (2008)
Loukides, G., Shao, J.: Capturing data usefulness and privacy protection in k-anonymisation. In: Proceedings of the 2007 ACM Symposium on Applied Computing (2007)
Machanavajjhala, A., Gehrke, J., Kifer, D., Venkitasubramanian, M.: l-diversity: privacy beyond k-anonymity. In: ICDE (2006)
Meyerson, A., Williams, R.: On the complexity of optimal k-anonymity. In: PODS, pp. 223–228 (2004)
Samarati, P.: Protecting respondent’s privacy in microdata release. TKDE, 13(6) (2001)
Solanas, A., Sebe, F., Domingo-Ferrer, J.: Micro-aggregation-based heuristics for p-sensitive k-anonymity: One step beyond. In: International Workshop on Privacy and Anonymity in the Information Society (2008)
Sun, X., Li, M., Wang, H., Plank, A.: An efficient hash-based algorithm for minimal k-anonymity. In: ACSC, pp. 101–107, (2008)
Sun, X., Wang, H., Li, J.: Priority driven K-Anonymisation for privacy protection. In: AusDM, pp. 73–78 (2008)
Sweeney, L.: Achieving k-anonymity privacy protection using generalization and suppression. Int. J. Uncertainty Fuzziness Knowledge-based Syst. 10(5), 571–588 (2002)
Sweeney, L.: k-anonymity: a model for protecting privacy. Int. J. Uncertainty Fuzziness Knowledge-based Syst. 10(5), 557–570 (2002)
Truta, T., Vinay, B.: Privacy protection: p-sensitive k-anonymity property. In: International Workshop on Privacy Data Management (PDM), p. 94 (2006)
Xu, J., Wang, W., Pei, J., Wang, X., Shi, B., Fu, A.W.C.: Utility-based anonymization using local recoding. In: KDD 2006, pp. 785–790 (2006)
Wong, R.C.-W., Li, J., Fu, A.W.-C., Wang, K.: (α, k)-anonymity: an enhanced k-anonymity model for privacy preserving data publishing. In: Proceedings of the Twelfth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2006)

Author information

Authors and Affiliations

Department of Mathematics and Computing, University of Southern Queensland, Toowoomba, QLD, 4350, Australia
Md. Enamul Kabir & Hua Wang
Department of Computer Science and CERIAS, Purdue University, West Lafayette, IN, USA
Elisa Bertino

Corresponding author

Correspondence to Elisa Bertino.

About this article

Cite this article

Kabir, M.E., Wang, H. & Bertino, E. Efficient systematic clustering method for k-anonymization. Acta Informatica 48, 51–66 (2011). https://doi.org/10.1007/s00236-010-0131-6

Received: 27 July 2009
Accepted: 07 December 2010
Published: 12 January 2011
Issue Date: February 2011
DOI: https://doi.org/10.1007/s00236-010-0131-6
