An Efficient Clustering Algorithm for k-Anonymisation

Loukides, Grigorios; Shao, Jian-Hua

doi:10.1007/s11390-008-9121-3

Regular Paper
Published: 05 April 2008

Volume 23, pages 188–202, (2008)

Journal of Computer Science and Technology

Grigorios Loukides¹ &
Jian-Hua Shao¹

Abstract

K-anonymisation is an approach to protecting individuals from being identified from data. Good k-anonymisations should retain data utility and preserve privacy, but few methods have considered these two conflicting requirements together. In this paper, we extend our previous work on a clustering-based method for balancing data utility and privacy protection, and propose a set of heuristics to improve its effectiveness. We introduce new clustering criteria that treat utility and privacy on equal terms and propose sampling-based techniques to optimally set up its parameters. Extensive experiments show that the extended method achieves good accuracy in query answering and is able to prevent linking attacks effectively.
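
For readers unfamiliar with the underlying privacy requirement, the following is a minimal sketch of the standard k-anonymity condition only: every combination of quasi-identifier values that appears in a table must be shared by at least k records. It is not the clustering algorithm proposed in the paper. The function name `is_k_anonymous`, the attribute names and the toy table are hypothetical and chosen purely for illustration.

```python
from collections import Counter


def is_k_anonymous(records, quasi_identifiers, k):
    """Return True if every combination of quasi-identifier values
    that occurs in `records` is shared by at least k records."""
    # Count how many records fall into each quasi-identifier group.
    groups = Counter(
        tuple(record[attr] for attr in quasi_identifiers)
        for record in records
    )
    return all(size >= k for size in groups.values())


# Hypothetical toy table in which age and postcode are already generalised.
table = [
    {"age": "20-29", "postcode": "CF10", "disease": "flu"},
    {"age": "20-29", "postcode": "CF10", "disease": "asthma"},
    {"age": "30-39", "postcode": "CF24", "disease": "flu"},
    {"age": "30-39", "postcode": "CF24", "disease": "diabetes"},
]

print(is_k_anonymous(table, ["age", "postcode"], k=2))  # True: each group has 2 records
print(is_k_anonymous(table, ["age", "postcode"], k=3))  # False: no group has 3 records
```

A clustering-based method of the kind the abstract describes would aim to form such groups of at least k records while keeping the generalised values as informative as possible; the check above only verifies the outcome.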

Author information

Authors and Affiliations

School of Computer Science, Cardiff University, Cardiff, U.K.
Grigorios Loukides & Jian-Hua Shao

Corresponding author

Correspondence to Grigorios Loukides.

About this article

Cite this article

Loukides, G., Shao, JH. An Efficient Clustering Algorithm for k-Anonymisation. J. Comput. Sci. Technol. 23, 188–202 (2008). https://doi.org/10.1007/s11390-008-9121-3

Received: 06 December 2007
Published: 05 April 2008
Issue Date: March 2008
DOI: https://doi.org/10.1007/s11390-008-9121-3
