
Approximate algorithms for K-anonymity

Published: 11 June 2007

Abstract

When a table containing individual data is published, disclosure of sensitive information should be prevented. A naive approach to the problem is to remove identifiers such as name and social security number. However, linking attacks, which join the published table with other tables on some attributes, called quasi-identifiers, may reveal the sensitive information. To protect privacy against linking attacks, the notion of k-anonymity, which makes each record in the table indistinguishable from k-1 other records, has been proposed previously. It has been shown to be NP-hard to k-anonymize a table while minimizing the number of suppressed cells. To alleviate this, O(k log k)-approximation and O(k)-approximation algorithms were proposed in previous work.
In this paper, we propose several approximation algorithms that guarantee an O(log k)-approximation ratio and perform significantly better than the traditional algorithms. We also provide O(β log k)-approximate algorithms which gracefully adjust their running time according to the tolerance β (≥ 1) of the approximation ratios. Experimental results confirm that our approximation algorithms perform significantly better than traditional approximation algorithms.
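To make the objective in the abstract concrete, the following is a minimal sketch of k-anonymity by cell suppression. It is not the paper's algorithm: it only checks whether a table is k-anonymous and counts the suppressed cells for a partition supplied by the caller. The function names, the `*` placeholder, and the toy quasi-identifier columns (zip code, age, sex) are assumptions made for illustration.

```python
from collections import Counter

SUPPRESSED = "*"  # placeholder for a suppressed cell (an assumption of this sketch)


def is_k_anonymous(rows, k):
    """Return True if every row is identical to at least k-1 other rows."""
    counts = Counter(tuple(row) for row in rows)
    return all(c >= k for c in counts.values())


def suppress_group(group):
    """Make all rows of a group identical by suppressing each column on
    which they disagree. Returns (new_rows, number_of_suppressed_cells)."""
    n_cols = len(group[0])
    # Columns where the rows of the group do not all share one value.
    differing = [j for j in range(n_cols) if len({row[j] for row in group}) > 1]
    new_rows = [
        tuple(SUPPRESSED if j in differing else row[j] for j in range(n_cols))
        for row in group
    ]
    return new_rows, len(differing) * len(group)


def anonymize(partition):
    """Apply suppress_group to every group of a caller-supplied partition."""
    table, cost = [], 0
    for group in partition:
        rows, c = suppress_group(group)
        table.extend(rows)
        cost += c
    return table, cost


# Toy table over hypothetical quasi-identifiers (zip code, age, sex).
rows = [
    ("13053", "28", "M"),
    ("13053", "29", "M"),
    ("14850", "50", "F"),
    ("14850", "50", "F"),
]

good_table, good_cost = anonymize([rows[:2], rows[2:]])
bad_table, bad_cost = anonymize([[rows[0], rows[2]], [rows[1], rows[3]]])
print(is_k_anonymous(rows, 2), good_cost, bad_cost)
```

Both partitions yield a 2-anonymous table, but they suppress different numbers of cells. Choosing the grouping that minimizes this count is the optimization problem the abstract describes as NP-hard, which is why approximation algorithms are of interest.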


Published In

SIGMOD '07: Proceedings of the 2007 ACM SIGMOD international conference on Management of data
June 2007
1210 pages
ISBN:9781595936868
DOI:10.1145/1247480
General Chairs: Lizhu Zhou, Tok Wang Ling
Program Chair: Beng Chin Ooi
Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. anonymity
  2. data mining
  3. data publishing
  4. local recoding
  5. privacy preservation

Qualifiers

  • Article

Conference

SIGMOD/PODS07
Acceptance Rates

Overall Acceptance Rate 785 of 4,003 submissions, 20%
