Improving Microaggregation for Complex Record Anonymization

Pont-Tuset, Jordi; Nin, Jordi; Medrano-Gracia, Pau; Larriba-Pey, Josep Ll.; Muntés-Mulero, Victor

doi:10.1007/978-3-540-88269-5_20


Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5285)

Included in the following conference series:

International Conference on Modeling Decisions for Artificial Intelligence


Abstract

Microaggregation is one of the most commonly employed microdata protection methods. This method builds clusters of at least k original records and replaces the records in each cluster with the centroid of the cluster. Usually, when records are complex, i.e., the number of attributes of the data set is large, this data set is split into smaller blocks of attributes and microaggregation is applied to each block, successively and independently. In this way, the information loss when collapsing several values to the centroid of their group is reduced, at the cost of losing the k-anonymity property when at least two attributes of different blocks are known by the intruder.
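The trade-off described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the simplest possible setting, where every block contains a single attribute and clusters are formed from k consecutive values in sorted order. The function names and the toy records are hypothetical.

```python
from collections import Counter


def microaggregate_1d(values, k):
    """Replace each value with the mean of its cluster of at least k sorted neighbours."""
    n = len(values)
    if k < 1 or n < k:
        raise ValueError("need 1 <= k <= len(values)")
    # Indices of the values in ascending order of value.
    order = sorted(range(n), key=lambda i: values[i])
    protected = [0.0] * n
    start = 0
    while start < n:
        end = start + k
        # Fewer than k values would be left over: fold them into this cluster.
        if n - end < k:
            end = n
        cluster = order[start:end]
        centroid = sum(values[i] for i in cluster) / len(cluster)
        for i in cluster:
            protected[i] = centroid
        start = end
    return protected


def microaggregate_per_attribute(records, k):
    """Assumption: each attribute is its own block and is protected independently."""
    columns = list(zip(*records))
    protected_columns = [microaggregate_1d(list(col), k) for col in columns]
    return [tuple(row) for row in zip(*protected_columns)]


# Made-up toy data: four records with two attributes each.
records = [(1, 10), (2, 30), (3, 20), (4, 40)]
protected = microaggregate_per_attribute(records, k=2)

# Size of the smallest group of identical protected records.
smallest_group = min(Counter(protected).values())
print(protected, smallest_group)
```

In this toy case each attribute on its own takes every protected value at least k = 2 times, yet every combined record is unique, so an intruder who knows both attributes faces groups of size 1. This is the loss of k-anonymity across blocks that the abstract refers to.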

In this work, we present a new microaggregation method called One dimension microaggregation (Mic1D-κ). This method gathers all the values of the data set into a single sorted vector, independently of the attribute they belong to. Then, it microaggregates all the mixed values together. Our experiments on real data show that our proposal obtains lower disclosure risk than previous approaches while information loss is preserved.
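The one-dimension idea described above might be sketched as follows. This is an illustration built only from the abstract, not the published algorithm: the fixed-size grouping of κ consecutive values, the function name `mic1d_sketch`, and the toy data are all assumptions. The abstract also does not say whether attributes are rescaled before being mixed, so this sketch works on the raw values as given.

```python
def mic1d_sketch(records, kappa):
    """Microaggregate the values of all attributes together in one sorted vector."""
    if not records:
        raise ValueError("records must not be empty")
    n_attrs = len(records[0])
    # Flatten to (value, row, column) so each value can be written back later.
    cells = [(value, r, c)
             for r, row in enumerate(records)
             for c, value in enumerate(row)]
    total = len(cells)
    if kappa < 1 or total < kappa:
        raise ValueError("need 1 <= kappa <= number of values")
    # Single sorted vector, ignoring which attribute a value came from.
    cells.sort(key=lambda cell: cell[0])
    protected = [[0.0] * n_attrs for _ in records]
    start = 0
    while start < total:
        end = start + kappa
        # Fewer than kappa values would be left over: fold them into this group.
        if total - end < kappa:
            end = total
        group = cells[start:end]
        centroid = sum(value for value, _, _ in group) / len(group)
        for _, r, c in group:
            protected[r][c] = centroid
        start = end
    return protected


# Made-up toy data: four records with two attributes each.
toy = [[1.0, 10.0], [2.0, 30.0], [3.0, 20.0], [4.0, 40.0]]
masked = mic1d_sketch(toy, kappa=4)
print(masked)
```

On this toy input the two attributes occupy disjoint ranges, so each group of four values happens to coincide with one attribute and all four protected records become identical. Real data sets would generally mix values from different attributes within a group.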



Author information

Authors and Affiliations

DAMA-UPC, Computer Architecture Dept., Universitat Politècnica de Catalunya, Campus Nord UPC, C/Jordi Girona 1-3, 08034, Barcelona, Catalonia, Spain
Jordi Pont-Tuset, Pau Medrano-Gracia, Josep Ll. Larriba-Pey & Victor Muntés-Mulero
IIIA, Artificial Intelligence Research Institute CSIC, Spanish National Research Council, Campus UAB s/n, 08193, Bellaterra, Catalonia, Spain
Jordi Nin


Editor information

Editors and Affiliations

IIIA, Artificial Intelligence Research Institute CSIC, Spanish National Research Council, Campus UAB s/n, 08193, Bellaterra, Catalonia, Spain
Vicenç Torra
Toho Gakuen, 3-1-10 Naka, Kunitachi, 186-0004, Tokyo, Japan
Yasuo Narukawa


Cite this paper

Pont-Tuset, J., Nin, J., Medrano-Gracia, P., Larriba-Pey, J.L., Muntés-Mulero, V. (2008). Improving Microaggregation for Complex Record Anonymization. In: Torra, V., Narukawa, Y. (eds) Modeling Decisions for Artificial Intelligence. MDAI 2008. Lecture Notes in Computer Science, vol 5285. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-88269-5_20


DOI: https://doi.org/10.1007/978-3-540-88269-5_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-88268-8
Online ISBN: 978-3-540-88269-5
eBook Packages: Computer Science, Computer Science (R0)
