Microaggregation for Categorical Variables: A Median Based Approach

Torra, Vicenç

doi:10.1007/978-3-540-25955-8_13

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 3050)

Abstract

Microaggregation is a masking procedure used to protect confidential data before their public release. The technique, which relies on clustering and aggregation, is used only for numerical data. In this work we introduce a microaggregation procedure for categorical variables. We describe the new masking method and analyse its results according to indices found in the literature. The method is compared with Top and Bottom Coding, Global Recoding, Rank Swapping and PRAM.
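
To make the idea of median-based microaggregation concrete, the following is a minimal sketch for a single ordinal variable. It is not the procedure from the paper, whose details are not given in this abstract: the function names (`ordinal_median`, `microaggregate_ordinal`), the fixed-size grouping over the sorted records, and the use of the lower median are all assumptions made for illustration.

```python
from typing import List, Sequence


def ordinal_median(values: Sequence[str], order: Sequence[str]) -> str:
    """Return the lower median category of `values` under the ranking `order`."""
    rank = {cat: i for i, cat in enumerate(order)}
    ranked = sorted(values, key=lambda v: rank[v])
    # The lower median always exists as an actual category, unlike a mean.
    return ranked[(len(ranked) - 1) // 2]


def microaggregate_ordinal(
    values: Sequence[str], order: Sequence[str], k: int
) -> List[str]:
    """Mask an ordinal variable (hypothetical univariate sketch).

    Records are sorted by category rank, split into consecutive groups of
    at least k records, and every record is replaced by its group's median.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    n = len(values)
    rank = {cat: i for i, cat in enumerate(order)}
    # Indices of the records, ordered by the rank of their category.
    idx = sorted(range(n), key=lambda i: rank[values[i]])
    masked: List[str] = [""] * n
    start = 0
    while start < n:
        end = start + k
        # If fewer than k records would remain, absorb them into this group.
        if n - end < k:
            end = n
        group = idx[start:end]
        med = ordinal_median([values[i] for i in group], order)
        for i in group:
            masked[i] = med
        start = end
    return masked


if __name__ == "__main__":
    order = ["low", "medium", "high"]
    data = ["high", "low", "medium", "low", "high", "medium", "low"]
    print(microaggregate_ordinal(data, order, k=3))
```

In this sketch every masked value is shared by at least k records whenever the input has at least k records, which is the property microaggregation relies on for protection. For nominal variables without an order, a mode (most frequent category) would be a natural replacement for the median; that substitution is again an assumption, not a claim about the paper.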

Author information

Authors and Affiliations

Institut d’Investigació en Intel·ligència Artificial, Campus de Bellaterra, 08193, Bellaterra, Catalonia, Spain
Vicenç Torra

Editor information

Editors and Affiliations

Department of Computer Engineering and Mathematics, Universitat Rovira i Virgili, UNESCO Chair in Data Privacy, Av. Països Catalans 26, E-43007, Tarragona, Catalonia
Josep Domingo-Ferrer
IIIA, Artificial Intelligence Research Institute CSIC, Spanish National Research Council, Campus UAB s/n, 08193, Bellaterra, Catalonia, Spain
Vicenç Torra

Cite this paper

Torra, V. (2004). Microaggregation for Categorical Variables: A Median Based Approach. In: Domingo-Ferrer, J., Torra, V. (eds) Privacy in Statistical Databases. PSD 2004. Lecture Notes in Computer Science, vol 3050. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-25955-8_13

DOI: https://doi.org/10.1007/978-3-540-25955-8_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-22118-0
Online ISBN: 978-3-540-25955-8
eBook Packages: Springer Book Archive