Abstract
Many organizations, enterprises, and public services collect and manage personal data about individuals. These data are of substantial value to scientists and market experts, but disseminating them carelessly can lead to significant privacy breaches, since they may reveal financial, medical, or other personal information. Several anonymization methods have been proposed to allow the privacy-preserving sharing of datasets that contain personal information. Every such technique trades off the strength of the privacy guarantee against the quality of the anonymized dataset. In this work we focus on the anonymization of sets of values from continuous domains, e.g., numerical data, and we provide a method for protecting the anonymized data against identity disclosure. The main novelty of our approach is that, instead of using a fixed, given generalization hierarchy, we let the anonymization algorithm decide how different values will be generalized. The benefit of our approach is twofold: a) we are able to generalize datasets without requiring an expert to define the hierarchy, and b) we limit the information loss, since the proposed algorithm is able to limit the scope of the generalization. We provide a series of experiments that demonstrate the gains of our algorithm in terms of information quality compared to the state of the art.
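To make the contrast between a fixed hierarchy and data-driven generalization concrete, the following minimal Python sketch may help. It is not the algorithm proposed in the paper: it handles a single numeric attribute, the function names (`fixed_generalize`, `dynamic_generalize`, `avg_width`), the bucket width, and the sample values are all hypothetical, and the grouping rule (runs of at least k sorted values) is a simple stand-in chosen only for illustration. Average interval width is used as a rough proxy for information loss.

```python
from typing import List, Tuple

Interval = Tuple[float, float]


def fixed_generalize(values: List[float], width: float) -> List[Interval]:
    """Map each value to a pre-defined bucket [j*width, (j+1)*width).

    This mimics a fixed, expert-defined hierarchy level: the bucket
    boundaries do not depend on the data.
    """
    out = []
    for v in values:
        lo = (v // width) * width
        out.append((lo, lo + width))
    return out


def dynamic_generalize(values: List[float], k: int) -> List[Interval]:
    """Illustrative data-driven generalization (an assumption, not the
    paper's method): sort the values, cut them into runs of at least k,
    and replace each value with the tight [min, max] range of its run.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    out: List[Interval] = [(0.0, 0.0)] * len(values)
    n = len(order)
    start = 0
    while start < n:
        end = start + k
        if n - end < k:
            # Too few values left to form another run of size k:
            # absorb them into the current run.
            end = n
        group = order[start:end]
        lo, hi = values[group[0]], values[group[-1]]
        for i in group:
            out[i] = (lo, hi)
        start = end
    return out


def avg_width(intervals: List[Interval]) -> float:
    """Average interval width, used here as a crude information-loss proxy."""
    return sum(hi - lo for lo, hi in intervals) / len(intervals)


# Hypothetical sample data: two natural clusters of three values each.
salaries = [21.0, 22.5, 23.0, 58.0, 59.5, 61.0]

fixed = fixed_generalize(salaries, width=50.0)
dynamic = dynamic_generalize(salaries, k=3)

print(avg_width(fixed))    # 50.0
print(avg_width(dynamic))  # 2.5
```

In this toy input both outputs place every value in an interval shared by three values, but the data-driven ranges are far narrower because their boundaries follow the data rather than a predefined grid.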
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Gkountouna, O., Angeli, S., Zigomitros, A., Terrovitis, M., Vassiliou, Y. (2014). k^m-Anonymity for Continuous Data Using Dynamic Hierarchies. In: Domingo-Ferrer, J. (eds) Privacy in Statistical Databases. PSD 2014. Lecture Notes in Computer Science, vol 8744. Springer, Cham. https://doi.org/10.1007/978-3-319-11257-2_13
DOI: https://doi.org/10.1007/978-3-319-11257-2_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-11256-5
Online ISBN: 978-3-319-11257-2
eBook Packages: Computer Science, Computer Science (R0)