Abstract
This paper presents a method for dynamically creating hierarchies for numerical quasi-identifier attributes. The resulting hierarchies can be used for generalization in microdata k-anonymization, or to let users define generalization boundaries for constrained k-anonymity. A new hierarchy for a numerical attribute is constructed by hierarchical agglomerative clustering of that attribute’s values in the dataset to be anonymized. The resulting tree hierarchy therefore reflects the closeness and clustering tendency of the attribute’s values in the dataset. Because these hierarchies are created on the fly from the data itself, microdata anonymized through generalization based on them retains its quality, and the information loss incurred during anonymization remains within reasonable bounds, as shown experimentally.
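To make the idea concrete, the following is a minimal sketch of building a generalization hierarchy for one numerical attribute by agglomerative clustering. It is not the authors' algorithm: the abstract does not state the linkage criterion or tie-breaking rule, so the sketch assumes single linkage (in one dimension, the closest pair of clusters is always a pair of neighbours in sorted order) and merges the leftmost pair on ties. The names `Node`, `build_hierarchy`, and `generalizations` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """A hierarchy node covering the closed interval [low, high]."""
    low: float
    high: float
    left: Optional["Node"] = None
    right: Optional["Node"] = None

    def label(self) -> str:
        # Leaves are single values; inner nodes are intervals.
        return str(self.low) if self.low == self.high else f"[{self.low}-{self.high}]"


def build_hierarchy(values: List[float]) -> Node:
    """Agglomeratively merge the distinct values into a binary tree.

    Assumption (not from the paper): single linkage, so the distance
    between neighbouring clusters is the gap between their boundaries.
    """
    clusters = [Node(v, v) for v in sorted(set(values))]
    if not clusters:
        raise ValueError("need at least one value")
    while len(clusters) > 1:
        # Gap between each cluster and its right-hand neighbour.
        gaps = [clusters[i + 1].low - clusters[i].high
                for i in range(len(clusters) - 1)]
        i = gaps.index(min(gaps))  # leftmost smallest gap wins ties
        a, b = clusters[i], clusters[i + 1]
        clusters[i:i + 2] = [Node(a.low, b.high, a, b)]
    return clusters[0]


def generalizations(root: Node, value: float) -> List[str]:
    """Labels from the leaf holding `value` up to the root."""
    path = []
    node = root
    while True:
        path.append(node.label())
        if node.left is None:
            if node.low != value:
                raise ValueError(f"{value} is not in the hierarchy")
            break
        node = node.left if value <= node.left.high else node.right
    return path[::-1]
```

For the ages `[20, 21, 22, 40, 41, 70]`, `generalizations(build_hierarchy(ages), 41)` returns `['41', '[40-41]', '[20-41]', '[20-70]']`: each step up the tree is a coarser generalization of the value, and the intervals follow the gaps in the data rather than fixed-width bins.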
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Campan, A., Cooper, N. (2010). On-the-Fly Hierarchies for Numerical Attributes in Data Anonymization. In: Jonker, W., Petković, M. (eds) Secure Data Management. SDM 2010. Lecture Notes in Computer Science, vol 6358. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15546-8_2
DOI: https://doi.org/10.1007/978-3-642-15546-8_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15545-1
Online ISBN: 978-3-642-15546-8
eBook Packages: Computer Science, Computer Science (R0)