On-the-Fly Hierarchies for Numerical Attributes in Data Anonymization

Campan, Alina; Cooper, Nicholas

doi:10.1007/978-3-642-15546-8_2

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6358)

Included in the following conference series:

Workshop on Secure Data Management

Abstract

We present a method for dynamically creating hierarchies for quasi-identifier numerical attributes. The resulting hierarchies can be used for generalization in microdata k-anonymization, or to let users define generalization boundaries for constrained k-anonymity. A new hierarchy for a numerical attribute is built by hierarchical agglomerative clustering of that attribute's values in the dataset to be anonymized. The resulting tree therefore reflects the closeness and clustering tendency of the attribute's values in that dataset. Because the hierarchies are created on the fly from the data itself, microdata anonymized through generalization based on them retains its quality well, and the information loss incurred by anonymization stays within reasonable bounds, as shown experimentally.
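The idea of deriving a generalization hierarchy from the data can be illustrated with a minimal sketch. The abstract does not state the linkage criterion or any other detail of the authors' procedure, so the code below assumes single-linkage merging of adjacent intervals over the sorted distinct values; the names `Node`, `build_hierarchy`, and `generalize`, and the `max_width` parameter, are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """An interval [lo, hi] in the hierarchy; leaves have lo == hi."""
    lo: float
    hi: float
    children: List["Node"] = field(default_factory=list)


def build_hierarchy(values):
    """Agglomerative clustering of 1-D values; returns the root interval.

    Assumption: single linkage. In one dimension the two closest
    clusters are always adjacent in sorted order, so only gaps between
    neighbouring intervals need to be compared.
    """
    clusters = [Node(v, v) for v in sorted(set(values))]
    if not clusters:
        raise ValueError("need at least one value")
    while len(clusters) > 1:
        gaps = [clusters[i + 1].lo - clusters[i].hi
                for i in range(len(clusters) - 1)]
        i = gaps.index(min(gaps))  # leftmost smallest gap
        left, right = clusters[i], clusters[i + 1]
        clusters[i:i + 2] = [Node(left.lo, right.hi, [left, right])]
    return clusters[0]


def generalize(root, value, max_width):
    """Walk from the root toward `value` and return the first interval
    on that path whose width is at most `max_width`."""
    node = root
    while node.hi - node.lo > max_width and node.children:
        child = next((c for c in node.children if c.lo <= value <= c.hi), None)
        if child is None:
            raise ValueError("value does not occur in the hierarchy")
        node = child
    return (node.lo, node.hi)


ages = [21, 22, 23, 40, 41, 65]
root = build_hierarchy(ages)
print(generalize(root, 22, 5))    # (21, 23)
print(generalize(root, 40, 5))    # (40, 41)
print(generalize(root, 22, 100))  # (21, 65)
```

In this toy example the tightly grouped ages 21 to 23 generalize to a narrow interval rather than to a fixed-width bucket such as 20 to 29, which is the kind of data-driven behavior the abstract attributes to hierarchies built from the dataset. A real k-anonymization algorithm would choose the generalization level from the anonymity requirement rather than from a width threshold.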

Author information

Authors and Affiliations

Department of Computer Science, Northern Kentucky University, Highland Heights, KY 41099, USA
Alina Campan & Nicholas Cooper

Editor information

Editors and Affiliations

Philips Research Europe, High Tech Campus 34, 5656 AE, Eindhoven, The Netherlands
Willem Jonker & Milan Petković

About this paper

Cite this paper

Campan, A., Cooper, N. (2010). On-the-Fly Hierarchies for Numerical Attributes in Data Anonymization. In: Jonker, W., Petković, M. (eds) Secure Data Management. SDM 2010. Lecture Notes in Computer Science, vol 6358. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15546-8_2

DOI: https://doi.org/10.1007/978-3-642-15546-8_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15545-1
Online ISBN: 978-3-642-15546-8
eBook Packages: Computer Science, Computer Science (R0)