Maximum Entropy Oriented Anonymization Algorithm for Privacy Preserving Data Mining

Tsiafoulis, Stergios G.; Zorkadis, Vasilios C.; Pimenidis, Elias

doi:10.1007/978-3-642-33448-1_2


Conference paper


Part of the book series: Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering (LNICST, volume 99)

Abstract

This work introduces a new concept that addresses the problem of preserving privacy when anonymising and publishing personal data collections. In particular, a maximum entropy oriented algorithm to protect sensitive data is proposed. In contrast to k-anonymity, ℓ-diversity and t-closeness, the proposed algorithm builds equivalence classes whose sensitive attribute values are distributed as uniformly as possible, possibly by means of added noise, and whose entropy has as its lower limit the entropy of the distribution in the initial data collection, so that background information cannot be exploited to successfully attack the privacy of the data subjects to whom the data refer. Furthermore, existing metrics for privacy and information loss are presented, together with the algorithm implementing the maximum entropy anonymity concept. From a privacy protection perspective, the results achieved are very promising, while the information loss incurred is limited.
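
The entropy condition described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm, which this page does not reproduce; it only checks, for a given partition into equivalence classes, whether the Shannon entropy of the sensitive attribute in each class reaches the entropy of the whole collection. The function names, the `tolerance` parameter and the sample values are assumptions made for illustration.

```python
from collections import Counter
from math import log2


def shannon_entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * log2(c / total) for c in counts.values())


def classes_meet_entropy_floor(classes, tolerance=1e-9):
    """Map each class index to True if its sensitive-attribute entropy is at
    least the entropy of the whole collection (the assumed lower limit)."""
    all_values = [v for cls in classes for v in cls]
    floor = shannon_entropy(all_values)
    return {
        i: shannon_entropy(cls) >= floor - tolerance
        for i, cls in enumerate(classes)
    }


# Hypothetical sensitive values, grouped into two equivalence classes.
classes = [
    ["flu", "cancer", "hiv"],  # uniform over three values
    ["flu", "flu", "flu"],     # a single value, entropy 0
]
print(classes_meet_entropy_floor(classes))  # {0: True, 1: False}
```

In this toy input the first class is uniform and exceeds the entropy of the overall collection, while the second class exposes a single sensitive value and falls below it. Under the concept in the abstract, a class like the second would need to be rebuilt or perturbed, for example with noise.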



Author information

Authors and Affiliations

Ministry of Public Administrative Reform and e-Government, Hellenic Open University, Vasilissis Sofias Av. 15, 10674, Athens, Greece
Stergios G. Tsiafoulis
Hellenic Data Protection Authority Athens, Hellenic Open University, Kifissias Av.1-3, 115 23, Athens, Greece
Vasilios C. Zorkadis
University of East London, UK
Elias Pimenidis


Editor information

Editors and Affiliations

Department of Applied Informatics, University of East London, 156 Egnatio Street, 54006, Thessaloniki, Greece
Christos K. Georgiadis
School of Architecture & Engineering, University of East London, Docklands Campus 4-6, University Way, E16 2RD, London, UK
Hamid Jahankhani
School of Architecture Computing & Engineering, University of East London, Docklands Campus 4-6 University Way, E16 2RD, London, UK
Elias Pimenidis, Rabih Bashroush & Ameer Al-Nemrat


Cite this paper

Tsiafoulis, S.G., Zorkadis, V.C., Pimenidis, E. (2012). Maximum Entropy Oriented Anonymization Algorithm for Privacy Preserving Data Mining. In: Georgiadis, C.K., Jahankhani, H., Pimenidis, E., Bashroush, R., Al-Nemrat, A. (eds) Global Security, Safety and Sustainability & e-Democracy. e-Democracy ICGS3 2011 2011. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 99. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33448-1_2

DOI: https://doi.org/10.1007/978-3-642-33448-1_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-33447-4
Online ISBN: 978-3-642-33448-1
eBook Packages: Computer Science, Computer Science (R0)
