Abstract
Collecting, releasing, and sharing microdata about individuals is needed in some domains to support research initiatives that aim to create valuable new knowledge by means of data mining and analysis tools. Ensuring individuals' anonymity is therefore required to guarantee their privacy prior to publication. k-anonymity by microaggregation is a widely accepted model for data anonymization. It consists in breaking the association between the identity of data subjects, i.e., individuals, and their confidential information. However, this method shows its limits when dealing with real datasets, which are characterized by a large number of attributes and the presence of noisy data. Decreasing the information loss incurred during the anonymization process is therefore a pressing challenge. This paper addresses that challenge. To do so, we propose a microaggregation algorithm called Micro-PFSOM, based on fuzzy possibilistic clustering. The main thrust of this algorithm lies in applying a hybrid anonymization process.
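To make the notion of k-anonymity by microaggregation concrete, the following is a minimal sketch of a generic fixed-size microaggregation heuristic. It is not Micro-PFSOM, whose details the abstract does not give. The function names (`microaggregate`, `information_loss`), the ordering heuristic (distance to the global centroid), and the SSE/SST loss ratio are assumptions chosen for illustration: records are partitioned into groups of at least k, each record is replaced by its group centroid, and the information loss is measured as the share of total variance removed by the replacement.

```python
from statistics import fmean


def centroid(records):
    """Attribute-wise mean of a list of numeric records."""
    return [fmean(col) for col in zip(*records)]


def sq_dist(a, b):
    """Squared Euclidean distance between two records."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def microaggregate(records, k):
    """Illustrative fixed-size microaggregation (NOT Micro-PFSOM).

    Records are ordered by distance to the global centroid and cut into
    consecutive groups of k; a short trailing group is merged into the
    previous one, so every group holds between k and 2k - 1 records.
    Each record is then replaced by the centroid of its group.
    """
    if k < 1 or len(records) < k:
        raise ValueError("need k >= 1 and at least k records")
    center = centroid(records)
    order = sorted(range(len(records)),
                   key=lambda i: sq_dist(records[i], center))
    groups = [order[i:i + k] for i in range(0, len(order), k)]
    if len(groups) > 1 and len(groups[-1]) < k:
        tail = groups.pop()
        groups[-1].extend(tail)
    anonymized = [None] * len(records)
    for group in groups:
        c = centroid([records[i] for i in group])
        for i in group:
            anonymized[i] = c
    return anonymized, groups


def information_loss(records, anonymized):
    """Within-group sum of squares over total sum of squares (SSE / SST)."""
    center = centroid(records)
    sse = sum(sq_dist(r, a) for r, a in zip(records, anonymized))
    sst = sum(sq_dist(r, center) for r in records)
    return sse / sst if sst else 0.0


if __name__ == "__main__":
    # Hypothetical toy data: seven records with two numeric attributes.
    data = [[1.0, 2.0], [1.5, 1.8], [5.0, 8.0], [8.0, 8.0],
            [1.0, 0.6], [9.0, 11.0], [8.0, 2.0]]
    anon, groups = microaggregate(data, k=3)
    print(groups)
    print(information_loss(data, anon))
```

Because every released record is shared by at least k individuals, the output satisfies k-anonymity on the aggregated attributes. A lower SSE/SST ratio means the groups are more homogeneous, which is the quantity a clustering-based method would try to reduce.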
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Abidi, B., Ben Yahia, S. (2017). Generating k-Anonymous Microdata by Fuzzy Possibilistic Clustering. In: Benslimane, D., Damiani, E., Grosky, W., Hameurlain, A., Sheth, A., Wagner, R. (eds) Database and Expert Systems Applications. DEXA 2017. Lecture Notes in Computer Science, vol 10439. Springer, Cham. https://doi.org/10.1007/978-3-319-64471-4_1
DOI: https://doi.org/10.1007/978-3-319-64471-4_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-64470-7
Online ISBN: 978-3-319-64471-4
eBook Packages: Computer Science (R0)