Abstract
Many of the potentially sensitive personal data produced and compiled in electronic sources are nominal and multi-attribute (e.g., personal interests, healthcare diagnoses or commercial transactions). Such data are discrete, finite and non-ordinal, so privacy-protection methods should mask the original values to prevent disclosure while preserving both the underlying semantics of the nominal attributes and the (potential) correlations between them. In this paper we tackle this challenge by proposing a semantically-grounded version of numerical correlated noise addition that, by relying on structured knowledge sources (ontologies), is capable of perturbing/masking multivariate nominal attributes while reasonably preserving their semantics and correlations.
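For orientation, the numerical technique that the abstract takes as its starting point can be sketched as follows. This is a minimal illustration of classical correlated noise addition for two numerical attributes, where the noise is drawn with a covariance proportional to the covariance of the data; it is not the semantic, ontology-based method proposed in the paper. The function names, the scaling parameter `alpha` and the toy values are assumptions made for this example only.

```python
import math
import random


def covariance_terms(xs, ys):
    """Return the sample variances and covariance (sxx, sxy, syy)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs) / (n - 1)
    syy = sum((y - mean_y) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    return sxx, sxy, syy


def pearson(xs, ys):
    """Pearson correlation between two numerical attributes."""
    sxx, sxy, syy = covariance_terms(xs, ys)
    return sxy / math.sqrt(sxx * syy)


def correlated_noise_addition(xs, ys, alpha=0.1, seed=0):
    """Mask (xs, ys) with Gaussian noise of covariance alpha * Sigma.

    Sigma is the sample covariance matrix of the data, so the noise
    follows the same correlation structure as the original attributes.
    Assumes both attributes have non-zero variance.
    """
    rng = random.Random(seed)
    sxx, sxy, syy = covariance_terms(xs, ys)
    # 2x2 Cholesky factor of alpha * Sigma: [[a, 0], [b, c]]
    a = math.sqrt(alpha * sxx)
    b = alpha * sxy / a
    c = math.sqrt(max(alpha * syy - b * b, 0.0))
    masked_x, masked_y = [], []
    for x, y in zip(xs, ys):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        masked_x.append(x + a * z1)
        masked_y.append(y + b * z1 + c * z2)
    return masked_x, masked_y


if __name__ == "__main__":
    # Toy values, invented for illustration only.
    ages = [23.0, 35.0, 41.0, 52.0, 60.0, 29.0]
    incomes = [18.0, 30.0, 37.0, 45.0, 50.0, 24.0]
    masked_ages, masked_incomes = correlated_noise_addition(ages, incomes)
    print(pearson(ages, incomes), pearson(masked_ages, masked_incomes))
```

Because the expected covariance of the masked data is (1 + alpha) times the original covariance, correlations are preserved on average while individual values are perturbed. Nominal values have no arithmetic mean, variance or covariance, which is why this scheme cannot be applied to them directly.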
Acknowledgements
This work was supported by the European Commission under the H2020 project “CLARUS”, by the Spanish Government through projects TIN2014-57364-C2-R “SmartGlacis”, TIN2011-27076-C03-01 “Co-Privacy” and TIN2015-70054-REDC “Red de excelencia Consolider ARES”, and by the Government of Catalonia under grant 2014 SGR 537. M. Batet is supported by a postdoctoral grant from the Ministry of Economy and Competitiveness (MINECO) (FPDI-2013-16589).
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Rodriguez-Garcia, M., Sánchez, D., Batet, M. (2016). Perturbative Data Protection of Multivariate Nominal Datasets. In: Domingo-Ferrer, J., Pejić-Bach, M. (eds) Privacy in Statistical Databases. PSD 2016. Lecture Notes in Computer Science, vol 9867. Springer, Cham. https://doi.org/10.1007/978-3-319-45381-1_8
DOI: https://doi.org/10.1007/978-3-319-45381-1_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-45380-4
Online ISBN: 978-3-319-45381-1
eBook Packages: Computer Science (R0)