Abstract
In this chapter we present an overview of data privacy. We first review privacy models and measures of disclosure risk, which provide computational definitions of what privacy means and of how to evaluate the privacy level of a data set. We then summarize data protection mechanisms and classify them along three dimensions: whose privacy is being sought, the computations to be done, and the number of data sources. Finally, we describe masking methods, the data protection mechanisms used for databases when the intended data use is undefined and the protected database must remain useful for several data uses. We also define information loss (or data utility) for this type of data protection mechanism. The chapter finishes with a summary.
Copyright information
© 2019 Springer International Publishing AG, part of Springer Nature
About this chapter
Cite this chapter
Torra, V., Navarro-Arribas, G., Stokes, K. (2019). Data Privacy. In: Said, A., Torra, V. (eds) Data Science in Practice. Studies in Big Data, vol 46. Springer, Cham. https://doi.org/10.1007/978-3-319-97556-6_7
DOI: https://doi.org/10.1007/978-3-319-97556-6_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-97555-9
Online ISBN: 978-3-319-97556-6
eBook Packages: Intelligent Technologies and Robotics, Intelligent Technologies and Robotics (R0)