Abstract
National statistical agencies and other data custodians have a responsibility to protect the confidentiality of commercially sensitive business data as well as private social and survey data about individuals. However, traditional confidentiality methods have generally been developed for social or survey data about individual persons. Several recent studies have highlighted that such traditional methods may not be directly applicable to business data, because business data and personal data have different characteristics. In this paper we discuss these recent studies and their conclusions. We find that while the confidentiality objective is the same for business data as for social and survey data, the disclosure scenarios and disclosure risks differ. There is evidence that business data and social and survey data may require different confidentiality protection methods to achieve an effective balance between disclosure risk and data utility.
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
O’Keefe, C.M., Shlomo, N. (2014). Applicability of Confidentiality Methods to Personal and Business Data. In: Domingo-Ferrer, J. (eds) Privacy in Statistical Databases. PSD 2014. Lecture Notes in Computer Science, vol 8744. Springer, Cham. https://doi.org/10.1007/978-3-319-11257-2_27
DOI: https://doi.org/10.1007/978-3-319-11257-2_27
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-11256-5
Online ISBN: 978-3-319-11257-2
eBook Packages: Computer Science, Computer Science (R0)