Abstract
Vast amounts of data are now being collected from censuses and surveys, scientific research, instruments, observation of consumer and internet activities, and sensors of many kinds. These data hold a wealth of information; however, there is a risk that personal privacy will not be protected when they are accessed and used.
This paper provides an overview of current and emerging approaches to balancing the use and analysis of data with confidentiality protection in research settings, where the need for privacy protection is widely recognised. These approaches were generally developed in the context of national statistical agencies and other data custodians releasing social and survey data for research, but are increasingly being adapted in the context of the globalisation of our information society. As examples, the paper contributes to the discussion of confidentiality issues in the service science and big data analytics contexts.
Acknowledgments
I warmly thank the organisers of the International Federation for Information Processing (IFIP) 9th Summer School on Privacy and Identity Management for the Future Internet in the Age of Globalisation, for their invitation to participate. I acknowledge the financial support of the Authentication and Authorization for Entrusted Unions (AU2EU) project funded by the European Commission Seventh Framework Programme for Research and Technological Development.
Copyright information
© 2015 IFIP International Federation for Information Processing
About this paper
Cite this paper
O’Keefe, C.M. (2015). Privacy and Confidentiality in Service Science and Big Data Analytics. In: Camenisch, J., Fischer-Hübner, S., Hansen, M. (eds) Privacy and Identity Management for the Future Internet in the Age of Globalisation. Privacy and Identity 2014. IFIP Advances in Information and Communication Technology, vol 457. Springer, Cham. https://doi.org/10.1007/978-3-319-18621-4_5
DOI: https://doi.org/10.1007/978-3-319-18621-4_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-18620-7
Online ISBN: 978-3-319-18621-4
eBook Packages: Computer Science, Computer Science (R0)