Abstract
The increased power and interconnectivity of today's computer systems make it possible to store and process large amounts of data, resulting in networked information accessible from anywhere at any time. This information sharing and dissemination process is clearly selective. Indeed, while there is a need to disseminate some data, there is an equally strong need to protect those data that, for various reasons, should not be disclosed. Consider, for example, a private organization that makes available various data regarding its business (products, sales, and so on) but at the same time wants to protect more sensitive information, such as the identity of its customers or plans for future products. As another example, government agencies, when releasing historical data, may require a sanitization process to “blank out” information considered sensitive, either directly or because of the sensitive information it would allow the recipient to infer. Effective information sharing and dissemination can take place only if the data holder has some assurance that releasing information does not risk disclosing sensitive information.
Copyright information
© 2007 Springer Science+Business Media, LLC
About this chapter
Cite this chapter
Ciriani, V., De Capitani di Vimercati, S., Foresti, S., Samarati, P. (2007). Microdata Protection. In: Yu, T., Jajodia, S. (eds) Secure Data Management in Decentralized Systems. Advances in Information Security, vol 33. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-27696-0_9
DOI: https://doi.org/10.1007/978-0-387-27696-0_9
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-27694-6
Online ISBN: 978-0-387-27696-0
eBook Packages: Computer Science (R0)