Improving Individual Risk Estimators

Di Consiglio, Loredana; Polettini, Silvia

doi:10.1007/11930242_21

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4302)

Included in the following conference series:

International Conference on Privacy in Statistical Databases

Abstract

The release of survey microdata files requires a preliminary assessment of the disclosure risk of the data. Record-level risk measures can be useful for “local” protection (e.g. partially synthetic data [21], or local suppression [25]), and are also used in [22] and [16] to produce global risk measures [13] useful to assess data release. Although several approaches to estimating such risk measures are available in the literature, so far only a few attempts have been made to evaluate the statistical properties of these estimators. In this paper we carry out a simulation study to evaluate those properties. Besides presenting results for the Benedetti-Franconi individual risk estimator (see [11]), we also propose a strategy to produce improved risk estimates, and assess it by simulation.
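
As a rough illustration of what a record-level risk measure can look like, the sketch below computes the expected value of 1/F for a sample-unique record, where F is the unknown population frequency of the record's key combination. It assumes a simplified model that is not stated in this abstract: given that the record is unique in the sample, F - 1 follows a geometric distribution with parameter p, where p is a hypothetical ratio of the sample frequency to the estimated population frequency. Under that assumption the series has the closed form p/(1-p) · log(1/p). The function names are illustrative only, and this is not presented as the estimator studied in the paper.

```python
import math


def risk_sample_unique(p: float) -> float:
    """Expected 1/F for a sample-unique record.

    Assumption (not from the source): F - 1 ~ Geometric(p), so that
    E[1/F] = sum_{j>=0} p * (1 - p)**j / (j + 1) = p / (1 - p) * log(1 / p).
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    if p == 1.0:
        # The record is certainly unique in the population.
        return 1.0
    return p / (1.0 - p) * math.log(1.0 / p)


def risk_by_series(p: float, terms: int = 10_000) -> float:
    """Same quantity by direct summation, as a numerical cross-check."""
    return sum(p * (1.0 - p) ** j / (j + 1) for j in range(terms))
```

A small p (the sample covers a small share of the estimated population cell) gives a low risk, while p close to 1 gives a risk close to 1.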

The problem of estimating per-record reidentification risk shares many similarities with small area estimation (see [19]): we propose to introduce external information, arising from a previous census, into risk estimation. To this end we consider a simple strategy, namely Structure Preserving Estimation (SPREE) of Purcell and Kish [18], and show by simulation that this procedure provides better estimates of the individual risk of reidentification disclosure, especially for records whose risk is high.
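
The general idea behind SPREE can be sketched with iterative proportional fitting on a two-way table: the census table supplies the association structure, and it is rescaled until its margins agree with more recent totals. The sketch below is a generic illustration under assumed inputs, not the authors' implementation; the function name `spree`, the example counts, and the target margins are all hypothetical.

```python
import numpy as np


def spree(census, row_targets, col_targets, tol=1e-10, max_iter=1000):
    """Rescale a census table to current margins by iterative proportional fitting.

    census      : 2-D array of census counts (association structure)
    row_targets : current totals for each row (hypothetical inputs)
    col_targets : current totals for each column (hypothetical inputs)
    """
    est = np.asarray(census, dtype=float).copy()
    row_targets = np.asarray(row_targets, dtype=float)
    col_targets = np.asarray(col_targets, dtype=float)
    for _ in range(max_iter):
        # Scale rows to match the row targets (rows summing to zero stay zero).
        row_sums = est.sum(axis=1)
        row_factors = np.divide(row_targets, row_sums,
                                out=np.zeros_like(row_targets),
                                where=row_sums > 0)
        est *= row_factors[:, None]
        # Scale columns to match the column targets.
        col_sums = est.sum(axis=0)
        col_factors = np.divide(col_targets, col_sums,
                                out=np.zeros_like(col_targets),
                                where=col_sums > 0)
        est *= col_factors[None, :]
        # Columns now match exactly; stop once the rows match as well.
        if np.allclose(est.sum(axis=1), row_targets, rtol=0.0, atol=tol):
            break
    return est


# Made-up census counts and current margins (both margins sum to 120).
census = np.array([[40.0, 10.0], [20.0, 30.0]])
estimate = spree(census, row_targets=[60.0, 60.0], col_targets=[70.0, 50.0])
```

In a two-way table this rescaling leaves the cross-product ratio of the census table unchanged, which is the sense in which the structure is preserved while the margins are updated.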

References

[1] Abramowitz, M., Stegun, I.A.: Handbook of Mathematical Functions. Dover, New York (1965)
[2] Benedetti, R., Franconi, L.: Statistical and technological solutions for controlled data dissemination. In: Pre-proceedings of New Techniques and Technologies for Statistics, Sorrento, June 4-6, 1998, vol. 1, pp. 225–232 (1998)
[3] Carlson, M.: Assessing microdata disclosure risk using the Poisson-inverse Gaussian distribution. Statistics in Transition 5, 901–925 (2002)
[4] Chen, G., Keller-McNulty, S.: Estimation of identification disclosure risk in microdata. Journal of Official Statistics 14, 79–95 (1998)
[5] Deville, J.C., Särndal, C.E.: Calibration estimators in survey sampling. Journal of the American Statistical Association 87, 367–382 (1992)
[6] Di Consiglio, L., Franconi, L., Seri, G.: Assessing individual risk of disclosure: an experiment. In: Proceedings of the Joint ECE/Eurostat Work Session on Statistical Data Confidentiality, Luxembourg, April 7-9 (2003)
[7] Duncan, G.T., Lambert, D.: Disclosure-limited data dissemination (with comments). Journal of the American Statistical Association 81, 10–27 (1986)
[8] Elamir, E.A.H., Skinner, C.J.: Modeling the re-identification risk per record in microdata. In: 54th Session of the International Statistical Institute, Berlin, August 13-20 (2003)
[9] Fienberg, S.E., Makov, U.E.: Confidentiality, uniqueness, and disclosure limitation for categorical data. Journal of Official Statistics 14, 385–397 (1998)
[10] Forster, J.J.: Bayesian methods for disclosure risk assessment. In: Proceedings of the Joint ECE/Eurostat Work Session on Statistical Data Confidentiality, Geneva, November 9-11, 2005, pp. 99–108. Luxembourg (2005)
[11] Franconi, L., Polettini, S.: Individual risk estimation in μ-Argus: A review. In: Domingo-Ferrer, J., Torra, V. (eds.) PSD 2004. LNCS, vol. 3050, pp. 262–272. Springer, Heidelberg (2004)
[12] Hundepool, A.: The CASC Project. In: Domingo-Ferrer, J. (ed.) Inference Control in Statistical Databases. LNCS, vol. 2316, pp. 172–180. Springer, Heidelberg (2002)
[13] Lambert, D.: Measures of disclosure risk and harm. Journal of Official Statistics 9, 313–331 (1993)
[14] Madow, W.G.: On the theory of systematic sampling II. The Annals of Mathematical Statistics 20, 333–354 (1949)
[15] Omori, Y.: Measuring identification disclosure risk for categorical microdata by posterior population uniqueness. In: Proceedings of the Conference on Statistical Data Protection, Lisbon, March 25-27, 1998, pp. 59–76. Eurostat, Luxembourg (1999)
[16] Polettini, S.: Some remarks on the individual risk methodology. In: Proceedings of the Joint ECE/Eurostat Work Session on Statistical Data Confidentiality, Luxembourg, April 7-9 (2003)
[17] Polettini, S.: Revision of Guidelines for the protection of social micro-data using individual risk methodology: Application within μ-Argus version 3.2, by S. Polettini and G. Seri. CASC-Computational Aspects of Statistical Confidentiality Deliverable No: 1.2-D3 (2004), available at http://neon.vb.cbs.nl/casc/deliv/CASC_1.2D3_guidelines_new.pdf
[18] Purcell, N.J., Kish, L.: Postcensal estimates for local areas (small domains). International Statistical Review 48, 3–18 (1980)
[19] Rao, J.N.K.: Small area estimation. John Wiley & Sons, Hoboken (2003)
[20] Reiter, J.P.: Estimating risks of identification disclosure for microdata. Journal of the American Statistical Association 100, 1103–1113 (2005)
[21] Reiter, J.P.: Releasing multiply-imputed, synthetic public use microdata: An illustration and empirical study. Journal of the Royal Statistical Society, Series A 168 (2005)
[22] Rinott, Y.: On models for statistical disclosure risk estimation. In: Proceedings of the Joint ECE/Eurostat Work Session on Statistical Data Confidentiality, Luxembourg, April 7-9 (2003)
[23] Skinner, C.J., Elliot, M.J.: A measure of disclosure risk for microdata. Journal of the Royal Statistical Society, Series B 64, 855–867 (2002)
[24] Skinner, C.J., Holmes, D.J.: Estimating the re-identification risk per record in microdata. Journal of Official Statistics 14, 361–372 (1998)
[25] Willenborg, L., de Waal, T.: Elements of Statistical Disclosure Control. Springer, New York (2001)
[26] Zhang, L., Chambers, R.L.: Small area estimates for cross-classifications. J. R. Stat. Soc. Ser. B Stat. Methodol. 66(2), 479–496 (2004)

Author information

Authors and Affiliations

ISTAT, Servizio Progettazione e Supporto Metodologico nei Processi di Produzione Statistica, Via Cesare Balbo 16, 00184, Roma, Italy
Loredana Di Consiglio
Dipartimento di Scienze Statistiche, Università degli Studi di Napoli Federico II, Via L. Rodinò 22, 80128, Napoli, Italy
Silvia Polettini

Editor information

Editors and Affiliations

Department of Computer Engineering and Mathematics, Universitat Rovira i Virgili, UNESCO Chair in Data Privacy, Av. Països Catalans 26, E-43007, Tarragona, Catalonia
Josep Domingo-Ferrer
ISTAT, Servizio Progettazione e Supporto Metodologico nei Processi di Produzione Statistica, Via Cesare Balbo 16, 00184, Roma, Italy
Luisa Franconi

Cite this paper

Di Consiglio, L., Polettini, S. (2006). Improving Individual Risk Estimators. In: Domingo-Ferrer, J., Franconi, L. (eds) Privacy in Statistical Databases. PSD 2006. Lecture Notes in Computer Science, vol 4302. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11930242_21

DOI: https://doi.org/10.1007/11930242_21
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-49330-3
Online ISBN: 978-3-540-49332-7
eBook Packages: Computer Science, Computer Science (R0)
