Improving Individual Risk Estimators

  • Conference paper
Privacy in Statistical Databases (PSD 2006)

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 4302))

Abstract

The release of survey microdata files requires a preliminary assessment of the disclosure risk of the data. Record-level risk measures can be useful for “local” protection (e.g. partially synthetic data [21], or local suppression [25]), and are also used in [22] and [16] to produce global risk measures [13] useful for assessing data release. Although various proposals for estimating such risk measures are available in the literature, so far only a few attempts have been made to evaluate the statistical properties of these estimators. In this paper we carry out a simulation study to evaluate those properties. Besides presenting results on the Benedetti-Franconi individual risk estimator (see [11]), we also propose a strategy to produce improved risk estimates, and assess it by simulation.

The problem of estimating per-record reidentification risk shares many similarities with that of small area estimation (see [19]); we therefore propose to introduce external information, arising from a previous census, into risk estimation. To achieve this we consider a simple strategy, namely Structure Preserving Estimation (SPREE) of Purcell and Kish [18], and show by simulation that this procedure provides better estimates of the individual risk of reidentification disclosure, especially for records whose risk is high.
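
As a rough illustration of the SPREE idea mentioned above, the sketch below rakes a census cross-classification to more recent marginal totals by iterative proportional fitting, so that the census association structure is kept while the margins are updated. It is a minimal two-way sketch under stated assumptions, not the procedure used in the paper: the function name `spree`, the table layout, and the toy numbers are all hypothetical.

```python
def spree(census, row_targets, col_targets, n_iter=1000, tol=1e-10):
    """Rake a two-way census table to current margins (hypothetical helper).

    census      : list of rows of census counts (assumed strictly positive here)
    row_targets : current row totals, e.g. estimated from the survey
    col_targets : current column totals (must sum to the same grand total)
    """
    est = [[float(v) for v in row] for row in census]
    for _ in range(n_iter):
        # Scale each row so that it matches its current row total.
        for i, row in enumerate(est):
            s = sum(row)
            if s > 0:
                est[i] = [v * row_targets[i] / s for v in row]
        # Scale each column so that it matches its current column total.
        for j, target in enumerate(col_targets):
            s = sum(row[j] for row in est)
            if s > 0:
                factor = target / s
                for row in est:
                    row[j] *= factor
        # Columns now match; stop once the rows match as well.
        gap = max(abs(sum(row) - t) for row, t in zip(est, row_targets))
        if gap < tol:
            break
    return est


# Toy data (assumed): a 2 x 2 census table and updated margins.
census_table = [[40, 10], [20, 30]]
fitted = spree(census_table, row_targets=[60, 60], col_targets=[70, 50])
for row in fitted:
    print([round(v, 3) for v in row])
```

The fitted cells reproduce the updated margins while keeping the cross-product ratio of the census table, which is the sense in which the structure is preserved. How such updated cell frequencies enter the individual risk estimator is specified in the paper itself, not here.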

References

  1. Abramowitz, M., Stegun, I.A.: Handbook of Mathematical Functions. Dover, New York (1965)

  2. Benedetti, R., Franconi, L.: Statistical and technological solutions for controlled data dissemination. In: Pre-proceedings of New Techniques and Technologies for Statistics, Sorrento, June 4-6, 1998, vol. 1, pp. 225–232 (1998)

  3. Carlson, M.: Assessing microdata disclosure risk using the Poisson-inverse Gaussian distribution. Statistics in Transition 5, 901–925 (2002)

  4. Chen, G., Keller-McNulty, S.: Estimation of identification disclosure risk in microdata. Journal of Official Statistics 14, 79–95 (1998)

  5. Deville, J.C., Särndal, C.E.: Calibration estimators in survey sampling. Journal of the American Statistical Association 87, 367–382 (1992)

  6. Di Consiglio, L., Franconi, L., Seri, G.: Assessing individual risk of disclosure: an experiment. In: Proceedings of the Joint ECE/Eurostat Work Session on Statistical Data Confidentiality, Luxembourg, April 7-9 (2003)

  7. Duncan, G.T., Lambert, D.: Disclosure-limited data dissemination (with comments). Journal of the American Statistical Association 81, 10–27 (1986)

  8. Elamir, E.A.H., Skinner, C.J.: Modeling the re-identification risk per record in microdata. In: 54th Session of the International Statistical Institute, Berlin, August 13-20 (2003)

  9. Fienberg, S.E., Makov, U.E.: Confidentiality, uniqueness, and disclosure limitation for categorical data. Journal of Official Statistics 14, 385–397 (1998)

  10. Forster, J.J.: Bayesian methods for disclosure risk assessment. In: Proceedings of the Joint ECE/Eurostat Work Session on Statistical Data Confidentiality, Geneva, November 9-11, 2005, pp. 99–108. Luxembourg (2005)

  11. Franconi, L., Polettini, S.: Individual risk estimation in μ-Argus: A review. In: Domingo-Ferrer, J., Torra, V. (eds.) PSD 2004. LNCS, vol. 3050, pp. 262–272. Springer, Heidelberg (2004)

  12. Hundepool, A.: The CASC Project. In: Domingo-Ferrer, J. (ed.) Inference Control in Statistical Databases. LNCS, vol. 2316, pp. 172–180. Springer, Heidelberg (2002)

  13. Lambert, D.: Measures of disclosure risk and harm. Journal of Official Statistics 9, 313–331 (1993)

  14. Madow, W.G.: On the theory of systematic sampling, II. The Annals of Mathematical Statistics 20, 333–354 (1949)

  15. Omori, Y.: Measuring identification disclosure risk for categorical microdata by posterior population uniqueness. In: Proceedings of the Conference on Statistical Data Protection, Lisbon, March, 25-27, 1998, pp. 59–76. Eurostat, Luxembourg (1999)

  16. Polettini, S.: Some remarks on the individual risk methodology. In: Proceedings of the Joint ECE/Eurostat Work Session on Statistical Data Confidentiality, Luxembourg, April 7-9 (2003)

  17. Polettini, S.: Revision of Guidelines for the protection of social micro-data using individual risk methodology: Application within μ-Argus version 3.2, by S. Polettini and G. Seri. CASC-Computational Aspects of Statistical Confidentiality Deliverable No: 1.2-D3 (2004), available at http://neon.vb.cbs.nl/casc/deliv/CASC_1.2D3_guidelines_new.pdf

  18. Purcell, N.J., Kish, L.: Postcensal estimates for local areas (small domains). International Statistical Review 48, 3–18 (1980)

  19. Rao, J.N.K.: Small area estimation. John Wiley & Sons, Hoboken (2003)

  20. Reiter, J.P.: Estimating risks of identification disclosure for microdata. Journal of the American Statistical Association 100, 1103–1113 (2005)

  21. Reiter, J.P.: Releasing multiply-imputed, synthetic public use microdata: An illustration and empirical study. Journal of the Royal Statistical Society, Series A 168 (2005)

  22. Rinott, Y.: On models for statistical disclosure risk estimation. In: Proceedings of the Joint ECE/Eurostat Work Session on Statistical Data Confidentiality, Luxembourg, Luxembourg, April 7-9 (2003)

  23. Skinner, C.J., Elliot, M.J.: A measure of disclosure risk for microdata. Journal of the Royal Statistical Society, Series B 64, 855–867 (2002)

  24. Skinner, C.J., Holmes, D.J.: Estimating the re-identification risk per record in microdata. Journal of Official Statistics 14, 361–372 (1998)

  25. Willenborg, L., de Waal, T.: Elements of Statistical Disclosure Control. Springer, New York (2001)

  26. Zhang, L., Chambers, R.L.: Small area estimates for cross-classifications. J. R. Stat. Soc. Ser. B Stat. Methodol. 66(2), 479–496 (2004)

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Di Consiglio, L., Polettini, S. (2006). Improving Individual Risk Estimators. In: Domingo-Ferrer, J., Franconi, L. (eds) Privacy in Statistical Databases. PSD 2006. Lecture Notes in Computer Science, vol 4302. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11930242_21

  • DOI: https://doi.org/10.1007/11930242_21

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-49330-3

  • Online ISBN: 978-3-540-49332-7

  • eBook Packages: Computer Science, Computer Science (R0)
