Model Based Disclosure Protection

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2316)

Abstract

We argue that any microdata protection strategy is based on a formal reference model. The extent of model specification yields “parametric”, “semiparametric”, or “nonparametric” strategies. Following this classification, one can specify a parametric probability model, such as a normal regression model or a multivariate distribution for simulation. Matrix masking (Cox [2]), which covers local suppression, coarsening, microaggregation (Domingo-Ferrer [8]), noise injection, and perturbation (e.g. Kim [15]; Fuller [12]), provides examples of the second and third classes of models. Finally, a nonparametric approach can be adopted, e.g. the use of bootstrap procedures for generating synthetic microdata (e.g. Dandekar et al. [4]).
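To make the matrix masking idea concrete, the following is a minimal sketch, assuming the usual formulation in which the released data have the form AXB + C, with A acting on records, B acting on attributes, and C an additive displacement. The helper names (`matmul`, `matrix_mask`) and the toy data are hypothetical and not taken from the chapter.

```python
import random

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def matrix_mask(x, a=None, b=None, c=None):
    """Return A X B + C; an omitted operand is treated as identity (A, B) or zero (C)."""
    out = [list(row) for row in x]
    if a is not None:
        out = matmul(a, out)          # record-transforming mask
    if b is not None:
        out = matmul(out, b)          # attribute-transforming mask
    if c is not None:
        out = [[o + n for o, n in zip(row_o, row_c)]
               for row_o, row_c in zip(out, c)]  # additive displacement
    return out

# Hypothetical toy microdata: 4 enterprises, 2 numeric attributes.
X = [[1.0, 10.0], [3.0, 30.0], [5.0, 50.0], [7.0, 70.0]]

# Microaggregation with groups of size 2: A replaces each record by its group mean.
A = [[0.5, 0.5, 0.0, 0.0],
     [0.5, 0.5, 0.0, 0.0],
     [0.0, 0.0, 0.5, 0.5],
     [0.0, 0.0, 0.5, 0.5]]
microaggregated = matrix_mask(X, a=A)

# Suppressing the second attribute: B keeps only the first column.
B = [[1.0], [0.0]]
suppressed = matrix_mask(X, b=B)

# Noise injection: C holds independent Gaussian noise (seed fixed for repeatability).
rng = random.Random(0)
C = [[rng.gauss(0.0, 1.0) for _ in row] for row in X]
noisy = matrix_mask(X, c=C)
```

Here each masking technique corresponds to a particular choice of A, B, or C, which is one way to read the claim that matrix masking covers several of the listed methods.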

In this paper we discuss the application of a regression-based imputation procedure for business microdata to the Italian sample from the Community Innovation Survey. A set of regressions (Franconi and Stander [11]) is used to generate flexible perturbation, in which the protection varies according to the identifiability of the enterprise; a spatial aggregation strategy, based on principal components analysis, is also proposed. The inferential usefulness of the released data and the protection achieved by the strategy are evaluated.
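The general idea of regression-based perturbation with protection that depends on identifiability can be illustrated as follows. This is only a sketch under stated assumptions, not the procedure of Franconi and Stander [11]: it assumes a single predictor, an ordinary least squares fit, and a hypothetical per-enterprise `risk` score in [0, 1] that controls how far the released value is pulled towards the fitted value. All names and data are invented for illustration.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = b0 + b1 * x; returns (b0, b1)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def perturb(x, y, risk):
    """Release fitted value plus a share (1 - risk) of the residual.

    Assumption for this sketch: risk = 1 means a highly identifiable
    enterprise (only the fitted value is released); risk = 0 means the
    original value is released unchanged.
    """
    b0, b1 = fit_line(x, y)
    released = []
    for xi, yi, ri in zip(x, y, risk):
        fitted = b0 + b1 * xi
        released.append(fitted + (1.0 - ri) * (yi - fitted))
    return released

# Hypothetical data: predictor (e.g. size class) and a sensitive response.
size = [1.0, 2.0, 3.0, 4.0]
turnover = [1.0, 3.0, 2.0, 5.0]
risk = [0.0, 0.0, 1.0, 1.0]   # the last two enterprises are treated as identifiable

released = perturb(size, turnover, risk)
```

In this sketch the low-risk records are released essentially unchanged, while the identifiable ones are replaced by their regression predictions; intermediate risk scores give intermediate amounts of perturbation.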

References

  1. Brand, R.: Microdata protection through noise addition. In: “Inference Control in Statistical Databases”, LNCS 2316, Springer-Verlag (2002) 97–116.
  2. Cox, L.H.: Matrix masking methods for disclosure limitation in microdata. Surv. Method. 20 (1994) 165–169.
  3. Cox, L.H.: Towards a Bayesian Perspective on Statistical Disclosure Limitation. Paper presented at ISBA 2000, The Sixth World Meeting of the International Society for Bayesian Analysis (2000).
  4. Dandekar, R., Cohen, M., Kirkendall, N.: Applicability of Latin Hypercube Sampling to create multi variate synthetic micro data. In: ETK-NTTS 2001 Pre-proceedings of the Conference. European Communities Luxembourg (2001) 839–847.
  5. Dandekar, R., Cohen, M., Kirkendall, N.: Sensitive micro data protection using Latin Hypercube Sampling technique. In: “Inference Control in Statistical Databases”, LNCS 2316, Springer-Verlag (2002) 117–125.
  6. Duncan, G.T., Mukherjee, S.: Optimal disclosure limitation strategy in statistical databases: deterring tracker attacks through additive noise. J. Am. Stat. Ass. 95 (2000) 720–729.
  7. Dempster, A.P., Laird, N.M., Rubin, D.B.: Maximum likelihood from incomplete data via the EM algorithm. J. Roy. Stat. Soc. B 40 (1977) 1–38.
  8. Domingo-Ferrer, J., Mateo-Sanz, J.M.: Practical data-oriented microaggregation for statistical disclosure control. IEEE Transactions on Knowledge and Data Engineering, in press (2001).
  9. Fienberg, S.E., Makov, U., Steele, R.J.: Disclosure limitation using perturbation and related methods for categorical data (with discussion). J. Off. Stat. 14 (1998) 485–502.
  10. Franconi, L., Stander, J.: Model based disclosure limitation for business microdata. In: Proceedings of the International Conference on Establishment Surveys-II, June 17–21, 2000, Buffalo, New York (2000) 887–896.
  11. Franconi, L., Stander, J.: A model based method for disclosure limitation of business microdata. J. Roy. Stat. Soc. D Statistician 51 (2002) 1–11.
  12. Fuller, W.A.: Masking procedures for microdata disclosure limitation. J. Off. Stat. 9 (1993) 383–406.
  13. Grim, J., Bocek, P., Pudil, P.: Safe dissemination of census results by means of Interactive Probabilistic Models. In: ETK-NTTS 2001 Pre-proceedings of the Conference. European Communities Luxembourg (2001) 849–856.
  14. Kennickell, A.B.: Multiple imputation and disclosure protection. In: Proceedings of the Conference on Statistical Data Protection, March 25–27, 1998, Lisbon (1999) 381–400.
  15. Kim, J.: A method for limiting disclosure of microdata based on random noise and transformation. In: Proceedings of the Survey Research Methods Section, American Statistical Association (1986) 370–374.
  16. Little, R.J.A.: Statistical analysis of masked data. J. Off. Stat. 9 (1993) 407–426.
  17. Little, R.J.A., Rubin, D.B.: Statistical Analysis with Missing Data. John Wiley, New York (1987).
  18. Raghunathan, T., Rubin, D.B.: Bayesian multiple imputation to Preserve Confidentiality in Public-Use Data Sets. In: Proceedings of ISBA 2000, The Sixth World Meeting of the International Society for Bayesian Analysis. European Communities Luxembourg (2000).
  19. Rubin, D.B.: Discussion of “Statistical disclosure limitation”. J. Off. Stat. 9 (1993) 461–468.
  20. Winkler, W.E., Yancey, W.E., Creecy, R.H.: Disclosure risk assessment in perturbative microdata protection via record linkage. In: “Inference Control in Statistical Databases”, LNCS 2316, Springer-Verlag (2002) 135–152.

Copyright information

© 2002 Springer-Verlag Berlin Heidelberg

Cite this chapter

Polettini, S., Franconi, L., Stander, J. (2002). Model Based Disclosure Protection. In: Domingo-Ferrer, J. (ed.) Inference Control in Statistical Databases. Lecture Notes in Computer Science, vol 2316. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-47804-3_7

  • DOI: https://doi.org/10.1007/3-540-47804-3_7

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-43614-0

  • Online ISBN: 978-3-540-47804-1

  • eBook Packages: Springer Book Archive
