Propensity Score Based Conditional Group Swapping for Disclosure Limitation of Strata-Defining Variables

  • Conference paper
Privacy in Statistical Databases (PSD 2016)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9867)

Abstract

In this paper we propose a method for statistical disclosure limitation of categorical variables that we call Conditional Group Swapping. The approach is suited to design and strata-defining variables, whose cross-classification forms important groups or subpopulations. These groups are important because, from the point of view of data analysis, it is desirable to preserve analytical characteristics within them. In general, data swapping can be quite distorting [13, 16, 20], especially for the relationships between variables, both within the subpopulations and in the overall data. To reduce the damage incurred by swapping, we propose to choose the records for swapping using conditional probabilities that depend on the characteristics of the exchanged records. In particular, our approach uses results from propensity score methodology to compute the swapping probabilities. The experimental results presented in the paper show good utility properties of the method.
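
The abstract does not state how swapping probabilities are derived from propensity scores, so the following Python sketch only illustrates the general idea under explicit assumptions. It assumes a single binary strata-defining variable `g`, estimates the propensity score P(g = 1 | x) with a plain logistic regression, and selects records for a label swap with probability proportional to their estimated propensity of belonging to the other group. The function names, the `rate` parameter and this selection rule are hypothetical and are not taken from the paper.

```python
import numpy as np

def fit_propensity(X, g, steps=2000, lr=0.1):
    """Estimate P(g = 1 | x) with logistic regression fitted by gradient descent."""
    Xb = np.column_stack([np.ones(len(X)), X])    # prepend an intercept column
    w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(Xb @ w)))
        w -= lr * Xb.T @ (p - g) / len(g)         # gradient of the mean log-loss
    return 1.0 / (1.0 + np.exp(-(Xb @ w)))

def conditional_group_swap(X, g, rate=0.1, seed=0):
    """Exchange group labels between records drawn from group 0 and group 1.

    Assumption (not from the paper): a record is drawn with probability
    proportional to its estimated propensity of belonging to the other group,
    so records that already resemble the other group are swapped more often.
    """
    rng = np.random.default_rng(seed)
    g = np.asarray(g).copy()
    e = fit_propensity(np.asarray(X, dtype=float), g)
    idx0, idx1 = np.flatnonzero(g == 0), np.flatnonzero(g == 1)
    n_pairs = int(rate * min(len(idx0), len(idx1)))
    if n_pairs == 0:
        return g
    p0 = e[idx0] / e[idx0].sum()                  # group-0 records resembling group 1
    p1 = (1 - e[idx1]) / (1 - e[idx1]).sum()      # group-1 records resembling group 0
    pick0 = rng.choice(idx0, size=n_pairs, replace=False, p=p0)
    pick1 = rng.choice(idx1, size=n_pairs, replace=False, p=p1)
    g[pick0] = 1                                  # the two selected sets trade labels,
    g[pick1] = 0                                  # so both group sizes are unchanged
    return g
```

Because the same number of records moves in each direction, the marginal group counts are preserved, and the covariates `X` are left untouched. Any utility or risk properties of this particular rule would need to be checked separately.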

References

  1. Brand, R.: Microdata protection through noise addition. In: Domingo-Ferrer, J. (ed.) Inference Control in Statistical Databases. LNCS, vol. 2316, pp. 97–116. Springer, Heidelberg (2002)

  2. Dalenius, T., Reiss, S.P.: Data-swapping: A technique for disclosure control. J. Stat. Plann. Infer. 6, 73–85 (1982)

  3. Dandekar, R.A., Cohen, M., Kirkendall, N.: Sensitive micro data protection using latin hypercube sampling technique. In: Domingo-Ferrer, J. (ed.) Inference Control in Statistical Databases. LNCS, vol. 2316, pp. 117–125. Springer, Heidelberg (2002)

  4. Defays, D., Anwar, N.: Micro-aggregation: a generic method. In: Proceedings of the 2nd International Symposium on Statistical Confidentiality, pp. 69–78. Office for Official Publications of the European Community, Luxembourg (1995)

  5. Drechsler, J.: Synthetic Datasets for Statistical Disclosure Control: Theory and Implementation. Springer, New York (2011)

  6. Elinder, M., Erixson, O.: Gender, social norms, and survival in maritime disasters. Proc. Nat. Acad. Sci. USA 109(33), 13220–13224 (2012)

  7. Fellegi, I.P., Sunter, A.B.: A theory for record linkage. J. Am. Stat. Assoc. 64, 1183–1210 (1969)

  8. Gomatam, S., Karr, A.F., Liu, C., Sanil, A.: Data swapping: a risk-utility framework and web service implementation. Technical Report 134, National Institute of Statistical Sciences, Research Triangle Park, NC (2003)

  9. Hundepool, A., Domingo-Ferrer, J., Franconi, L., Giessing, S., Lenz, R., Longhurst, J., Schulte Nordholt, E., Seri, G., de Wolf, P.-P.: Handbook on Statistical Disclosure Control (version 1.2). ESSNET SDC project (2010). http://neon.vb.cbs.nl/casc

  10. Hundepool, A., Domingo-Ferrer, J., Franconi, L., Giessing, S., Schulte Nordholt, E., Spicer, K., de Wolf, P.-P.: Statistical Disclosure Control. Wiley, New York (2012)

  11. Jaro, M.A.: Advances in record-linkage methodology as applied to matching the 1985 Census of Tampa, Florida. J. Am. Stat. Assoc. 84, 414–420 (1989)

  12. Kaggle. The Home of Data Science. http://www.kaggle.com

  13. Karr, A.F., Kohnen, C.N., Oganian, A., Reiter, J.P., Sanil, A.P.: A framework for evaluating the utility of data altered to protect confidentiality. Am. Stat. 60(3), 224–232 (2006)

  14. Kim, J.J.: A method for limiting disclosure in microdata based on random noise and transformation. In: Proceedings of the ASA Section on Survey Research Methodology, pp. 303–308 (1986)

  15. Lin, Y.-X.: Density approximant based on noise multiplied data. In: Domingo-Ferrer, J. (ed.) PSD 2014. LNCS, vol. 8744, pp. 89–104. Springer, Heidelberg (2014)

  16. Mitra, R., Reiter, J.P.: Adjusting survey weights when altering identifying design variables via synthetic data. In: Domingo-Ferrer, J., Franconi, L. (eds.) PSD 2006. LNCS, vol. 4302, pp. 177–188. Springer, Heidelberg (2006)

  17. Moore, R.: Controlled data swapping techniques for masking public use microdata sets. U.S. Census Bureau (1996)

  18. Muralidhar, K., Sarathy, R.: Data shuffling: a new masking approach for numerical data. Manag. Sci. 52(5), 658–670 (2006)

  19. Oganian, A.: Security and Information Loss in Statistical Database Protection. Ph.D. thesis, Universitat Politècnica de Catalunya (2003)

  20. Oganian, A., Karr, A.F.: Combinations of SDC methods for microdata protection. In: Domingo-Ferrer, J., Franconi, L. (eds.) PSD 2006. LNCS, vol. 4302, pp. 102–113. Springer, Heidelberg (2006)

  21. Oganian, A., Karr, A.F.: Masking methods that preserve positivity constraints in microdata. J. Stat. Plann. Infer. 141(1), 31–41 (2011)

  22. Reiss, S.P., Post, M.J., Dalenius, T.: Non-reversible privacy transformations. In: Proceedings of the ACM Symposium on Principles of Database Systems, 29–31 March, pp. 139–146 (1982)

  23. Rosenbaum, P.R., Rubin, D.B.: The central role of the propensity score in observational studies for causal effects. Biometrika 70, 41–55 (1983)

  24. Takemura, A.: Local recoding and record swapping by maximum weight matching for disclosure control of microdata sets. J. Offic. Stat. 18, 275–289 (2002)

  25. Templ, M.: Statistical disclosure control for microdata using the R-package sdcMicro. Trans. Data Priv. 1(2), 67–85 (2008)

  26. Torra, V.: Microaggregation for categorical variables: a median based approach. In: Domingo-Ferrer, J., Torra, V. (eds.) PSD 2004. LNCS, vol. 3050, pp. 162–174. Springer, Heidelberg (2004)

  27. Valliant, R., Dever, J.A., Kreuter, F.: Package ‘PracTools’: Tools for Designing and Weighting Survey Samples (2015). https://cran.r-project.org/web/packages/PracTools/PracTools.pdf

  28. Woo, M.-J., Reiter, J.P., Oganian, A., Karr, A.F.: Global measures of data utility for microdata masked for disclosure limitation. J. Priv. Confidentiality 1(1), 111–124 (2009)

Acknowledgments

The authors would like to thank Alan Dorfman and Van Parsons for valuable suggestions and help during the preparation of the paper. The findings and conclusions in this paper are those of the authors and do not necessarily represent the official position of the Centers for Disease Control and Prevention.

Author information

Corresponding author

Correspondence to Anna Oganian.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Oganian, A., Lesaja, G. (2016). Propensity Score Based Conditional Group Swapping for Disclosure Limitation of Strata-Defining Variables. In: Domingo-Ferrer, J., Pejić-Bach, M. (eds) Privacy in Statistical Databases. PSD 2016. Lecture Notes in Computer Science, vol 9867. Springer, Cham. https://doi.org/10.1007/978-3-319-45381-1_6

  • DOI: https://doi.org/10.1007/978-3-319-45381-1_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-45380-4

  • Online ISBN: 978-3-319-45381-1

  • eBook Packages: Computer Science (R0)
