Why Swap When You Can Shuffle? A Comparison of the Proximity Swap and Data Shuffle for Numeric Data

  • Conference paper
Privacy in Statistical Databases (PSD 2006)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4302)

Abstract

The rank-based proximity swap has been suggested as a data masking mechanism for numerical data. Recently, more sophisticated procedures for masking numerical data, based on the concept of “shuffling” the data, have been proposed. In this study, we compare and contrast the performance of the swapping and shuffling procedures. The results indicate that the shuffling procedures perform better than data swapping in terms of both data utility and disclosure risk.

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Muralidhar, K., Sarathy, R., Dandekar, R. (2006). Why Swap When You Can Shuffle? A Comparison of the Proximity Swap and Data Shuffle for Numeric Data. In: Domingo-Ferrer, J., Franconi, L. (eds) Privacy in Statistical Databases. PSD 2006. Lecture Notes in Computer Science, vol 4302. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11930242_15

  • DOI: https://doi.org/10.1007/11930242_15

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-49330-3

  • Online ISBN: 978-3-540-49332-7

  • eBook Packages: Computer Science, Computer Science (R0)
