Abstract
We propose the use of Latin Hypercube Sampling to create a synthetic data set that reproduces many of the essential features of an original data set while providing disclosure protection. The synthetic micro data can also be used to generate additive or multiplicative noise which, when merged with the original data, provides disclosure protection. The technique can further be used to create hybrid micro data sets containing predetermined mixtures of real and synthetic data. We demonstrate the basic properties of the synthetic data approach by applying the Latin Hypercube Sampling technique to a database supported by the Energy Information Administration. Latin Hypercube Sampling, combined with the goal of reproducing the rank correlation structure rather than the Pearson correlation structure, has not previously been applied to the disclosure protection problem. Given its properties, this technique offers multiple alternatives to current methods for providing disclosure protection for large data sets.
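The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the marginals are approximated by linearly interpolated empirical quantiles of the original data, and it induces the rank correlation with a restricted-pairing step in the style of Iman and Conover, using van der Waerden scores. All function names (`latin_hypercube`, `iman_conover_order`, `synthesize`) are hypothetical and chosen only for illustration.

```python
import numpy as np
from statistics import NormalDist


def latin_hypercube(n, d, rng):
    """Return an n x d sample on [0, 1) with one point per equal-probability stratum."""
    u = (np.arange(n)[:, None] + rng.random((n, d))) / n
    for j in range(d):
        # Shuffle each column independently so the strata are paired at random.
        u[:, j] = rng.permutation(u[:, j])
    return u


def ranks(x):
    """Column-wise ranks 0..n-1 (ties broken by position, an assumption)."""
    return np.argsort(np.argsort(x, axis=0), axis=0)


def spearman(x):
    """Rank (Spearman) correlation matrix of the columns of x."""
    return np.corrcoef(ranks(x), rowvar=False)


def iman_conover_order(target_corr, n, rng):
    """Ranks of a score matrix whose Pearson correlation equals target_corr."""
    d = target_corr.shape[0]
    # van der Waerden scores, permuted independently in each column.
    base = np.array([NormalDist().inv_cdf(i / (n + 1)) for i in range(1, n + 1)])
    scores = np.column_stack([rng.permutation(base) for _ in range(d)])
    q = np.linalg.cholesky(np.corrcoef(scores, rowvar=False))
    p = np.linalg.cholesky(target_corr)  # requires a positive definite target
    # Remove the accidental correlation of the scores, then impose the target.
    adjusted = scores @ np.linalg.inv(q).T @ p.T
    return ranks(adjusted)


def synthesize(original, rng):
    """Draw synthetic records that mimic the marginals and rank correlation."""
    n, d = original.shape
    u = latin_hypercube(n, d, rng)
    order = iman_conover_order(spearman(original), n, rng)
    synthetic = np.empty_like(original, dtype=float)
    for j in range(d):
        # Assumption: the marginal is the interpolated empirical quantile function.
        values = np.sort(np.quantile(original[:, j], u[:, j]))
        # Rearrange the sampled values so their ranks follow the induced order.
        synthetic[:, j] = values[order[:, j]]
    return synthetic


# Usage on simulated data (the data set here is made up for illustration).
rng = np.random.default_rng(0)
original = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.8], [0.8, 1.0]], size=500)
synthetic = synthesize(original, rng)
print(spearman(original).round(2))
print(spearman(synthetic).round(2))
```

Only the rank correlation is targeted, so the pairing step works on ranks and leaves the sampled marginal values unchanged. The induced rank correlation is approximate rather than exact.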
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Dandekar, R.A., Cohen, M., Kirkendall, N. (2002). Sensitive Micro Data Protection Using Latin Hypercube Sampling Technique. In: Domingo-Ferrer, J. (eds) Inference Control in Statistical Databases. Lecture Notes in Computer Science, vol 2316. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-47804-3_9
DOI: https://doi.org/10.1007/3-540-47804-3_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-43614-0
Online ISBN: 978-3-540-47804-1
eBook Packages: Springer Book Archive