A well known method for privacy-preserving data mining is that of randomization. In randomization, we add noise to the data so that the behavior of the individual records is masked. However, the aggregate behavior of the data distribution can be reconstructed by subtracting out the noise from the data. The reconstructed distribution is often sufficient for a variety of data mining tasks such as classification. In this chapter, we will provide a survey of the randomization method for privacy-preserving data mining.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
Similar content being viewed by others
References
Aggarwal C. C.: On Randomization, Public Information and the Curse of Dimensionality. ICDE Conference, 2007.
Aggarwal C. C., Yu P. S.: On Privacy-Preservation of Text and Sparse Binary Data with Sketches. SIAM Conference on Data Mining, 2007.
Agrawal R., Srikant R. Privacy-Preserving Data Mining. Proceedings of the ACM SIGMOD Conference, 2000.
Agrawal R., Srikant R., Thomas D. Privacy-Preserving OLAP. Proceedings of the ACM SIGMOD Conference, 2005.
Agrawal D. Aggarwal C. C. On the Design and Quantification of Privacy-Preserving Data Mining Algorithms. ACM PODS Conference, 2002.
Chen K., Liu L.: Privacy-preserving data classification with rotation perturbation. ICDM Conference, 2005.
Evfimievski A., Gehrke J., Srikant R. Limiting Privacy Breaches in Privacy Preserving Data Mining. ACM PODS Conference, 2003.
Evfimievski A., Srikant R., Agrawal R., Gehrke J.: Privacy-Preserving Mining of Association Rules. ACM KDD Conference, 2002.
Fienberg S., McIntyre J.: Data Swapping: Variations on a Theme by Dalenius and Reiss. Technical Report, National Institute of Statistical Sciences, 2003.
Gambs S., Kegl B., Aimeur E.: Privacy-Preserving Boosting. Knowledge Discovery and Data Mining Journal, to appear.
Huang Z., Du W., Chen B.: Deriving Private Information from Randomized Data. pp. 37–48, ACM SIGMOD Conference, 2005.
Warner S. L. Randomized Response: A survey technique for eliminating evasive answer bias. Journal of American Statistical Association, 60(309):63–69, March 1965.
Johnson W., Lindenstrauss J.: Extensions of Lipshitz Mapping into Hilbert Space, Contemporary Math. vol. 26, pp. 189–206, 1984.
Kargupta H., Datta S., Wang Q., Sivakumar K.: On the Privacy Preserving Properties of Random Data Perturbation Techniques. ICDM Conference, pp. 99–106, 2003.
Kim J., Winkler W.: Multiplicative Noise for Masking Continuous Data, Technical Report Statistics 2003-01, Statistical Research Division, US Bureau of the Census, Washington D.C., Apr. 2003.
Liew C. K., Choi U. J., Liew C. J. A data distortion by probability distribution. ACM TODS, 10(3):395–411, 1985.
Liu K., Kargupta H., Ryan J.: Random Projection Based Multiplicative Data Perturbation for Privacy Preserving Distributed Data Mining. IEEE Transactions on Knowledge and Data Engineering, 18(1), 2006.
Liu K., Giannella C., Kargupta H.: An Attacker’s View of Distance Preserving Maps for Privacy-Preserving Data Mining. PKDD Conference, 2006.
Mukherjee S., Chen Z., Gangopadhyay S.: A privacy-preserving technique for Euclidean distance-based mining algorithms using Fourier based transforms, VLDB Journal, 2006.
Oliveira S. R. M., Zaane O.: Privacy Preserving Clustering by Data Transformation, Proc. 18th Brazilian Symp. Databases, pp. 304–318, Oct. 2003.
Oliveira S. R. M., Zaiane O.: Data Perturbation by Rotation for Privacy-Preserving Clustering, Technical Report TR04–17, Department of Computing Science, University of Alberta, Edmonton, AB, Canada, August 2004.
Polat H., Du W.: SVD-based collaborative filtering with privacy. ACM SAC Symposium, 2005.
Polat H., Du W.: Privacy-preserving collaborative filtering with randomized perturbation techniques. ICDM Conference, 2003.
Rizvi S., Haritsa J.: Maintaining Data Privacy in Association Rule Mining. VLDB Conference, 2002.
Samarati P.: Protecting Respondents’ Identities in Microdata Release. IEEE Trans. Knowl. Data Eng. 13(6): 1010–1027 (2001).
Shannon C. E.: The Mathematical Theory of Communication, University of Illinois Press, 1949.
Silverman B. W.: Density Estimation for Statistics and Data Analysis. Chapman and Hall, 1986.
Li F., Sun J., Papadimitriou S., Mihaila G., Stanoi I.: Hiding in the Crowd: Privacy Preservation on Evolving Streams through Correlation Tracking. ICDE Conference, 2007.
Zhang P., Tong Y., Tang S., Yang D.: Privacy-Preserving Naive Bayes Classifier. Lecture Notes in Computer Science, Vol 3584, 2005.
Zhu Y., Liu L. Optimal Randomization for Privacy- Preserving Data Mining. ACM KDD Conference, 2004.
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2008 Springer Science+Business Media, LLC
About this chapter
Cite this chapter
Aggarwal, C.C., Yu, P.S. (2008). A Survey of Randomization Methods for Privacy-Preserving Data Mining. In: Aggarwal, C.C., Yu, P.S. (eds) Privacy-Preserving Data Mining. Advances in Database Systems, vol 34. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-70992-5_6
Download citation
DOI: https://doi.org/10.1007/978-0-387-70992-5_6
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-70991-8
Online ISBN: 978-0-387-70992-5
eBook Packages: Computer ScienceComputer Science (R0)