A Survey of Multiplicative Perturbation for Privacy-Preserving Data Mining

  • Chapter
Privacy-Preserving Data Mining

Part of the book series: Advances in Database Systems (ADBS, volume 34)

The major challenge of data perturbation is to balance the level of privacy guarantee against the level of data utility. Data privacy and data utility are commonly regarded as conflicting requirements in privacy-preserving data mining systems and applications. Multiplicative perturbation algorithms aim to improve data privacy while maintaining the desired level of data utility by selectively preserving task- and model-specific information during the perturbation process. Because this information is preserved, a set of “transformation-invariant data mining models” can be applied directly to the perturbed data and still achieve the required model accuracy. A multiplicative perturbation algorithm can often find multiple data transformations that preserve the required data utility, so the next major challenge is to find a good transformation that provides a satisfactory level of privacy guarantee. In this chapter, we review three representative multiplicative perturbation methods (rotation perturbation, projection perturbation, and geometric perturbation) and discuss the technical issues and research challenges. We first describe the task- and model-specific information for a class of data mining models, and the transformations that can (approximately) preserve that information. We then discuss the design of appropriate privacy evaluation models for multiplicative perturbations, and give an overview of how the privacy evaluation model is used to measure the level of privacy guarantee under different types of attacks.
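As a rough illustration of what "preserving task- and model-specific information" can mean, the sketch below applies a random orthogonal transformation plus a translation to a handful of records and checks that pairwise Euclidean distances are unchanged, which is the kind of information distance-based models rely on. This is a minimal sketch under stated assumptions, not the chapter's algorithm: the function names (`random_orthogonal`, `perturb`, `distance`), the toy data, and the choice of Gram-Schmidt to build the matrix are all hypothetical, and it says nothing about how much privacy such a transformation provides.

```python
import math
import random


def random_orthogonal(d, rng):
    """Build a d x d orthogonal matrix (list of rows) by applying
    Gram-Schmidt to random Gaussian vectors. Illustrative only."""
    rows = []
    while len(rows) < d:
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        # Remove the components along the rows accepted so far.
        for r in rows:
            dot = sum(a * b for a, b in zip(v, r))
            v = [a - dot * b for a, b in zip(v, r)]
        norm = math.sqrt(sum(a * a for a in v))
        if norm > 1e-8:  # skip a (very unlikely) degenerate draw
            rows.append([a / norm for a in v])
    return rows


def perturb(records, R, t):
    """Map each record x to R x + t (orthogonal transform, then translation)."""
    d = len(R)
    return [
        [sum(R[i][j] * x[j] for j in range(d)) + t[i] for i in range(d)]
        for x in records
    ]


def distance(a, b):
    """Euclidean distance between two records."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


rng = random.Random(0)  # fixed seed so the sketch is repeatable
dim, n = 4, 5
records = [[rng.uniform(0.0, 100.0) for _ in range(dim)] for _ in range(n)]

R = random_orthogonal(dim, rng)
t = [rng.uniform(-10.0, 10.0) for _ in range(dim)]
perturbed = perturb(records, R, t)

# Largest change in any pairwise distance caused by the perturbation.
max_gap = max(
    abs(distance(records[i], records[j]) - distance(perturbed[i], perturbed[j]))
    for i in range(n)
    for j in range(i + 1, n)
)
print("distances preserved up to rounding:", max_gap < 1e-9)
```

The perturbed records look nothing like the originals coordinate by coordinate, yet a model that depends only on pairwise distances would see the same geometry. A random projection to a lower dimension would preserve distances only approximately, which is the "(approximately)" case mentioned above.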

Copyright information

© 2008 Springer Science+Business Media, LLC

About this chapter

Cite this chapter

Chen, K., Liu, L. (2008). A Survey of Multiplicative Perturbation for Privacy-Preserving Data Mining. In: Aggarwal, C.C., Yu, P.S. (eds) Privacy-Preserving Data Mining. Advances in Database Systems, vol 34. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-70992-5_7

  • DOI: https://doi.org/10.1007/978-0-387-70992-5_7

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-0-387-70991-8

  • Online ISBN: 978-0-387-70992-5

  • eBook Packages: Computer Science (R0)
