Disclosure Risks of Distance Preserving Data Transformations

  • Conference paper

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 5069))

Abstract

One of the fundamental challenges facing the data mining community today is privacy. The question of how to do data mining without violating the privacy of individuals remains open, and research into efficient methods continues. Data transformation was previously proposed as one efficient method for privacy preserving data mining when a party needs to outsource the data mining task, or when distributed data mining must be performed among multiple parties without any party disclosing its actual data. In this paper we study the safety of distance preserving data transformations proposed for privacy preserving data mining. We show that an adversary who knows the mutual distances between data objects, together with the probability distribution from which the objects are drawn, can recover the original data values with very high confidence. Experiments conducted on real and synthetic data sets demonstrate the effectiveness of the theoretical results.
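To make the term "distance preserving" concrete, the following minimal sketch applies a rotation plus translation to a few 2-D points and checks that all mutual distances are unchanged. It is an illustration only, not the attack studied in the paper: the function names, the choice of two dimensions, and the Gaussian sample data are assumptions made for this example.

```python
import math
import random


def rotate_translate(points, theta, shift):
    """Rotate 2-D points by theta, then translate them (an isometry)."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + shift[0], s * x + c * y + shift[1])
            for x, y in points]


def pairwise_distances(points):
    """Euclidean distances for all unordered pairs, in a fixed order."""
    return [math.dist(p, q)
            for i, p in enumerate(points)
            for q in points[i + 1:]]


# Hypothetical private data: five points drawn from a standard Gaussian.
rng = random.Random(0)
original = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(5)]

# The released data hides the coordinates behind a secret rotation and shift.
released = rotate_translate(
    original, theta=rng.uniform(0, 2 * math.pi), shift=(3.0, -1.5)
)

d_orig = pairwise_distances(original)
d_rel = pairwise_distances(released)

# Largest discrepancy between corresponding distances (floating-point noise only).
max_gap = max(abs(a - b) for a, b in zip(d_orig, d_rel))
print(max_gap < 1e-9)
```

The coordinates change, but every mutual distance survives the transformation. That surviving distance information is what the abstract says an adversary combines with knowledge of the data distribution.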

Editor information

Bertram Ludäscher, Nikos Mamoulis

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Turgay, E.O., Pedersen, T.B., Saygın, Y., Savaş, E., Levi, A. (2008). Disclosure Risks of Distance Preserving Data Transformations. In: Ludäscher, B., Mamoulis, N. (eds) Scientific and Statistical Database Management. SSDBM 2008. Lecture Notes in Computer Science, vol 5069. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-69497-7_8

  • DOI: https://doi.org/10.1007/978-3-540-69497-7_8

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-69476-2

  • Online ISBN: 978-3-540-69497-7

  • eBook Packages: Computer Science, Computer Science (R0)
