A Survey of Utility-based Privacy-Preserving Data Transformation Methods

Chapter in: Privacy-Preserving Data Mining

Part of the book series: Advances in Database Systems (ADBS, volume 34)

Privacy-preserving data processing has received considerable attention as a serious concern in data publishing and analysis. Privacy preservation often leads to information loss, so the goal is to minimize the loss of utility while the privacy requirement is still satisfied. In this chapter, we systematically survey utility-based privacy preservation methods. We first briefly discuss privacy models and utility measures, and then review four recently proposed methods for utility-based privacy preservation.

We first introduce the utility-based anonymization method, which maximizes the quality of the anonymized data for query answering and discernibility. We then introduce the top-down specialization (TDS) method and the progressive disclosure algorithm (PDA) for privacy preservation in classification problems. Finally, we introduce the anonymized marginal method, which publishes the anonymized projection of a table to increase utility while satisfying the privacy requirement.
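To make the trade-off between privacy and utility concrete, the following is a minimal sketch, not taken from the chapter itself. It assumes k-anonymity as the privacy model and a commonly used form of the discernibility penalty (the sum of squared equivalence-class sizes) as the utility measure. The toy table, the attribute names, and the function names are all hypothetical.

```python
from collections import Counter


def equivalence_classes(rows, quasi_identifiers):
    """Count how many rows share each combination of quasi-identifier values."""
    return Counter(tuple(row[a] for a in quasi_identifiers) for row in rows)


def is_k_anonymous(rows, quasi_identifiers, k):
    """A table is k-anonymous if every equivalence class has at least k rows."""
    classes = equivalence_classes(rows, quasi_identifiers)
    return all(size >= k for size in classes.values())


def discernibility_penalty(rows, quasi_identifiers):
    """Sum of squared class sizes: larger classes mean less distinguishable rows."""
    classes = equivalence_classes(rows, quasi_identifiers)
    return sum(size * size for size in classes.values())


# Hypothetical, already-generalized toy table (illustration only).
table = [
    {"age": "20-29", "zip": "537**", "disease": "flu"},
    {"age": "20-29", "zip": "537**", "disease": "cold"},
    {"age": "30-39", "zip": "538**", "disease": "flu"},
    {"age": "30-39", "zip": "538**", "disease": "asthma"},
    {"age": "30-39", "zip": "538**", "disease": "flu"},
]
QI = ["age", "zip"]

print(is_k_anonymous(table, QI, 2))       # True: class sizes are 2 and 3
print(is_k_anonymous(table, QI, 3))       # False: one class has only 2 rows
print(discernibility_penalty(table, QI))  # 2*2 + 3*3 = 13
```

Under these assumptions, generalizing further to reach a larger k merges classes and raises the penalty, which is the kind of utility loss a utility-based method tries to keep small.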

Copyright information

© 2008 Springer Science+Business Media, LLC

About this chapter

Cite this chapter

Hua, M., Pei, J. (2008). A Survey of Utility-based Privacy-Preserving Data Transformation Methods. In: Aggarwal, C.C., Yu, P.S. (eds) Privacy-Preserving Data Mining. Advances in Database Systems, vol 34. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-70992-5_9

  • DOI: https://doi.org/10.1007/978-0-387-70992-5_9

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-0-387-70991-8

  • Online ISBN: 978-0-387-70992-5

  • eBook Packages: Computer Science, Computer Science (R0)
