A Survey of Utility-based Privacy-Preserving Data Transformation Methods

Hua, Ming; Pei, Jian

doi:10.1007/978-0-387-70992-5_9

Part of the book series: Advances in Database Systems (ADBS, volume 34)

Privacy-preserving data processing has received considerable attention as a serious concern in data publishing and analysis. Privacy preservation often leads to information loss, so the goal is to minimize the loss of utility while still preserving privacy. In this chapter, we systematically survey utility-based privacy preservation methods. We first briefly discuss privacy models and utility measures, and then review four recently proposed methods for utility-based privacy preservation.

We first introduce the utility-based anonymization method, which maximizes the quality of the anonymized data for query answering and discernibility. We then introduce the top-down specialization (TDS) method and the progressive disclosure algorithm (PDA) for privacy preservation in classification problems. Finally, we introduce the anonymized marginal method, which publishes anonymized projections of a table to increase utility while satisfying the privacy requirement.

Author information

Authors and Affiliations

School of Computing Science, Simon Fraser University, V5A 1S6, Burnaby, BC, Canada
Ming Hua & Jian Pei

Editor information

Editors and Affiliations

IBM Thomas J. Watson Research Center, 19 Skyline Drive, 10532, Hawthorne, NY, USA
Charu C. Aggarwal
Department of Computer Science, University of Illinois at Chicago, 854 South Morgan Street, 60607-7053, Chicago, IL, USA
Philip S. Yu

About this chapter

Cite this chapter

Hua, M., Pei, J. (2008). A Survey of Utility-based Privacy-Preserving Data Transformation Methods. In: Aggarwal, C.C., Yu, P.S. (eds) Privacy-Preserving Data Mining. Advances in Database Systems, vol 34. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-70992-5_9

DOI: https://doi.org/10.1007/978-0-387-70992-5_9
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-70991-8
Online ISBN: 978-0-387-70992-5
eBook Packages: Computer Science, Computer Science (R0)