Abstract
Privacy protection in publishing transaction data is an important problem. A key feature of transaction data is its extreme sparsity, which renders any single technique ineffective in anonymizing such data. Among recent works, some incur high information loss, some produce data that is hard to interpret, and some suffer from performance drawbacks. This paper proposes integrating generalization and suppression to reduce information loss. The integration is non-trivial, however, and we propose novel techniques to address its efficiency and scalability challenges. Extensive experiments on real-world databases show that this approach outperforms the state-of-the-art methods, including global generalization, local generalization, and total suppression. In addition, transaction data anonymized by this approach can be analyzed by standard data mining tools, a property that local generalization fails to provide.
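To make the two operations named in the abstract concrete, the following is a minimal Python sketch, not the paper's algorithm. It assumes a hypothetical item taxonomy (`PARENT`), toy transactions, and hand-picked items to generalize or suppress. The helper names `generalize`, `suppress`, and `item_support` are illustrative only. Generalization here means replacing an item by its taxonomy parent in every transaction, and suppression means deleting an item from every transaction.

```python
from collections import Counter

# Hypothetical item taxonomy (child -> parent); an assumption, not from the paper.
PARENT = {"apple": "fruit", "banana": "fruit", "milk": "dairy", "cheese": "dairy"}


def generalize(transactions, items_to_generalize):
    """Replace every occurrence of the chosen items by their taxonomy parent."""
    return [
        frozenset(PARENT[i] if i in items_to_generalize else i for i in t)
        for t in transactions
    ]


def suppress(transactions, items_to_suppress):
    """Delete the chosen items from every transaction."""
    return [
        frozenset(i for i in t if i not in items_to_suppress)
        for t in transactions
    ]


def item_support(transactions):
    """Count in how many transactions each item appears."""
    return Counter(i for t in transactions for i in t)


# Toy data: "caviar" is rare and has no entry in the taxonomy.
transactions = [
    {"apple", "milk"},
    {"banana", "milk"},
    {"apple", "cheese", "caviar"},
]

# Generalize the fruit items, then suppress the rare item.
generalized = generalize(transactions, {"apple", "banana"})
published = suppress(generalized, {"caviar"})
```

Each published transaction is still a plain set of items, so an ordinary frequency count such as `item_support(published)` applies without modification. This toy example illustrates only that property. How the paper chooses which items to generalize or suppress is not specified in the abstract and is not modeled here.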
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Liu, J., Wang, K. (2010). Anonymizing Transaction Data by Integrating Suppression and Generalization. In: Zaki, M.J., Yu, J.X., Ravindran, B., Pudi, V. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2010. Lecture Notes in Computer Science, vol 6118. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-13657-3_20
DOI: https://doi.org/10.1007/978-3-642-13657-3_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-13656-6
Online ISBN: 978-3-642-13657-3
eBook Packages: Computer Science, Computer Science (R0)