Hiding Emerging Patterns with Local Recoding Generalization

Cheng, Michael W. K.; Choi, Byron Koon Kau; Cheung, William Kwok Wai

doi:10.1007/978-3-642-13657-3_19

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6118)

Included in the following conference series:

Pacific-Asia Conference on Knowledge Discovery and Data Mining

Abstract

Establishing strategic partnerships often requires organizations to publish and share meaningful data to support collaborative business activities. An equally important concern is protecting sensitive patterns, such as unique emerging sales opportunities, embedded in that data. In this paper, we contribute to the area of data sanitization by introducing an optimization-based local recoding methodology that hides emerging patterns (EPs) in a dataset while preserving the underlying frequent itemsets as far as possible. We propose a novel heuristic solution that exploits the unique properties of hiding EPs to carry out iterative local recoding generalization. We also propose a metric that measures (i) the frequent-itemset distortion, which quantifies the quality of the published data, and (ii) the degree of reduction in emerging patterns, to guide a bottom-up recoding process. We have implemented the proposed solution and experimentally verified its effectiveness on a benchmark dataset.
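
As a rough illustration of the idea, the sketch below uses the commonly used definition of an emerging pattern: an itemset whose support grows from one dataset to another by at least a threshold ratio. It then shows how generalizing values in only some records (local recoding) can lower that growth rate. This is not the authors' heuristic or metric. The function names (`support`, `growth_rate`, `recode_local`), the toy records, the `Degree` generalization and the threshold of 2 are all assumptions made for the example.

```python
from typing import Callable, Dict, FrozenSet, List

Transaction = FrozenSet[str]


def support(itemset: Transaction, dataset: List[Transaction]) -> float:
    """Fraction of records in `dataset` that contain every item of `itemset`."""
    if not dataset:
        return 0.0
    return sum(1 for t in dataset if itemset <= t) / len(dataset)


def growth_rate(itemset: Transaction,
                d1: List[Transaction],
                d2: List[Transaction]) -> float:
    """Support ratio from d1 to d2 (0 if absent in both, inf if absent only in d1)."""
    s1, s2 = support(itemset, d1), support(itemset, d2)
    if s1 == 0.0:
        return float("inf") if s2 > 0.0 else 0.0
    return s2 / s1


def recode_local(dataset: List[Transaction],
                 mapping: Dict[str, str],
                 selected: Callable[[Transaction], bool]) -> List[Transaction]:
    """Generalize items via `mapping`, but only in records chosen by `selected`.

    Records that are not selected keep their original values, which is what
    makes the recoding local rather than global.
    """
    return [
        frozenset(mapping.get(item, item) for item in t) if selected(t) else t
        for t in dataset
    ]


# Hypothetical toy data: each record is a set of attribute values.
d1 = [
    frozenset({"Bachelors", "Tech"}),
    frozenset({"Masters", "Tech"}),
    frozenset({"Masters", "Tech"}),
    frozenset({"Bachelors", "Sales"}),
]
d2 = [
    frozenset({"Bachelors", "Tech"}),
    frozenset({"Bachelors", "Tech"}),
    frozenset({"Bachelors", "Tech"}),
    frozenset({"Masters", "Tech"}),
]

RHO = 2.0  # assumed growth-rate threshold for calling an itemset an EP
pattern = frozenset({"Bachelors", "Tech"})
before = growth_rate(pattern, d1, d2)  # 0.75 / 0.25 = 3.0, so an EP under RHO

# Assumed one-level generalization hierarchy: both degrees map to "Degree".
hierarchy = {"Bachelors": "Degree", "Masters": "Degree"}


def in_tech(t: Transaction) -> bool:
    """Select only the records that mention "Tech" for recoding."""
    return "Tech" in t


r1 = recode_local(d1, hierarchy, in_tech)
r2 = recode_local(d2, hierarchy, in_tech)

generalized = frozenset({"Degree", "Tech"})
after = growth_rate(generalized, r1, r2)   # 1.0 / 0.75, below RHO
residual = growth_rate(pattern, r1, r2)    # original itemset no longer occurs
```

In this toy case the record `{"Bachelors", "Sales"}` keeps its original value because it was not selected, while the recoded `Tech` records merge into a coarser itemset whose growth rate falls under the assumed threshold. A full method would also have to weigh how much each recoding step distorts the frequent itemsets, which this sketch does not attempt.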

Author information

Authors and Affiliations

Department of Computer Science, Hong Kong Baptist University, Kowloon Tong, Hong Kong
Michael W. K. Cheng, Byron Koon Kau Choi & William Kwok Wai Cheung

Editor information

Editors and Affiliations

Computer Science Department, Rensselaer Polytechnic Institute, USA
Mohammed J. Zaki
The Chinese University of Hong Kong, China
Jeffrey Xu Yu
IIT Madras, Chennai, India
B. Ravindran
IIIT, Hyderabad, India
Vikram Pudi

About this paper

Cite this paper

Cheng, M.W.K., Choi, B.K.K., Cheung, W.K.W. (2010). Hiding Emerging Patterns with Local Recoding Generalization. In: Zaki, M.J., Yu, J.X., Ravindran, B., Pudi, V. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2010. Lecture Notes in Computer Science, vol 6118. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-13657-3_19

DOI: https://doi.org/10.1007/978-3-642-13657-3_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-13656-6
Online ISBN: 978-3-642-13657-3
eBook Packages: Computer Science, Computer Science (R0)
