Hiding Emerging Patterns with Local Recoding Generalization

  • Conference paper
Advances in Knowledge Discovery and Data Mining (PAKDD 2010)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6118)

Abstract

Establishing strategic partnerships often requires organizations to publish and share meaningful data to support collaborative business activities. An equally important concern for them is protecting sensitive patterns, such as unique emerging sales opportunities, embedded in their data. In this paper, we contribute to the area of data sanitization by introducing an optimization-based local recoding methodology that hides emerging patterns (EPs) in a dataset while preserving the underlying frequent itemsets as far as possible. We propose a novel heuristic solution that exploits the unique properties of hiding EPs to carry out iterative local recoding generalization. We also propose a metric that measures (i) frequent-itemset distortion, which quantifies the quality of the published data, and (ii) the degree of reduction in emerging patterns, to guide a bottom-up recoding process. We have implemented the proposed solution and experimentally verified its effectiveness on a benchmark dataset.
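
To make the idea concrete, the following is a minimal sketch and not the authors' algorithm. It assumes the standard definition from the emerging-pattern literature, in which an itemset is an emerging pattern from dataset D1 to D2 when its growth rate (support in D2 divided by support in D1) reaches a threshold. The toy transactions, the one-level taxonomy `PARENT`, and the helper names `support`, `growth_rate`, and `recode` are all hypothetical. The paper's heuristic and its distortion metric are not reproduced here.

```python
from typing import Dict, FrozenSet, List, Optional, Set

Transaction = FrozenSet[str]


def support(itemset: FrozenSet[str], dataset: List[Transaction]) -> float:
    """Fraction of transactions that contain every item of `itemset`."""
    if not dataset:
        return 0.0
    return sum(1 for t in dataset if itemset <= t) / len(dataset)


def growth_rate(itemset: FrozenSet[str],
                d1: List[Transaction],
                d2: List[Transaction]) -> float:
    """Growth rate of `itemset` from d1 to d2 (assumed standard definition)."""
    s1, s2 = support(itemset, d1), support(itemset, d2)
    if s1 == 0.0:
        return 0.0 if s2 == 0.0 else float("inf")
    return s2 / s1


def recode(dataset: List[Transaction],
           parent: Dict[str, str],
           rows: Optional[Set[int]] = None) -> List[Transaction]:
    """Replace items by their taxonomy parent in the selected rows only.

    Passing a subset of row indices gives a local recoding; rows=None
    recodes every transaction, which is what the toy demo below does.
    """
    return [
        frozenset(parent.get(i, i) for i in t) if rows is None or idx in rows else t
        for idx, t in enumerate(dataset)
    ]


# Hypothetical one-level taxonomy: both leaf values generalize to "alcohol".
PARENT = {"beer": "alcohol", "wine": "alcohol"}

d1 = [frozenset(t) for t in (
    {"wine", "chips"}, {"wine", "chips"}, {"beer", "chips"}, {"wine", "bread"})]
d2 = [frozenset(t) for t in (
    {"beer", "chips"}, {"beer", "chips"}, {"beer", "chips"}, {"wine", "bread"})]

# Before recoding: {beer, chips} has support 0.25 in d1 and 0.75 in d2.
before = growth_rate(frozenset({"beer", "chips"}), d1, d2)

# After recoding: the generalized itemset has support 0.75 in both datasets.
g1, g2 = recode(d1, PARENT), recode(d2, PARENT)
after = growth_rate(frozenset({"alcohol", "chips"}), g1, g2)

print(before, after)  # 3.0 1.0
```

With an assumed growth-rate threshold of 2, `{beer, chips}` counts as an emerging pattern before recoding, while its generalized form `{alcohol, chips}` does not. The generalized itemset also stays frequent in both datasets, which is the kind of trade-off between hiding EPs and preserving frequent itemsets that the abstract describes.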

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Cheng, M.W.K., Choi, B.K.K., Cheung, W.K.W. (2010). Hiding Emerging Patterns with Local Recoding Generalization. In: Zaki, M.J., Yu, J.X., Ravindran, B., Pudi, V. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2010. Lecture Notes in Computer Science, vol 6118. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-13657-3_19

  • DOI: https://doi.org/10.1007/978-3-642-13657-3_19

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-13656-6

  • Online ISBN: 978-3-642-13657-3

  • eBook Packages: Computer Science (R0)
