A Novel Heuristic Algorithm for Privacy Preserving of Associative Classification

  • Conference paper
PRICAI 2008: Trends in Artificial Intelligence (PRICAI 2008)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5351)

Abstract

Because individual data are being collected everywhere in the era of data explosion, privacy preservation has become a necessity for any data mining task, and data must be transformed to ensure it. At the same time, the transformed data must remain useful for the intended data mining task, i.e. the impact of the transformation on data quality with regard to that task must be minimized. However, the problem of transforming data to preserve privacy while minimizing this impact has been proven to be NP-hard. In this paper, we address the problem of maintaining data quality in scenarios where the transformed data will be used to build associative classification models. We propose a novel heuristic algorithm that preserves privacy while maintaining data quality. The heuristic is guided by the classification correction rate (CCR) of the given datasets. We validate the proposed algorithm experimentally, and the results show that it is not only efficient but also highly effective.


Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Harnsamut, N., Natwichai, J. (2008). A Novel Heuristic Algorithm for Privacy Preserving of Associative Classification. In: Ho, TB., Zhou, ZH. (eds) PRICAI 2008: Trends in Artificial Intelligence. PRICAI 2008. Lecture Notes in Computer Science, vol 5351. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-89197-0_27

  • DOI: https://doi.org/10.1007/978-3-540-89197-0_27

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-89196-3

  • Online ISBN: 978-3-540-89197-0

  • eBook Packages: Computer Science, Computer Science (R0)
