Abstract
Since individual data are being collected everywhere in the era of data explosion, privacy preserving has become a necessity for any data mining task. Therefore, data transformation to ensure privacy preservation is needed. Meanwhile, the transformed data must have quality to be used in the intended data mining task, i.e. the impact on the data quality with regard to the data mining task must be minimized. However, the data transformation problem to preserve the data privacy while minimizing the impact has been proven as an NP-hard. In this paper, we address the problem of maintaining the data quality in the scenarios which the transformed data will be used to build associative classification models. We propose a novel heuristic algorithm to preserve the privacy and maintain the data quality. Our heuristic is guided by the classification correction rate (CCR) of the given datasets. Our proposed algorithm is validated by experiments. From the experiments, the results show that the proposed algorithm is not only efficient, but also highly effective.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
References
Sweeney, L.: k-anonymity: a model for protecting privacy. International Journal on Uncertainty, Fuzziness and Knowledge-based Systems 10, 557–570 (2002)
Fung, B.C.M., Wang, K., Yu, P.S.: Top-down specialization for information and privacy preservation. In: Proceedings of the 21st International Conference on Data Engineering, pp. 205–216. IEEE Computer Society, Los Alamitos (2005)
Bayardo Jr., R.J., Agrawal, R.: Data privacy through optimal k-anonymization. In: Proceedings of the 21st IEEE ICDE International Conference on Data Engineering, pp. 217–228. IEEE Computer Society, Los Alamitos (2005)
Li, J., Wong, R.C.W., Fu, A.W.C., Pei, J.: Achieving k-anonymity by clustering in attribute hierarchical structures. In: Tjoa, A.M., Trujillo, J. (eds.) DaWaK 2006. LNCS, vol. 4081, pp. 405–416. Springer, Heidelberg (2006)
Meyerson, A., Williams, R.: On the complexity of optimal k-anonymity. In: Proceedings of the Twenty-third ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, pp. 223–228. ACM, New York (2004)
Sweeney, L.: Achieving k-anonymity privacy protection using generalization and suppression. International Journal on Uncertainty, Fuzziness and Knowledge-based Systems 10, 571–588 (2002)
Truta, T.M., Campan, A.: K-anonymization incremental maintenance and optimization techniques. In: SAC 2007: Proceedings of the 2007 ACM symposium on Applied computing, pp. 380–387. ACM, New York (2007)
Wang, K., Fung, B.C.M.: Anonymizing sequential releases. In: KDD 2006: Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, pp. 414–423. ACM Press, New York (2006)
Wang, K., Yu, P.S., Chakraborty, S.: Bottom-up generalization: A data mining solution to privacy protection. In: Proceedings of the 4th IEEE International Conference on Data Mining, pp. 249–256. IEEE Computer Society, Los Alamitos (2004)
Li, W., Han, J., Pei, J.: Cmar: Accurate and efficient classification based on multiple class-association rules. In: Proceedings of the 2001 IEEE ICDM International Conference on Data Mining, Washington, DC, USA, pp. 369–376. IEEE Computer Society, Los Alamitos (2001)
Liu, B., Hsu, W., Ma, Y.: Integrating classification and association rule mining. In: Proceedings of the fourth ACM SIGKDD international conference on Knowledge discovery and data mining, pp. 80–86. AAAI Press, Menlo Park (1998)
Agrawal, R., Imielinski, T., Swami, A.: Mining association rules between sets of items in large databases. In: SIGMOD 1993: Proceedings of the 1993 ACM SIGMOD international conference on Management of data, pp. 207–216. ACM Press, New York (1993)
Harnsamut, N., Natwichai, J., Sun, X., Li, X.: Data quality in privacy preserving for associative classification. In: Proceedings of the Third International Conference on Advanced Data Mining and Applications (to appear, 2008)
Blake, C., Merz, C.: UCI repository of machine learning databases (1998)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Harnsamut, N., Natwichai, J. (2008). A Novel Heuristic Algorithm for Privacy Preserving of Associative Classification. In: Ho, TB., Zhou, ZH. (eds) PRICAI 2008: Trends in Artificial Intelligence. PRICAI 2008. Lecture Notes in Computer Science(), vol 5351. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-89197-0_27
Download citation
DOI: https://doi.org/10.1007/978-3-540-89197-0_27
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-89196-3
Online ISBN: 978-3-540-89197-0
eBook Packages: Computer ScienceComputer Science (R0)