A k-Anonymous Rule Clustering Approach for Data Publishing

Motoyuki Ohki; Masahiro Inuiguchi

doi:10.20965/jaciii.2017.p0980

JACIII Vol.21 No.6 pp. 980-988

(2017)

Paper:

Osaka University
1-3 Machikaneyama, Toyonaka, Osaka 560-8531, Japan

Received: December 15, 2016
Accepted: April 20, 2017
Published: October 20, 2017
Keywords: decision rule, k-anonymity, similarity, clustering

Abstract

Classification rules should be open for public inspection to ensure fairness.

These rules are typically induced from a dataset. If an induced classification rule is supported by only a small number of objects in the dataset, publishing it can lead to identification of the objects supporting the rule, because such objects are distinctive. Once the objects are identified, further information about them can be retrieved. This identifiability is undesirable in terms of data privacy.

In this paper, to avoid such privacy breaches, we propose rule clustering to achieve k-anonymity for all induced rules, i.e., every induced rule is supported by at least k objects in the dataset. The proposed approach merges similar rules to satisfy k-anonymity while aiming to maintain classification accuracy. Two numerical experiments were conducted to evaluate both the accuracy of the classifier built from the rules obtained by the proposed method and the ratio of decision classes revealed from leaked information about objects. The experimental results show the usefulness of the proposed method.
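The idea of merging similar rules until each one is supported by at least k objects can be illustrated with a small sketch. This is not the authors' algorithm: the abstract does not specify the similarity measure, the merge operation, or the merge order, so everything below is an assumption made for illustration. The sketch assumes a rule is a set of allowed values per attribute plus a decision class, uses Jaccard similarity over attribute-value pairs, merges by taking the union of allowed values on shared attributes, and greedily merges each under-supported rule with its most similar rule of the same class. The names `supporters`, `similarity`, `merge`, and `k_anonymize` are hypothetical.

```python
from typing import Dict, FrozenSet, List, Tuple

Obj = Dict[str, str]                           # attribute -> value, plus a "class" key
Rule = Tuple[Dict[str, FrozenSet[str]], str]   # (attribute -> allowed values, decision class)


def supporters(rule: Rule, data: List[Obj]) -> List[int]:
    """Indices of objects that match the rule's condition and its decision class."""
    cond, cls = rule
    return [i for i, obj in enumerate(data)
            if obj["class"] == cls and all(obj[a] in vals for a, vals in cond.items())]


def similarity(r1: Rule, r2: Rule) -> float:
    """Assumed measure: Jaccard similarity over (attribute, value) pairs."""
    p1 = {(a, v) for a, vals in r1[0].items() for v in vals}
    p2 = {(a, v) for a, vals in r2[0].items() for v in vals}
    union = p1 | p2
    return len(p1 & p2) / len(union) if union else 1.0


def merge(r1: Rule, r2: Rule) -> Rule:
    """Assumed merge: keep shared attributes and take the union of allowed values.

    The merged rule covers every object that supported r1 or r2.
    """
    cond = {a: r1[0][a] | r2[0][a] for a in r1[0] if a in r2[0]}
    return cond, r1[1]


def k_anonymize(rules: List[Rule], data: List[Obj], k: int) -> List[Rule]:
    """Greedily merge under-supported rules until every rule has >= k supporters."""
    rules = list(rules)
    while True:
        weak = [r for r in rules if len(supporters(r, data)) < k]
        if not weak:
            return rules
        r = weak[0]
        partners = [q for q in rules if q is not r and q[1] == r[1]]
        if not partners:
            # No rule of the same class to merge with: suppress the rule.
            rules = [q for q in rules if q is not r]
            continue
        best = max(partners, key=lambda q: similarity(r, q))
        rules = [q for q in rules if q is not r and q is not best] + [merge(r, best)]


# Toy data, invented for illustration only.
data = [
    {"age": "young", "job": "clerk", "class": "yes"},
    {"age": "young", "job": "clerk", "class": "yes"},
    {"age": "young", "job": "nurse", "class": "yes"},
    {"age": "old", "job": "clerk", "class": "no"},
    {"age": "old", "job": "nurse", "class": "no"},
    {"age": "old", "job": "nurse", "class": "no"},
]
rules = [
    ({"age": frozenset({"young"}), "job": frozenset({"clerk"})}, "yes"),
    ({"age": frozenset({"young"}), "job": frozenset({"nurse"})}, "yes"),  # one supporter only
    ({"age": frozenset({"old"})}, "no"),
]
published = k_anonymize(rules, data, k=2)
for cond, cls in published:
    print({a: sorted(v) for a, v in cond.items()}, "->", cls)
```

In this toy run the rule supported by a single object is absorbed into a more general rule, so no published rule points to fewer than two objects. The sketch does not measure classification accuracy, which the proposed method also aims to preserve; a generalized rule may match objects of other classes, and a fuller treatment would need to account for that.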

Cite this article as:

M. Ohki and M. Inuiguchi, “A k-Anonymous Rule Clustering Approach for Data Publishing,” J. Adv. Comput. Intell. Intell. Inform., Vol.21 No.6, pp. 980-988, 2017.


This article is published under a Creative Commons Attribution-NoDerivatives 4.0 International License.
