Data Anonymization Based on Natural Equivalent Class | IEEE Conference Publication | IEEE Xplore

Abstract:

Data anonymization is widely used to preserve the utility of published datasets without compromising privacy. State-of-the-art data anonymization approaches are mainly single-record-based algorithms: they group similar records together one by one, then form equivalence classes through generalization. However, these algorithms do not exploit the equivalence classes that already exist in the raw dataset. In this paper, we propose a new concept named the natural equivalent class, which refers to a set of records in the raw dataset that naturally share the same quasi-identifier values. We theoretically prove that natural equivalent classes can reduce both the computational complexity of clustering algorithms and the information loss. We then propose a novel clustering-based anonymization algorithm that tries to cluster records without separating any natural equivalent class. Extensive experiments on real-world datasets show that our approach outperforms previous clustering-based anonymization algorithms in terms of efficiency and data utility.
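
The definition of a natural equivalent class can be illustrated with a minimal Python sketch. It is not the paper's algorithm: it only shows the grouping step implied by the definition, namely collecting records whose quasi-identifier values are identical. The function name `natural_equivalent_classes`, the attribute names, and the toy records are assumptions made for illustration.

```python
from collections import defaultdict


def natural_equivalent_classes(records, quasi_identifiers):
    """Group records that share identical quasi-identifier values.

    Hypothetical helper: each key is the tuple of quasi-identifier
    values, each value is the list of records carrying that tuple.
    """
    classes = defaultdict(list)
    for record in records:
        key = tuple(record[attr] for attr in quasi_identifiers)
        classes[key].append(record)
    return dict(classes)


# Toy dataset (invented for illustration): "age" and "zip" act as
# quasi-identifiers, "disease" as the sensitive attribute.
records = [
    {"age": 34, "zip": "4000", "disease": "flu"},
    {"age": 34, "zip": "4000", "disease": "asthma"},
    {"age": 51, "zip": "4100", "disease": "flu"},
    {"age": 34, "zip": "4000", "disease": "diabetes"},
    {"age": 51, "zip": "4100", "disease": "cold"},
    {"age": 27, "zip": "4200", "disease": "flu"},
]

classes = natural_equivalent_classes(records, ["age", "zip"])
sizes = {key: len(members) for key, members in classes.items()}

# A clustering step that treats each class as one unit handles
# len(classes) items instead of len(records) items.
print(len(records), "records ->", len(classes), "natural equivalent classes")
print(sizes)
```

In this toy case six records collapse into three classes, so a clustering procedure that never splits a class has fewer units to compare. Records inside one class already agree on every quasi-identifier, so keeping them together requires no generalization among them.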
Date of Conference: 06-08 May 2019
Date Added to IEEE Xplore: 08 August 2019
Conference Location: Porto, Portugal
