Skip to main content
Log in

Privacy-Preserving Algorithms for Multiple Sensitive Attributes Satisfying t-Closeness

  • Regular Paper
  • Published:
Journal of Computer Science and Technology Aims and scope Submit manuscript

Abstract

Although k-anonymity is a good way of publishing microdata for research purposes, it cannot resist several common attacks, such as attribute disclosure and the similarity attack. To resist these attacks, many refinements of kanonymity have been proposed with t-closeness being one of the strictest privacy models. While most existing t-closeness models address the case in which the original data have only one single sensitive attribute, data with multiple sensitive attributes are more common in practice. In this paper, we cover this gap with two proposed algorithms for multiple sensitive attributes and make the published data satisfy t-closeness. Based on the observation that the values of the sensitive attributes in any equivalence class must be as spread as possible over the entire data to make the published data satisfy t-closeness, both of the algorithms use different methods to partition records into groups in terms of sensitive attributes. One uses a clustering method, while the other leverages the principal component analysis. Then, according to the similarity of quasiidentifier attributes, records are selected from different groups to construct an equivalence class, which will reduce the loss of information as much as possible during anonymization. Our proposed algorithms are evaluated using a real dataset. The results show that the average speed of the first proposed algorithm is slower than that of the second proposed algorithm but the former can preserve more original information. In addition, compared with related approaches, both proposed algorithms can achieve stronger protection of privacy and reduce less.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Similar content being viewed by others

References

  1. Sánchez D, Martínez S, Domingo-Ferrer J. Comment on “Unique in the shopping mall: On the reidentifiability of credit card metadata”. Science, 2016, 351(6279): 1274.

  2. Sweeney L. k-anonymity: A model for protecting privacy. International Journal of Uncertainty Fuzziness and Knowledge-Based Systems, 2002, 10(5): 557-570.

  3. LeFevre K, DeWitt D J, Ramakrishnan R. Mondrian multidimensional k-anonymity. In Proc. the 22nd International Conference on Data Engineering, April 2006, p.25.

  4. Machanavajjhala A, Gehrke J, Kifer D. Venkitasubramaniam M. l-diversity: Privacy beyond k-anonymity. In Proc. the 22nd International Conference on Data Engineering, April 2006, p.24.

  5. Li N H, Li T C, Venkatasubramanian S. t-closeness: Privacy beyond k-anonymity and l-diversity. In Proc. the 23rd International Conference on Data Engineering, April 2007, pp.106-115.

  6. Domingo-Ferrer J, Soria-Comas J. From t-closeness to differential privacy and vice versa in data anonymization. Knowledge-Based Systems, 2015, 74: 151-158.

  7. Rebollo-Monedero D, Forne J, Domingo-Ferrer J. From t-closeness-like privacy to postrandomization via information theory. IEEE Trans. Knowl. Data Eng., 2010, 22(11): 1623-1636.

  8. Cao J N, Karras P, Kalnis P, Tan K L. SABRE: A sensitive attribute bucketization and redistribution framework for t-closeness. The VLDB Journal, 2011, 20: 59-81.

  9. Soria-Comas J, Domingo-Ferrer J, Sánchez D, Martínez S. t-closeness through microaggregation: Strict privacy with enhanced utility preservation. IEEE Trans. Knowl. Data Eng., 2015, 27(11): 3098-3110.

  10. Sha C F, Li Y, Zhou A Y. On t-closeness with KL-divergence and semantic privacy. In Proc. the 15th International Conference on Database Systems for Advanced Applications, April 2010, pp.153-167.

  11. Zhang J P, Xie J, Yang J, Zhang B. A t-closeness privacy model based on sensitive attribute values semantics bucketization. Journal of Computer Research and Development, 2014, 51(1): 126-137. (in Chinese)

  12. Rubner Y, Tomasi C, Guibas L J. The earth mover’s distance as a metric for image retrieval. International Journal of Computer Vision, 2000, 40(2): 99-121.

  13. Xu J, Wang W, Pei J, Wang X Y, Shi B L, Fu A W C. Utility-based anonymization using local recoding. In Proc. the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, August 2006, pp.785-790.

  14. Ghinita G, Karras P, Kalnis P, Mamoulis N. Fast data anonymization with low information loss. In Proc. the 33rd International Conference on Very Large Data Bases, September 2007, pp.758-769.

  15. LeFevre K, DeWitt D J, Ramakrishnan R. Incognito: Efficient full-domain k-anonymity. In Proc. ACM SIGMOD International Conference on Management of Data, June 2005, pp.49-60.

  16. Li N H, Li T C, Venkatasubramanian S. Closeness: A new privacy measure for data publishing. IEEE Trans. Knowl. Data Eng., 2010, 22(7): 943-956.

  17. Fang Y, Ashrafi M Z, Ng S K. Privacy beyond single sensitive attribute. In Proc. the 22nd International Conference on Database and Expert Systems Applications, August 2011, pp.187-201.

  18. Sei Y C, Okumura H, Takenouchi T, Ohsuga A. Anonymization of sensitive quasiidentifiers for l-diversity and t-closeness. IEEE Transactions on Dependable and Secure Computing. doi:10.1109/TDSC.2017.2698472.

  19. Höppner F, Klawonn F. Clustering with size constraints. In Computational Intelligence Paradigms, Jain L C, Sato-Ilic M, Virvou M, Tsihrintzis G A, Balas V E (eds.), Springer, Berlin, Heidelberg, 2008, pp.167-180.

  20. Jolliffe I T, Cadima J. Principal component analysis: A review and recent developments. Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences, 2016, 374(2065): 20150202.

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Yan Zhu.

Electronic supplementary material

Below is the link to the electronic supplementary material.

ESM 1

(PDF 185 kb)

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Wang, R., Zhu, Y., Chen, TS. et al. Privacy-Preserving Algorithms for Multiple Sensitive Attributes Satisfying t-Closeness. J. Comput. Sci. Technol. 33, 1231–1242 (2018). https://doi.org/10.1007/s11390-018-1884-6

Download citation

  • Received:

  • Revised:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s11390-018-1884-6

Keywords

Navigation