K-Anonymity Algorithm Based on Improved Clustering

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 11335)

Abstract

K-anonymity is one of the most widely used techniques in privacy preservation. It performs particularly well in protecting data privacy in data publication, location-based services, and social networks. In this paper, we propose a new algorithm that achieves k-anonymity through improved clustering: we optimize the clustering process by considering the overall distribution of quasi-identifier groups in a multidimensional space. Using locally optimal clustering, we aim to minimize intra-cluster distances and maximize inter-cluster distances, which greatly improves the quality of the anonymized data. Experimental comparisons with popular algorithms such as k-member, Mondrian, and one-time k-means show that our algorithm effectively reduces information loss while generating equivalence classes. On average, the total information loss of the anonymized dataset is about 20% lower than that of the other algorithms. The algorithm also handles both numerical and categorical attributes well.
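To make the idea of clustering-based k-anonymization concrete, the following is a minimal sketch and not the algorithm proposed in the paper. It assumes purely numerical quasi-identifiers, uses a generic greedy nearest-neighbour grouping as a stand-in for the improved clustering, and scores the result with a range-based information loss measure of the kind commonly used for clustering-based anonymization. The names greedy_k_clusters, information_loss, sq_dist, and centroid_of are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple

# Assumption: every record holds numeric quasi-identifier values only.
Record = Tuple[float, ...]


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid_of(cluster: Sequence[Record]) -> List[float]:
    """Per-attribute mean of the records in a cluster."""
    return [sum(col) / len(cluster) for col in zip(*cluster)]


def greedy_k_clusters(records: Sequence[Record], k: int) -> List[List[Record]]:
    """Group records into clusters of at least k members each.

    Generic greedy baseline (illustrative only): seed a cluster with the
    next unassigned record, then grow it with the records closest to its
    centroid until it has k members.
    """
    if k < 1 or len(records) < k:
        raise ValueError("need k >= 1 and at least k records")
    remaining = list(records)
    clusters: List[List[Record]] = []
    while len(remaining) >= k:
        cluster = [remaining.pop(0)]
        while len(cluster) < k:
            center = centroid_of(cluster)
            nearest = min(remaining, key=lambda r: sq_dist(r, center))
            remaining.remove(nearest)
            cluster.append(nearest)
        clusters.append(cluster)
    # Fewer than k records are left: attach each to its nearest cluster
    # so that every equivalence class still has at least k members.
    for r in remaining:
        best = min(clusters, key=lambda c: sq_dist(r, centroid_of(c)))
        best.append(r)
    return clusters


def information_loss(clusters: Sequence[Sequence[Record]]) -> float:
    """Range-based loss: for each cluster, its size times the sum over
    attributes of (cluster range / range over the whole dataset)."""
    all_records = [r for c in clusters for r in c]
    spans = [max(col) - min(col) for col in zip(*all_records)]
    total = 0.0
    for c in clusters:
        per_attr = 0.0
        for col, span in zip(zip(*c), spans):
            if span > 0:  # constant attributes contribute no loss
                per_attr += (max(col) - min(col)) / span
        total += len(c) * per_attr
    return total
```

Calling greedy_k_clusters(records, k) and passing the result to information_loss gives one number per anonymization, which is how two clustering strategies could be compared on the same dataset. Tighter clusters yield narrower generalized ranges and therefore a lower score, which is the intuition behind minimizing intra-cluster distances.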

This project is partly supported by the National Natural Science Foundation of China (No. 61772291), the Natural Science Foundation of Tianjin (No. 17JCZDJC30500), and the Open Project Foundation of Information Security Evaluation Center of Civil Aviation, Civil Aviation University of China (No. CAAC-ISECCA-201702).

Author information

Corresponding author

Correspondence to Chunfu Jia.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Zheng, W., Wang, Z., Lv, T., Ma, Y., Jia, C. (2018). K-Anonymity Algorithm Based on Improved Clustering. In: Vaidya, J., Li, J. (eds) Algorithms and Architectures for Parallel Processing. ICA3PP 2018. Lecture Notes in Computer Science, vol 11335. Springer, Cham. https://doi.org/10.1007/978-3-030-05054-2_36

  • DOI: https://doi.org/10.1007/978-3-030-05054-2_36

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-05053-5

  • Online ISBN: 978-3-030-05054-2

  • eBook Packages: Computer Science, Computer Science (R0)
