Skip to main content
Log in

High utility K-anonymization for social network publishing

  • Regular Paper
  • Published:
Knowledge and Information Systems Aims and scope Submit manuscript

Abstract

Privacy and utility are two main desiderata of good sensitive information publishing schemes. For publishing social networks, many existing algorithms rely on \(k\)-anonymity as a criterion to guarantee privacy protection. They reduce the utility loss by first using the degree sequence to model the structural properties of the original social network and then minimizing the changes on the degree sequence caused by the anonymization process. However, the degree sequence-based graph model is simple, and it fails to capture many important graph topological properties. Consequently, the existing anonymization algorithms that rely on this simple graph model to measure utility cannot guarantee generating anonymized social networks of high utility. In this paper, we propose novel utility measurements that are based on more complex community-based graph models. We also design a general \(k\)-anonymization framework, which can be used with various utility measurements to achieve \(k\)-anonymity with small utility loss on given social networks. Finally, we conduct extensive experimental evaluation on real datasets to evaluate the effectiveness of the new utility measurements proposed. The results demonstrate that our scheme achieves significant improvement on the utility of the anonymized social networks compared with the existing anonymization algorithms. The utility losses of many social network statistics of the anonymized social networks generated by our scheme are under 1 % in most cases.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6
Fig. 7
Fig. 8
Fig. 9
Fig. 10
Fig. 11
Fig. 12
Fig. 13

Similar content being viewed by others

Notes

  1. The attacker could possess some non-structural information about the targets as well (e.g., the labels of vertices and edges), but in this paper, we only consider the structural background knowledge.

  2. Zhou et al provide the NP-hardness proof in Zhou and Pei [37] by an induction from the \(k\)-anonymity problem in relational data.

  3. We want to highlight that the additive adjustment only applies to the refining local structure information step of Algorithm 1 (i.e., lines 10–11). In other part of the algorithm (e.g., lines 5–9 where edge operations are performed to change the current graph toward the target graph), we consider both the edge addition operations and the edge deletion operations.

  4. The \(y\)-axis is in logarithm scale.

References

  1. Aggarwal CC, Yu PS (2008) Privacy-preserving data mining: models and algorithms. Springer, Berlin

    Book  Google Scholar 

  2. Backstrom L, Dwork C, Kleinberg J (2007) Wherefore art thou r3579x? Anonymized social networks, hidden patterns, and structural steganography. In: WWW’07, pp 181–190

  3. Bhagat S, Cormode G, Krishnamurthy B, Srivastava D (2009) Class-based graph anonymization for social network data. VLDB Endow 2(1):766–777

    Article  Google Scholar 

  4. Cheng J, Fu AWC, Liu J (2010) K-isomorphism: privacy preserving network publication against structural attacks. In: SIGMOD’10, pp 459–470

  5. Clauset A, Moore C, Newman MEJ (2007) Structural inference of hierarchies in networks. In: ICML’06, pp 1–13

  6. Clauset A, Moore C, Newman MEJ (2008) Hierarchical structure and the prediction of missing links in networks. Nature 453(7191):98–101

    Article  Google Scholar 

  7. comScore (2011) It’s a social world: Top 10 need-to-knows about social networking and where it’s headed http://www.comscore.com/Press_Events/Presentations_Whitepapers/2011/it_is_a_social_world_top_10_need-to-knows_about_social_networking

  8. Dwork C (2008) Differential privacy: a survey of results. In: TAMC’08, pp 1–19

  9. Dwork C, McSherry F, Nissim K, Smith A (2006) Calibrating noise to sensitivity in private data analysis. In: TCC’06, pp 265–284

  10. Frikken KB, Golle P (2006) Private social network analysis: how to assemble pieces of a graph privately. In: WPES’06, pp 89–98

  11. Fung BCM, Wang K, Chen R, Yu PS (2010) Privacy-preserving data publishing: a survey of recent developments. ACM Comput Surv 42(4):14:1–14:53

    Article  Google Scholar 

  12. Gao J, Yu JX, Jin R, Zhou J, Wang T, Yang D (2011) Neighborhood-privacy protected shortest distance computing in cloud. In: SIGMOD ’11, pp 409–420

  13. Ghinita G, Tao Y, Kalnis P (2008) On the anonymization of sparse high-dimensional data. In: ICDE ’08, pp 715–724

  14. Hay M, Miklau G, Jensen D (2007) Anonymizing social networks. Tech. rep, UMass Amberst

  15. Hay M, Miklau G, Jensen D, Towsley D, Weis P (2008) Resisting structural re-identification in anonymized social networks. VLDB Endow 1(1):102–114

    Article  Google Scholar 

  16. Hay M, Li C, Miklau G, Jensen D (2009) Accurate estimation of the degree distribution of private networks. In: ICDM ’09, pp 169–178

  17. Kifer D, Gehrke J (2006) Injecting utility into anonymized datasets. In: SIGMOD’06, pp 217–228

  18. Li T, Li N (2009) On the tradeoff between privacy and utility in data publishing. In: SIGKDD’09, pp 517–525

  19. Liu K, Terzi E (2008) Towards identity anonymization on graphs. In: SIGMOD’08, pp 93–106

  20. Liu K, Das K, Grandison T, Kargupta H (2008) Privacy-preserving data analysis on graphs and social networks. In: Next generation of data mining, chap 21. Chapman & Hall/CRC, pp 419–437

  21. Liu L, Wang J, Liu J, Zhang J (2009) Privacy preservation in social networks with sensitive edge weights. In: SDM’09, pp 954–965

  22. Maiya AS, Berger-Wolf TY (2010) Sampling community structure. In: WWW’10, pp 701–710

  23. Mir DJ, Wright RN (2012) A differentially private graph estimator for the stochastic kronecker graph model. In: EDBT, PAIS workshops, pp 122–129

  24. Newman MEJ (2004) Fast algorithm for detecting community structure in networks. Phys Rev E 69(6):066,133+

    Article  Google Scholar 

  25. Rastogi V, Suciu D, Hong S (2007) The boundary between privacy and utility in data publishing. In: VLDB’07, pp 531–542

  26. Sala A, Zhao X, Wilson C, Zheng H, Zhao BY (2011) Sharing graphs using differentially private graph models. In: IMC ’11, pp 81–98

  27. Santo F (2010) Community detection in graphs. Phys Rep 486:75–174

    Article  MathSciNet  Google Scholar 

  28. Sun X, Wang H, Li J, Pei J (2011) Publishing anonymous survey rating data. Data Mining Knowl Discov 23(3):379–406

    Article  MATH  MathSciNet  Google Scholar 

  29. Sweeney L (2002) K-anonymity: a model for protecting privacy. Int J Uncertain Fuzziness Knowl Based Syst 10(5):557–570

    Article  MATH  MathSciNet  Google Scholar 

  30. Wong RCW, Fu AWC, Wang K, Pei J (2007) Minimality attack in privacy preserving data publishing. In: VLDB ’07, pp 543–554

  31. Wu W, Xiao Y, Wang W, He Z, Wang Z (2010) K-symmetry model for identity anonymization in social networks. In: EDBT’10, pp 111–122

  32. Xiao Q, Wang Z, Tan KL (2011) Lora: link obfuscation by randomization in graphs. In: SDM’11, pp 33–51

  33. Ying X, Wu X (2008) Randomizing social networks: a spectrum preserving approach. In: SDM’08, pp 739–750

  34. Ying X, Wu X (2009) On link privacy in randomizing social networks. In: PAKDD’09, pp 28–39

  35. Zheleva E, Getoor L (2008) Preserving the privacy of sensitive relationships in graph data. In: PinKDD’07, pp 153–171

  36. Zhou B, Pei J (2008) Preserving privacy in social networks against neighborhood attacks. In: ICDE’08, pp 506–515

  37. Zhou B, Pei J (2010) The k-anonymity and l-diversity approaches for privacy preservation in social networks against neighborhood attacks. Knowl Inf Syst 24(2):1–38

    Google Scholar 

  38. Zhou B, Pei J, Luk W (2008) A brief survey on anonymization techniques for privacy preserving publishing of social network data. SIGKDD Explor Newsl 10:12–22

    Article  Google Scholar 

  39. Zou L, Chen L, Özsu M (2009) K-automorphism: a general framework for privacy preserving network publication. VLDB Endow 2(1):946–957

    Article  Google Scholar 

Download references

Acknowledgments

This study was funded through a research Grant 10-C220-SMU-005 from the Office of Research, Singapore Management University.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Yazhe Wang.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Wang, Y., Xie, L., Zheng, B. et al. High utility K-anonymization for social network publishing. Knowl Inf Syst 41, 697–725 (2014). https://doi.org/10.1007/s10115-013-0674-2

Download citation

  • Received:

  • Revised:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s10115-013-0674-2

Keywords

Navigation