Skip to main content
Log in

Utility analysis on privacy-preservation algorithms for online social networks: an empirical study

  • Original Article
  • Published:
Personal and Ubiquitous Computing Aims and scope Submit manuscript

Abstract

Social networks have gained tremendous popularity recently. Millions of people use social network apps to share precious moments with friends and family. Users are often asked to provide personal information such as name, gender, and address when using social networks. However, as the social network data are collected, analyzed, and re-published at a large scale, personal information might be misused by unauthorized third parties and even attackers. Therefore, extensive research has been carried out to protect the data from privacy violations in social networks. The most popular technique is graph perturbation, which modifies the local topological structure of a social network user (a vertex) via various randomization techniques before the social graph data is published. Nevertheless, graph anonymization may affect the usability of the data as random noises are introduced, decreasing user experience. Therefore, a trade-off between privacy protection and data usability must be sought. In this paper, we employ various graph and application utility metrics to investigate this trade-off. More specifically, we conduct an empirical study by implementing five state-of-the-art anonymization algorithms to analyze the graph and application utilities on a Facebook and a Twitter dataset. Our results indicate that most anonymization algorithms can partially or conditionally preserve the graph and application utilities and any single anonymization algorithm may not always perform well on different datasets. Finally, drawing on the reviewed graph anonymization techniques, we provide a brief overview on future research directions and challenges involved therein.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6
Fig. 7

Similar content being viewed by others

References

  1. Bailey NT, et al. (1975) The mathematical theory of infectious diseases and its applications. Charles Griffin & Company Ltd, 5a Crendon Street, High Wycombe Bucks HP13 6LE

  2. Bhagat S, Cormode G, Krishnamurthy B, Srivastava D (2009) Class-based graph anonymization for social network data. Proceedings of the VLDB Endowment 2(1):766–777

    Article  Google Scholar 

  3. Casas-Roma J, Herrera-Joancomartí J, Torra V (2013) An algorithm for k-degree anonymity on large networks. In: Proceedings of the 2013 IEEE/ACM international conference on advances in social networks analysis and mining, ACM, pp 671–675

  4. Chen S, Zhou S (2013) Recursive mechanism: towards node differential privacy and unrestricted joins. In: Proceedings of the 2013 ACM SIGMOD international conference on management of data, ACM, pp 653–664

  5. Cheng J, Fu AWc, Liu J (2010) k-isomorphism: privacy preserving network publication against structural attacks. In: Proceedings of the 2010 ACM SIGMOD international conference on management of data, ACM, pp 459–470

  6. Csárdi G., Nepusz T, Airoldi EM (2016) Statistical network analysis with igraph. Springer

  7. Day WY, Li N, Lyu M (2016) Publishing graph degree distribution with node differential privacy. In: Proceedings of the 2016 international conference on management of data, ACM, pp 123–138

  8. Dwork C (2008) Differential privacy: a survey of results. In: International conference on theory and applications of models of computation, Springer, pp 1–19

  9. Dwork C, McSherry FD (2010) Differential data privacy. US Patent 7,698,250

  10. Fortunato S (2010) Community detection in graphs. Phys Rep 486(3-5):75–174

    Article  MathSciNet  Google Scholar 

  11. Hay M, Li C, Miklau G, Jensen D (2009) Accurate estimation of the degree distribution of private networks. In: 2009. ICDM’09. ninth IEEE international conference on data mining, IEEE, pp 169–178

  12. Hay M, Miklau G, Jensen D, Towsley D, Weis P (2008) Resisting structural re-identification in anonymized social networks. Proceedings of the VLDB Endowment 1(1):102–114

    Article  Google Scholar 

  13. Hay M, Miklau G, Jensen D, Towsley D, Weis P (2008) Resisting structural re-identification in anonymized social networks. Proceedings of the VLDB Endowment 1(1):102–114

    Article  Google Scholar 

  14. Heitmann B, Hermsen F, Decker S (2017) k-rdf-neighbourhood anonymity: combining structural and attribute-based anonymisation for linked data. In: Privon@ ISWC

  15. Henderson K, Gallagher B, Eliassi-Rad T, Tong H, Basu S, Akoglu L, Koutra D, Faloutsos C, Li L (2012) Rolx: structural role extraction & mining in large graphs. In: Proceedings of the 18th ACM SIGKDD international conference on knowledge discovery and data mining, ACM, pp 1231–1239

  16. Ji S, Li W, Srivatsa M, Beyah R (2014) Structural data de-anonymization: Quantification, practice, and implications. In: Proceedings of the 2014 ACM SIGSAC conference on computer and communications security, ACM, pp 1040–1053

  17. Ji S, Mittal P, Beyah R (2016) Graph data anonymization, de-anonymization attacks, and de-anonymizability quantification: a survey. IEEE Commun Surv Tutorials 19(2):1305–1326

    Article  Google Scholar 

  18. Kellaris G, Papadopoulos S (2013) Practical differential privacy via grouping and smoothing. Proceedings of the VLDB endowment 6(5):301–312

    Article  Google Scholar 

  19. Kifer D, Machanavajjhala A (2011) No free lunch in data privacy. In: Proceedings of the 2011 ACM SIGMOD international conference on management of data, ACM, pp 193–204

  20. Korayem M, Crandall DJ (2013) De-anonymizing users across heterogeneous social computing platforms. In: ICWSM

  21. Leskovec J, Krevl A (2014) SNAP datasets: Stanford large network dataset collection http://snap.stanford.edu/data

  22. Li C, Hay M, Miklau G, Wang Y (2014) A data-and workload-aware algorithm for range queries under differential privacy. Proceedings of the VLDB endowment 7(5):341–352

    Article  Google Scholar 

  23. Liu K, Terzi E (2008) Towards identity anonymization on graphs. In: Proceedings of the 2008 ACM SIGMOD international conference on management of data, ACM, pp 93–106

  24. Liu Y, Ji S, Mittal P (2016) Smartwalk: Enhancing social network security via adaptive random walks. In: Proceedings of the 2016 ACM SIGSAC conference on computer and communications security, ACM, pp 492–503

  25. Mittal P, Papamanthou C, Song D (2012) Preserving link privacy in social network based systems. arXiv:1208.6189

  26. Narayanan A, Shmatikov V (2008) Robust de-anonymization of large sparse datasets. In: 2008. SP 2008. IEEE symposium on security and privacy, IEEE, pp 111–125

  27. Narayanan A, Shmatikov V (2009) De-anonymizing social networks. In: 2009 30th IEEE symposium on security and privacy, IEEE, pp 173–187

  28. Newman ME (2016) Mathematics of networks. The new Palgrave dictionary of economics, pp 1–8

  29. Nguyen BP, Ngo H, Kim J, Kim J (2015) Publishing graph data with subgraph differential privacy. In: International workshop on information security applications, Springer, pp 134–145

  30. Nilizadeh S, Kapadia A, Ahn YY (2014) Community-enhanced de-anonymization of online social networks. In: Proceedings of the 2014 ACM SIGSAC conference on computer and communications security, ACM, pp 537–548

  31. Nissim K, Raskhodnikova S, Smith A (2007) Smooth sensitivity and sampling in private data analysis. In: Proceedings of the thirty-ninth annual ACM symposium on theory of computing, ACM, pp 75–84

  32. Qian J, Li XY, Zhang C, Chen L, Jung T, Han J. (2017) Social network de-anonymization and privacy inference with knowledge graph model. IEEE Transactions on Dependable and Secure Computing

  33. Rong H, Ma T, Tang M, Cao J (2018) A novel subgraph k+-isomorphism method in social network based on graph similarity detection. Soft Comput 22(8):2583–2601

    Article  Google Scholar 

  34. Samarati P, Sweeney L (1998) Protecting privacy when disclosing information: k-anonymity and its enforcement through generalization and suppression, Tech. rep., technical report, SRI International

  35. Srivatsa M, Hicks M (2012) Deanonymizing mobility traces: using social network as a side-channel. In: Proceedings of the 2012 ACM conference on computer and communications security, ACM, pp 628–637

  36. Thompson B, Yao D (2009) The union-split algorithm and cluster-based anonymization of social networks. In: Proceedings of the 4th international symposium on information, computer, and communications security, ACM, pp 218–227

  37. Tian W, Mao J, Jiang J, He Z, Zhou Z, Liu J (2018) Deeply understanding structure-based social network de-anonymization. Prog Comput Sci 129:52–58

    Article  Google Scholar 

  38. Wang B, Jia J, Zhang L, Gong NZ (2018) Structure-based sybil detection in social networks via local rule-based propagation. IEEE Transactions on Network Science and Engineering

  39. Wang Q, Zhang Y, Lu X, Wang Z, Qin Z, Ren K (2018) Real-time and spatio-temporal crowd-sourced social network data publishing with differential privacy. IEEE Trans Dependable Secure Comput 15 (4):591–606

    Google Scholar 

  40. Wang Y, Wu X (2013) Preserving differential privacy in degree-correlation based graph generation. Transactions on Data Privacy 6(2):127

    MathSciNet  Google Scholar 

  41. Wu X, Hu Z, Fu X, Fu L, Wang X, Lu S (2018) Social network de-anonymization with overlapping communities: Analysis, algorithm and experiments. In: Proc IEEE INFOCOM

  42. Ying X, Wu X (2008) Randomizing social networks: a spectrum preserving approach. In: Proceedings of the 2008 SIAM international conference on data mining, SIAM, pp 739–750

  43. Zhang Z, Wang H, Wang C, Fang H (2015) Modeling epidemics spreading on social contact networks. IEEE Trans Emerg Top Comput 3(3):410–419

    Article  Google Scholar 

  44. Zhou B, Pei J (2008) Preserving privacy in social networks against neighborhood attacks. In: 2008. ICDE 2008. IEEE 24th international conference on data engineering, IEEE, pp 506–515

  45. Zou L, Chen L, Özsu MT (2009) K-automorphism: a general framework for privacy preserving network publication. Proceedings of the VLDB Endowment 2(1):946–957

    Article  Google Scholar 

Download references

Funding

This work was partially supported by the US National Science Foundation under grants CNS-1704397, CNS-1704287, and CNS-1704274; the National Science Foundation of China under grants 61832012, 61871466, and 61771289; and the Guangxi Natural Science Foundation Innovation Research Team Project under Grant 2016GXNSFGA380002.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Feng Zhao.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Zhang, C., Jiang, H., Cheng, X. et al. Utility analysis on privacy-preservation algorithms for online social networks: an empirical study. Pers Ubiquit Comput 25, 1063–1079 (2021). https://doi.org/10.1007/s00779-019-01287-0

Download citation

  • Received:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s00779-019-01287-0

Keywords

Navigation