K-anonymity for social networks containing rich structural and textual information

Hao, Yifan; Cao, Huiping; Hu, Chuan; Bhattarai, Kabi; Misra, Satyajayant

doi:10.1007/s13278-014-0223-3

Original Article
Published: 20 August 2014

Volume 4, article number 223, (2014)
Social Network Analysis and Mining

Abstract

When social networks are released for analysis, individuals’ sensitive information (e.g., node identities) in the network may be exposed. To avoid unwanted information exposure, social networks need to be anonymized before they are published. In the literature, many approaches exist to anonymize social networks to prevent attacks by adversaries that know the network structures such as node degrees and neighbors. However, these techniques cannot prevent the leakage of valuable identification information during social network analysis if the social network graphs contain both structural and textual information. In this paper, we study the problem of anonymizing social networks to prevent the identification of individuals using both structural (node degrees) and textual (edge labels) information in graphs. We formally define the problem as Structure and Text aware $K$-anonymity of social networks (STK-Anonymity). In an STK-anonymized network, each individual is $ST$-equivalent to at least $K-1$ other nodes. The major challenge in achieving STK-Anonymity comes from the correlation of edge labels, which causes the propagation of edge anonymization. It has been shown that it is intractable to optimally $K$-anonymize the label sequences of edge-labeled graphs. To address the challenge, we present a two-phase approach which consists of two heuristics in the first phase to process partial graph structures (node degrees in particular) and a set-enumeration tree-based approach in the second phase to anonymize edge labels. Results from extensive experiments on both real and synthetic datasets are presented to show the effectiveness and efficiency of our approaches.
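The anonymity condition stated in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the formal definition of $ST$-equivalence, so the sketch assumes that two nodes are equivalent when they have the same degree and the same sorted sequence of edge labels. The graph representation and the names `st_signature` and `is_k_anonymous` are hypothetical and are not taken from the paper.

```python
from collections import Counter


def st_signature(graph, node):
    """Return (degree, sorted label sequence) for one node.

    `graph` maps each node to {neighbor: set of edge labels}.
    Assumption: this pair stands in for the paper's ST-equivalence,
    whose formal definition is not given in the abstract.
    """
    edges = graph[node]
    # Each edge may carry several labels, so normalize every label set
    # to a sorted tuple before sorting the whole sequence.
    labels = sorted(tuple(sorted(label_set)) for label_set in edges.values())
    return (len(edges), tuple(labels))


def is_k_anonymous(graph, k):
    """Check that every signature is shared by at least k nodes."""
    counts = Counter(st_signature(graph, v) for v in graph)
    return all(c >= k for c in counts.values())


# Toy undirected graph: a 4-cycle with alternating labels "db" and "ml".
graph = {
    "a": {"b": {"db"}, "c": {"ml"}},
    "b": {"a": {"db"}, "d": {"ml"}},
    "c": {"a": {"ml"}, "d": {"db"}},
    "d": {"b": {"ml"}, "c": {"db"}},
}

print(is_k_anonymous(graph, 4))
print(is_k_anonymous(graph, 5))
```

In this toy graph all four nodes have degree 2 and the same label sequence, so the check passes for $K = 4$ and fails for $K = 5$. The sketch only tests the condition; it does not perform the two-phase anonymization described in the abstract.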

Notes

A table with missing values is an exception.
Multiple edges are combined to one topological edge with multiple edge labels.
In anonymizing microdata, the cell-based anonymization strategy (Meyerson and Williams 2004; Park and Shim 2007) uses a many-to-many mapping to perform the anonymization, i.e., the same label $el$ may be generalized to multiple other labels. This $K$-anonymization problem has been shown to be NP-hard even when the attribute values are ternary (Aggarwal et al. 2005). In this work, we do not consider anonymization solutions with such fine granularity.

References

Aggarwal CC, Khan A, Yan X (2011) On flow authority discovery in social networks. In: Proceedings of SIAM international conference on data mining (SDM). SIAM/Omnipress, pp 522–533
Aggarwal G, Feder T, Kenthapadi K, Motwani R, Panigrahy R, Thomas D, Zhu A (2005) Anonymizing tables. In: ICDT, pp 246–258
Backstrom L, Dwork C, Kleinberg JM (2007) Wherefore art thou r3579x?: anonymized social networks, hidden patterns, and structural steganography. In: Proceedings of World Wide Web Conference (WWW), pp 181–190
Backstrom L, Huttenlocher DP, Kleinberg JM, Lan X (2006) Group formation in large social networks: membership, growth, and evolution. In: Proceedings of ACM SIGKDD international conference on knowledge discovery and data mining, pp 44–54
Bhagat S, Cormode G, Krishnamurthy B, Srivastava D (2010) Privacy in dynamic social networks. In: Proceedings of World Wide Web Conference (WWW), pp 1059–1060
Bonchi F, Gionis A, Tassa T (2011) Identity obfuscation in graphs through the information theoretic lens. In: Proceedings of IEEE International conference on data engineering (ICDE), pp 924–935
Campan A, Truta TM (2008) Data and structural k-anonymity in social networks. In: ACM International workshop on privacy, security, and trust in KDD (PinKDD), pp 33–54
Chakrabarti D, Zhan Y, Faloutsos C (2004) R-MAT: a recursive model for graph mining. In: Proceedings of SIAM International conference on data mining (SDM)
Chen C, Yan X, Zhu F, Han J, Yu PS (2008) Graph OLAP: towards online analytical processing on graphs. In: Proceedings of IEEE International conference on data mining (ICDM). IEEE Computer Society, pp 103–112
Cheng J, Fu AWC, Liu J (2010) K-isomorphism: privacy preserving network publication against structural attacks. In: Proceedings of ACM SIGMOD International conference on management of data, pp 459–470
Chester S, Kapron BM, Srivastava G, Venkatesh S (2013) Complexity of social network anonymization. Soc Netw Anal Min 3(2):151–166
Cormen TH, Leiserson CE, Rivest RL (2009) Introduction to Algorithms. The MIT Press, Massachusetts
Cormode G, Srivastava D, Bhagat S, Krishnamurthy B (2009) Class-based graph anonymization for social network data. Proc VLDB Endow 2(1):766–777
Das S, Egecioglu Ö, El Abbadi A (2010) Anonymizing weighted social network graphs. In: Proceedings of IEEE International Conference on Data Engineering (ICDE), pp 904–907
Das S, Egecioglu Ö, El Abbadi A (2012) Anónimos: an LP-based approach for anonymizing weighted social network graphs. IEEE Trans Knowl Data Eng 24(4):590–604
Fard AM, Wang K, Yu PS (2012) Limiting link disclosure in social network analysis through subgraph-wise perturbation. In: Proceedings of international conference on extending database technology (EDBT), pp 109–119
Han J, Yan X, Yu PS (2009) Scalable OLAP and mining of information networks. In: Proceedings of international conference on extending database technology (EDBT), p 1159
Hay M, Li C, Miklau G, Jensen D (2009) Accurate estimation of the degree distribution of private networks. In: Proceedings of IEEE international conference on data mining (ICDM), pp 169–178
Hay M, Miklau G, Jensen D, Towsley DF, Li C (2010) Resisting structural re-identification in anonymized social networks. VLDB J 19(6):797–823
Hay M, Miklau G, Jensen D, Towsley DF, Weis P (2008) Resisting structural re-identification in anonymized social networks. Proc VLDB Endow 1(1):102–114
Bayardo RJ, Agrawal R (2005) Data privacy through optimal k-anonymization. In: Proceedings of IEEE international conference on data engineering (ICDE), pp 217–228
Kumar R, Novak J, Tomkins A (2006) Structure and evolution of online social networks. In: Proceedings of ACM SIGKDD international conference on knowledge discovery and data mining, pp 611–617
Lee Y-S (1995) Graphical demonstration of an optimality property of the median. Am Stat 49(4):369–372
LeFevre K, DeWitt DJ, Ramakrishnan R (2005) Incognito: efficient full-domain k-anonymity. In: Proceedings of ACM SIGMOD international conference on management of data, pp 49–60
Li N, Li T, Venkatasubramanian S (2007) t-closeness: privacy beyond k-anonymity and l-diversity. In: Proceedings of IEEE international conference on data engineering (ICDE), pp 106–115
Liu K, Terzi E (2008) Towards identity anonymization on graphs. In: Proceedings of ACM SIGMOD international conference on management of data, pp 93–106
Liu L, Wang J, Liu J, Zhang J (2009) Privacy preservation in social networks with sensitive edge weights. In: Proceedings of SIAM international conference on data mining (SDM), pp 954–965
Liu X, Yang X (2011) A generalization based approach for anonymizing weighted social network graphs. In: WAIM, pp 118–130
Lu X, Song Y, Bressan S (2012) Fast identity anonymization on graphs. In: Proceedings of international conference on database and expert systems applications (DEXA), pp 281–295
Machanavajjhala A, Gehrke J, Kifer D, Venkitasubramaniam M (2006) l-diversity: privacy beyond k-anonymity. In: Proceedings of IEEE international conference on data engineering (ICDE), p 24
McCallum A, Corrada-Emmanuel A, Wang X (2005) Topic and role discovery in social networks. In: International joint conference on artificial intelligence (IJCAI), pp 786–791
Medforth N, Wang K (2011) Privacy risk in graph stream publishing for social network data. In: Proceedings of IEEE international conference on data mining (ICDM), pp 437–446
Meyerson A, Williams R (2004) On the complexity of optimal k-anonymity. In: Proceedings of ACM symposium on principles of database systems (PODS), pp 223–228
Narayanan A, Shmatikov V (2009) De-anonymizing social networks. In: IEEE symposium on security and privacy, pp 173–187
Nobari S, Karras P, Pang H, Bressan S (2014) L-opacity: linkage-aware graph anonymization. In: Proceedings of international conference on extending database technology (EDBT), pp 583–594
Park H, Shim K (2007) Approximate algorithms for k-anonymity. In: Proceedings of ACM SIGMOD international conference on management of data, pp 67–78
Rymon R (1992) Search through systematic set enumeration. In: International conference on principles of knowledge representation and reasoning (KR), pp 539–550
Samarati P (2001) Protecting respondents’ identities in microdata release. IEEE Trans Knowl Data Eng 13(6):1010–1027
Seary AJ, Richards WD (2000) Spectral methods for analyzing and visualizing networks: an introduction. In: Workshop summary and papers, pp 209–228
Song Y, Karras P, Xiao Q, Bressan S (2012) Sensitive label privacy protection on social network data. In: International conference on scientific and statistical database management (SSDBM), pp 562–571
Sweeney L (2002) k-anonymity: a model for protecting privacy. Int J Uncertain Fuzziness Knowl Based Syst 10(5):557–570
Tai CH, Yu PS, Yang DN, Chen MS (2011) Privacy-preserving social network publication against friendship attacks. In: Proceedings of ACM SIGKDD international conference on knowledge discovery and data mining, pp 1262–1270
Watts DJ, Strogatz SH (1998) Collective dynamics of ‘small-world’ networks. Nature 393:440–442
Wu W, Xiao Y, Wang W, He Z, Wang Z (2010) K-symmetry model for identity anonymization in social networks. In: Proceedings of international conference on extending database technology (EDBT), pp 111–122
Xue M, Karras P, Raïssi C, Kalnis P, Pung HK (2012) Delineating social network data anonymization via random edge perturbation. In: CIKM, pp 475–484
Ying X, Pan K, Wu X, Guo L (2009) Comparisons of randomization and k-degree anonymization schemes for privacy preserving social network publishing. In: Workshop on social network mining and analysis (SNA-KDD), p 10
Ying X, Wu X (2008) Randomizing social networks: a spectrum preserving approach. In: Proceedings of SIAM International Conference on Data Mining (SDM), pp 739–750
Yuan M, Chen L (2011) Node protection in weighted social networks. DASFAA 1:123–137
Yuan M, Chen L, Yu PS (2010) Personalized privacy protection in social networks. Proc VLDB Endow 4(2):141–150
Zheleva E, Getoor L (2007) Preserving the privacy of sensitive relationships in graph data. In: ACM international workshop on privacy, security, and trust in KDD (PinKDD), pp 153–171
Zhou B, Pei J (2008) Preserving privacy in social networks against neighborhood attacks. In: Proceedings of IEEE international conference on data engineering (ICDE), pp 506–515
Zhou B, Pei J (2011) The k-anonymity and l-diversity approaches for privacy preservation in social networks against neighborhood attacks. Knowl Inf Syst 28(1):47–77
Zou L, Chen L, Özsu MT (2009) K-automorphism: a general framework for privacy preserving network publication. Proc VLDB Endow 2(1):946–957

Author information

Authors and Affiliations

Computer Science, New Mexico State University, Las Cruces, NM, 88011, USA
Yifan Hao, Huiping Cao, Chuan Hu, Kabi Bhattarai & Satyajayant Misra

Corresponding author

Correspondence to Huiping Cao.

Appendix

This section contains additional figures showing the utility measures for datasets of different sizes: Fig. 23 (DBLP 1000), Fig. 24 (DBLP 2000), Fig. 25 (DBLP 4000), Fig. 26 (DBLP 8000), Fig. 27 (DBLP 32000), Fig. 28 (Synthetic N1000), Fig. 29 (Synthetic N5000), Fig. 30 (Synthetic N10000, E20000), Fig. 31 (Synthetic N10000, E40000), Fig. 32 (Synthetic N10000, E50000), Fig. 33 (Synthetic N20000, E60000). The trend observed in these figures is the same as that in Sect. 6.1.2.

About this article

Cite this article

Hao, Y., Cao, H., Hu, C. et al. K-anonymity for social networks containing rich structural and textual information. Soc. Netw. Anal. Min. 4, 223 (2014). https://doi.org/10.1007/s13278-014-0223-3

Received: 06 December 2013
Revised: 11 June 2014
Accepted: 21 July 2014
Published: 20 August 2014
DOI: https://doi.org/10.1007/s13278-014-0223-3
