Detecting anomalies in social network data consumption

Akcora, Cuneyt Gurcan; Carminati, Barbara; Ferrari, Elena; Kantarcioglu, Murat

doi:10.1007/s13278-014-0231-3

Original Article
Published: 29 August 2014

Volume 4, article number 231, (2014)

Social Network Analysis and Mining

Cuneyt Gurcan Akcora¹, Barbara Carminati¹, Elena Ferrari¹ & Murat Kantarcioglu²

Abstract

As social media usage has grown, understanding how social network users’ interests evolve has gained importance in fields ranging from sociology to marketing. In this paper, we use two snapshots of the Twitter network to analyze how users’ data interest patterns change over time, with the goal of understanding individual and collective user behavior on social networks. We build topical profiles of users, propose novel metrics to identify anomalous friendships, and validate our results with Amazon Mechanical Turk experiments. We show that although more than 80% of all friendships on Twitter are created due to data interests, 83% of all users have at least one friendship that can be explained neither by the user’s past interests nor by the collective behavior of other similar users.

Notes

In this paper, we use the term “anomaly” to represent such significant changes in user behavior.
https://dev.twitter.com/.
In the Twitter API, the friends of a user are the accounts that the user follows.
Two senators are excluded in bioLDA because of short or blank bios.
Other words in the topic include green, water, power, wind, oil, and gas.
The number of new friendships is greater than the total number of queried Twitter users because we queried Twitter breadth-first, and many new friendships are shared by seed users.
http://sight.dicom.uninsubria.it/anomaly/.
Approved by the Office of Research Compliance-University of Texas at Dallas, human experiment IRB MR 13-231.
For Fleiss’ kappa: >0.20, fair agreement; >0.40, moderate agreement; >0.60, substantial agreement.
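
The agreement bands in the last note can be made concrete with a short sketch. The Python below is a minimal illustration, not the authors' code: it computes Fleiss' kappa from a table of per-item category counts using the standard textbook formula and then maps the value onto the three bands listed in the note. The function names `fleiss_kappa` and `agreement_label`, the "below fair" label for values at or under 0.2, and the toy ratings are all assumptions introduced here for illustration.

```python
from typing import Sequence


def fleiss_kappa(counts: Sequence[Sequence[int]]) -> float:
    """Fleiss' kappa for a table where counts[i][j] is the number of
    raters who put item i into category j. Every item must be rated
    by the same number of raters (at least two)."""
    if not counts:
        raise ValueError("need at least one rated item")
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2:
        raise ValueError("need at least two raters per item")
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every item must have the same number of raters")
    n_categories = len(counts[0])

    # Share of all ratings that fall into each category.
    p = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_categories)
    ]
    # Observed agreement per item: fraction of rater pairs that agree.
    per_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    observed = sum(per_item) / n_items
    expected = sum(x * x for x in p)  # agreement expected by chance
    if expected == 1.0:
        raise ValueError("kappa is undefined when one category gets all ratings")
    return (observed - expected) / (1.0 - expected)


def agreement_label(kappa: float) -> str:
    """Map kappa onto the three bands given in the note.
    'below fair' is a hypothetical label for values the note does not cover."""
    if kappa > 0.6:
        return "substantial"
    if kappa > 0.4:
        return "moderate"
    if kappa > 0.2:
        return "fair"
    return "below fair"


if __name__ == "__main__":
    # Hypothetical toy data: 4 friendships, 3 raters each,
    # categories = (anomalous, not anomalous).
    toy = [[3, 0], [2, 1], [0, 3], [3, 0]]
    k = fleiss_kappa(toy)
    print(round(k, 3), agreement_label(k))
```

The count-table input keeps the sketch independent of how individual raters are identified, which suits crowdsourced settings where different workers may rate different items.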

Acknowledgments

This work is partially funded by National Science Foundation (NSF) grants CAREER CNS-0845803, CNS-0964350, CNS-1016343, CNS-1111529, and CNS-1228198.

Author information

Authors and Affiliations

DISTA, Università degli Studi dell’Insubria, Via Mazzini 5, Varese, Italy
Cuneyt Gurcan Akcora, Barbara Carminati & Elena Ferrari
Data Security and Privacy Laboratory, University of Texas at Dallas, Richardson, USA
Murat Kantarcioglu

Corresponding author

Correspondence to Cuneyt Gurcan Akcora.

About this article

Cite this article

Akcora, C.G., Carminati, B., Ferrari, E. et al. Detecting anomalies in social network data consumption. Soc. Netw. Anal. Min. 4, 231 (2014). https://doi.org/10.1007/s13278-014-0231-3

Received: 11 December 2013
Revised: 24 May 2014
Accepted: 02 August 2014
Published: 29 August 2014
DOI: https://doi.org/10.1007/s13278-014-0231-3
