Abstract
In this paper, we present two methods for classifying different social network actors (individuals or organizations) such as leaders (e.g., news groups), lurkers, spammers and close associates. The first method is a two-stage process: a fuzzy-set theoretic (FST) approach evaluates the strengths of network links (or, equivalently, actor-actor relationships), and a simple linear classifier then separates the actor classes. Since this method uses a lot of contextual information, including actor profiles and actor-actor tweet and reply frequencies, it may be termed a context-dependent approach. To handle the situation of limited availability of actor data for learning network link strengths, we also present a second method that classifies actors by matching their short-term (say, roughly 25 days) tweet patterns with the generic tweet patterns of the prototype actors of different classes. Since little contextual information is used here, this can be called a context-independent approach. Our experiments with over 500 randomly sampled records from a Twitter database consisting of 441,234 actors, 2,045,804 links, 6,481,900 tweets, and 2,312,927 reply messages indicate that, in the context-independent analysis, a multilayer perceptron outperforms the Bayes and Random Forest classifiers on both classification accuracy and a new F-measure for classification performance. However, as expected, the context-dependent analysis, which uses link strengths evaluated with the FST approach in conjunction with some actor information, reveals strong clustering of actor data by type, and hence can be considered the superior approach when abundant data is available for training the system.
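As a rough illustration of the context-independent idea of matching a short-term tweet pattern against class prototypes, the sketch below assigns an actor to the class whose prototype pattern is nearest in Euclidean distance. This is only a stand-in under stated assumptions: the abstract reports trained classifiers (a multilayer perceptron, a Bayes classifier and Random Forest), not this nearest-prototype rule, and the prototype values, the 5-day window (shortened from the roughly 25 days mentioned above) and the names `PROTOTYPES` and `classify` are hypothetical.

```python
import math

# Hypothetical prototype tweet patterns (tweets per day over a short window).
# The values are invented for illustration; they are not taken from the paper.
PROTOTYPES = {
    "leader": [20, 22, 19, 21, 20],   # steady, high volume
    "lurker": [0, 1, 0, 0, 1],        # almost silent
    "spammer": [0, 90, 0, 85, 0],     # short, extreme bursts
}


def classify(pattern, prototypes=PROTOTYPES):
    """Return the label of the prototype closest to `pattern`.

    Closeness is plain Euclidean distance between daily tweet counts,
    an assumed similarity measure chosen only to keep the sketch small.
    """
    if any(len(pattern) != len(p) for p in prototypes.values()):
        raise ValueError("pattern and prototypes must cover the same days")
    return min(prototypes, key=lambda label: math.dist(pattern, prototypes[label]))


if __name__ == "__main__":
    print(classify([18, 20, 21, 19, 22]))
    print(classify([0, 0, 0, 1, 1]))
    print(classify([0, 70, 1, 80, 0]))
```

A learned classifier such as the multilayer perceptron reported above would replace the fixed distance rule with decision boundaries fitted to labeled tweet patterns.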
Acknowledgments
We would like to thank Dr. Zuoming Wang, Assistant Professor, communication studies, University of North Texas, Texas, USA, for helping us in this work. We would also like to thank Dr. Santi Phithakkitnukoon, SENSEable City Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA, for giving us valuable comments on this work. This work is supported by the National Science Foundation under Grants CNS-0627754, CNS-0619871 and CNS-0551694.
Cite this article
Fazeen, M., Dantu, R. & Guturu, P. Identification of leaders, lurkers, associates and spammers in a social network: context-dependent and context-independent approaches. Soc. Netw. Anal. Min. 1, 241–254 (2011). https://doi.org/10.1007/s13278-011-0017-9
DOI: https://doi.org/10.1007/s13278-011-0017-9