Identification of leaders, lurkers, associates and spammers in a social network: context-dependent and context-independent approaches

  • Review Article
Social Network Analysis and Mining

Abstract

In this paper, we present two methods for classifying social network actors (individuals or organizations) into classes such as leaders (e.g., news groups), lurkers, spammers and close associates. The first method is a two-stage process: a fuzzy-set theoretic (FST) approach evaluates the strengths of network links (or equivalently, actor-actor relationships), and a simple linear classifier then separates the actor classes. Since this method uses a lot of contextual information, including actor profiles and actor-actor tweet and reply frequencies, it may be termed a context-dependent approach. To handle situations where limited actor data are available for learning network link strengths, we also present a second method that classifies actors by matching their short-term (roughly 25 days) tweet patterns with the generic tweet patterns of prototype actors of the different classes. Since little contextual information is used here, this can be called a context-independent approach. Our experiments with over 500 randomly sampled records from a Twitter database consisting of 441,234 actors, 2,045,804 links, 6,481,900 tweets, and 2,312,927 total reply messages indicate that, in the context-independent analysis, a multilayer perceptron outperforms the Bayes and Random Forest classifiers on both classification accuracy and a new F-measure for classification performance. However, as expected, the context-dependent analysis, which uses link strengths evaluated with the FST approach in conjunction with some actor information, reveals strong clustering of actor data by type, and hence can be considered the superior approach when abundant training data are available.
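The context-independent idea of matching an actor's short-term tweet pattern against class prototypes can be illustrated with a minimal sketch. It is not the classifier evaluated in the paper (which compares a multilayer perceptron, a Bayes classifier and Random Forest); it swaps in a plain nearest-prototype rule with Euclidean distance to show the matching step only. The prototype values, the five-window layout of the roughly 25-day period, and the names `PROTOTYPES`, `euclidean` and `classify` are all assumptions made for illustration.

```python
from math import sqrt

# Hypothetical prototype patterns: mean tweets per day in five consecutive
# 5-day windows (about 25 days). The numbers are invented for illustration
# and are not taken from the paper's data.
PROTOTYPES = {
    "leader":    [20.0, 22.0, 19.0, 21.0, 20.0],
    "lurker":    [0.0, 1.0, 0.0, 0.0, 1.0],
    "spammer":   [60.0, 0.0, 55.0, 0.0, 65.0],
    "associate": [4.0, 5.0, 3.0, 6.0, 4.0],
}


def euclidean(a, b):
    """Euclidean distance between two equal-length tweet patterns."""
    if len(a) != len(b):
        raise ValueError("patterns must cover the same number of windows")
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify(pattern, prototypes=PROTOTYPES):
    """Return the class label whose prototype is closest to the pattern."""
    return min(prototypes, key=lambda label: euclidean(pattern, prototypes[label]))


if __name__ == "__main__":
    # A steadily active account is assigned to its nearest prototype.
    print(classify([18.0, 23.0, 20.0, 19.0, 22.0]))
```

A learned classifier such as a multilayer perceptron would replace `classify` in practice; the nearest-prototype rule is used here only because it needs no training data to demonstrate.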

Acknowledgments

We would like to thank Dr. Zuoming Wang, Assistant Professor, communication studies, University of North Texas, Texas, USA, for helping us in this work. We would also like to thank Dr. Santi Phithakkitnukoon, SENSEable City Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA, for giving us valuable comments on this work. This work is supported by the National Science Foundation under Grants CNS-0627754, CNS-0619871 and CNS-0551694.

Author information

Correspondence to Ram Dantu.

About this article

Cite this article

Fazeen, M., Dantu, R. & Guturu, P. Identification of leaders, lurkers, associates and spammers in a social network: context-dependent and context-independent approaches. Soc. Netw. Anal. Min. 1, 241–254 (2011). https://doi.org/10.1007/s13278-011-0017-9

  • DOI: https://doi.org/10.1007/s13278-011-0017-9
