Discovering spammer communities in twitter

Journal of Intelligent Information Systems

Abstract

Online social networks have become immensely popular in recent years and are now major sources for tracking the reverberation of events and news throughout the world. However, their diversity and popularity attract malicious users who inject new forms of spam. Spamming is a malicious activity in which a fake user spreads unsolicited messages in the form of bulk messages, fraudulent reviews, malware or viruses, hate speech, profanity, or advertising for marketing scams. In addition, spammers usually form a connected community of spam accounts and use it to spread spam to a large set of legitimate users. Consequently, it is highly desirable to detect such spammer communities in social networks. Although a significant amount of work has been done on detecting spam messages and accounts, little research has addressed the detection of spammer communities and hidden spam accounts. In this work, an unsupervised approach called SpamCom is proposed for detecting spammer communities in Twitter. We model the Twitter network as a multilayer social network and exploit overlapping community-based features of users, represented in the form of hypergraphs, to identify spammers based on their structural behavior and URL characteristics. The use of community-based features, graph and URL characteristics of user accounts, and content similarity among users makes our technique robust and efficient.
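
To make the ingredients named in the abstract concrete, the following is a minimal Python sketch of how a multilayer network, overlapping communities stored as hyperedges, a URL characteristic, and content similarity might be represented. It is not the SpamCom algorithm: the toy data, the layer names, and the helper functions `memberships`, `layer_degree`, `jaccard`, and `url_ratio` are all assumptions introduced here for illustration only.

```python
# Hypothetical toy data: one edge set per interaction type (layer).
layers = {
    "follow": {("a", "b"), ("b", "c"), ("c", "a"), ("d", "e")},
    "mention": {("a", "b"), ("c", "d")},
}

# Overlapping communities as hyperedges: each community is a set of users,
# and a user may belong to several communities at once.
hyperedges = {
    "c1": {"a", "b", "c"},
    "c2": {"c", "d"},
    "c3": {"d", "e"},
}


def memberships(hyperedges):
    """Map each user to the set of communities (hyperedges) containing it."""
    result = {}
    for cid, members in hyperedges.items():
        for user in members:
            result.setdefault(user, set()).add(cid)
    return result


def layer_degree(layers, user):
    """Count, per layer, the edges incident to the given user."""
    return {
        name: sum(user in edge for edge in edges)
        for name, edges in layers.items()
    }


def jaccard(x, y):
    """Jaccard similarity of two token sets; 0.0 when both are empty."""
    union = x | y
    return len(x & y) / len(union) if union else 0.0


def url_ratio(tweets):
    """Fraction of tweets containing a URL (simple substring check)."""
    if not tweets:
        return 0.0
    with_url = sum("http://" in t or "https://" in t for t in tweets)
    return with_url / len(tweets)
```

In this sketch a user that sits in many hyperedges, posts mostly URL-bearing tweets, and shares near-identical content with its neighbours would stand out on all three kinds of signal; how such signals are actually combined is specified in the full article, not here.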

Notes

  1. http://help.twitter.com/entries/18311-the-twitter-rules

Corresponding author

Correspondence to P. V. Bindu.

Cite this article

Bindu, P.V., Mishra, R. & Thilagam, P.S. Discovering spammer communities in twitter. J Intell Inf Syst 51, 503–527 (2018). https://doi.org/10.1007/s10844-017-0494-z

  • DOI: https://doi.org/10.1007/s10844-017-0494-z
