DOI: 10.1145/2187836.2187846

Understanding and combating link farming in the twitter social network

Published: 16 April 2012

Abstract

Recently, Twitter has emerged as a popular platform for discovering real-time information on the Web, such as news stories and people's reactions to them. Like the Web, Twitter has become a target for link farming, where users, especially spammers, try to acquire large numbers of follower links in the social network. Acquiring followers not only increases the size of a user's direct audience, but also contributes to the perceived influence of the user, which in turn impacts the ranking of the user's tweets by search engines.
In this paper, we first investigate link farming in the Twitter network and then explore mechanisms to discourage the activity. To this end, we conducted a detailed analysis of links acquired by over 40,000 spammer accounts suspended by Twitter. We find that link farming is widespread and that a majority of spammers' links are farmed from a small fraction of Twitter users, the social capitalists, who are themselves seeking to amass social capital and links by following back anyone who follows them. Our findings shed light on the social dynamics that are at the root of the link farming problem in the Twitter network, and they have important implications for future designs of link spam defenses. In particular, we show that a simple user ranking scheme that penalizes users for connecting to spammers can effectively address the problem by disincentivizing users from linking with other users simply to gain influence.
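
The idea of a ranking scheme that penalizes users for connecting to spammers can be illustrated with a minimal sketch. This is not the paper's algorithm: it is a generic PageRank-style propagation of a negative score, written under several assumptions that do not come from the abstract. The function name `penalty_scores`, the damping factor of 0.85, the fixed iteration count, the rule that an account's penalty is split equally among its followers, and the toy graph are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def penalty_scores(follows, known_spammers, damping=0.85, iterations=50):
    """Propagate a negative score from known spammers to accounts that follow them.

    follows: dict mapping each user to the set of users they follow.
    known_spammers: non-empty set of users flagged as spammers (the seed set).
    Returns a dict mapping every user to a score <= 0; lower means more penalized.
    All parameters and the propagation rule are illustrative assumptions.
    """
    if not known_spammers:
        raise ValueError("known_spammers must not be empty")

    # Collect every user that appears as a follower or as a followed account.
    users = set(follows)
    for targets in follows.values():
        users.update(targets)

    # Reverse index: followers[v] is the set of users who follow v.
    followers = defaultdict(set)
    for u, targets in follows.items():
        for v in targets:
            followers[v].add(u)

    # Static negative score concentrated on the seed set, summing to -1.
    seed_weight = -1.0 / len(known_spammers)
    static = {u: (seed_weight if u in known_spammers else 0.0) for u in users}

    score = dict(static)
    for _ in range(iterations):
        updated = {}
        for u in users:
            # u inherits a share of the penalty of every account it follows;
            # each followed account splits its penalty among its followers.
            inherited = sum(
                score[v] / len(followers[v]) for v in follows.get(u, ())
            )
            updated[u] = damping * inherited + (1 - damping) * static[u]
        score = updated
    return score


# Toy graph (hypothetical): one account follows back a spammer, one does not.
toy_follows = {
    "spammer": {"capitalist"},
    "capitalist": {"spammer"},
    "careful": {"friend"},
}
toy_scores = penalty_scores(toy_follows, known_spammers={"spammer"})
```

In this toy graph the account that follows the spammer ends up with a negative score, while the account that only follows an unflagged user stays at zero. A ranking system could subtract or otherwise combine such a penalty with an influence score, so that following back indiscriminately becomes costly.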

      Published In

      WWW '12: Proceedings of the 21st international conference on World Wide Web
      April 2012
      1078 pages
      ISBN:9781450312295
      DOI:10.1145/2187836

      Sponsors

      • Univ. de Lyon: Universite de Lyon

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. collusionrank
      2. link farming
      3. pagerank
      4. spam
      5. twitter

      Qualifiers

      • Research-article

      Conference

      WWW 2012
      Sponsor:
      • Univ. de Lyon
      WWW 2012: 21st World Wide Web Conference 2012
      April 16 - 20, 2012
      Lyon, France

      Acceptance Rates

      Overall Acceptance Rate 1,899 of 8,196 submissions, 23%
